Spectral Unmixing of Hyperspectral Remote Sensing Imagery via Preserving the Intrinsic Structure Invariant

Hyperspectral unmixing, which decomposes mixed pixels into endmembers and corresponding abundance maps of endmembers, has obtained much attention in recent decades. Most spectral unmixing algorithms based on non-negative matrix factorization (NMF) do not explore the intrinsic manifold structure of hyperspectral data space. Studies have proven image data is smooth along the intrinsic manifold structure. Thus, this paper explores the intrinsic manifold structure of hyperspectral data space and introduces manifold learning into NMF for spectral unmixing. Firstly, a novel projection equation is employed to model the intrinsic structure of hyperspectral image preserving spectral information and spatial information of hyperspectral image. Then, a graph regularizer which establishes a close link between hyperspectral image and abundance matrix is introduced in the proposed method to keep intrinsic structure invariant in spectral unmixing. In this way, decomposed abundance matrix is able to preserve the true abundance intrinsic structure, which leads to a more desired spectral unmixing performance. At last, the experimental results including the spectral angle distance and the root mean square error on synthetic and real hyperspectral data prove the superiority of the proposed method over the previous methods.


Introduction
Airborne and spaceborne hyperspectral remote sensing technology have made remarkable progress in the past two decades. Hyperspectral image is acquired by hyperspectral imager and is composed of pixels formed by tens to hundreds of wavebands in a narrow band (bandwidth less than 10 nm) from 300 nm to 2500 nm. Because of the high spectral resolution of the hyperspectral imagery, it can be used as a reference for identifying the ground object, so the hyperspectral imaging technique shows huge application prospects [1]. The basic unit of the hyperspectral imager that receives the ground signal is the pixel. Each pixel records an electromagnetic signal reflected by surface materials in the spot on the ground corresponding to the (one-pixel) instantaneous field of view (IFOV) of the hyperspectral imager, which is called spectral information. The spot may contain different ground objects. These ground objects have different spectral signals which are the basic components of spectral signal of the pixel. If one pixel contains only one ground object, the pixel is a pure pixel. If one pixel contains multiple ground objects, the pixel is a mixed pixel. Mixed pixels arise for one of two reasons. First, if the spatial resolution of the hyperspectral imager is low enough that adjacent ground objects can jointly occupy a single pixel. Second, mixed pixels appear when distinct materials are combined into a homogeneous mixture [2]. Due to the technical bottleneck in the design and manufacture of hyperspectral imagers, the spatial resolution of hyperspectral data The various research communities have proposed numerous methods for spectral unmixing. Based on the assumption that there is at least one pure pixel per endmember in hyperspectral images, many scholars have proposed corresponding algorithms, such as N-finder (N-FINDER) [5], vertex component analysis (VCA) [6], simplex growing algorithm (SGA) [7], and maximum volume by Householder Transformation (MVHT) [8]. In fact, due to the limited spatial resolution of the hyperspectral imager, there are a large number of mixed pixels in the hyperspectral remote sensing images. The resulting value of the spectral unmixing methods based on the assumption of pure pixels will have a large error. Therefore, some scholars have proposed some spectral unmixing methods without adopting pure pixels assumption, such as minimum volume simplex analysis (MVSA) [9], simplex identification via split augmented Lagrangian (SISAL) [10], and dependent component analysis (DECA) [11]. In addition, there are methods that can generate endmembers and abundance information at the same time, such as non-negative matrix factorization (NMF) [12] and minimum volume enclosing simplex (MVES) [13]. In the above methods, because NMF can generate endmember matrix and abundance matrix at the same time and is suitable for the extraction of mixed pixels, the research of NMF theory is the focus of many scholars. NMF theory is applied to spectral unmixing in the literature [14]. However, this method simply performs mathematical operations and lacks clear geographical significance [14]. To apply NMF theory to spectral unmixing, different scholars add different constraints to the standard non-negative matrix factorization objective function, making the mathematical model more in line with the actual geographical significance. They optimized the corresponding mathematical model solving method and achieved a certain spectral unmixing effect [15][16][17][18][19][20][21]. MVC-NMF [18] regards the minimum monomorphic volume formed by endmembers as a constraint. L1/2-NMF [20] with sparseness constraints is proposed to obtain the best sparse solution of endmember abundances.
Hyperspectral images are characterized by large amounts of data, high dimensions, and high band correlation. The different bands of hyperspectral images, especially the adjacent bands, are highly correlated, resulting in a certain degree of information redundancy [22]. Especially for hyperspectral classification, the number of bands is not positively correlated with the accuracy. On the contrary, when the number of bands reaches a certain limit, the overall classification accuracy The various research communities have proposed numerous methods for spectral unmixing. Based on the assumption that there is at least one pure pixel per endmember in hyperspectral images, many scholars have proposed corresponding algorithms, such as N-finder (N-FINDER) [5], vertex component analysis (VCA) [6], simplex growing algorithm (SGA) [7], and maximum volume by Householder Transformation (MVHT) [8]. In fact, due to the limited spatial resolution of the hyperspectral imager, there are a large number of mixed pixels in the hyperspectral remote sensing images. The resulting value of the spectral unmixing methods based on the assumption of pure pixels will have a large error. Therefore, some scholars have proposed some spectral unmixing methods without adopting pure pixels assumption, such as minimum volume simplex analysis (MVSA) [9], simplex identification via split augmented Lagrangian (SISAL) [10], and dependent component analysis (DECA) [11]. In addition, there are methods that can generate endmembers and abundance information at the same time, such as non-negative matrix factorization (NMF) [12] and minimum volume enclosing simplex (MVES) [13]. In the above methods, because NMF can generate endmember matrix and abundance matrix at the same time and is suitable for the extraction of mixed pixels, the research of NMF theory is the focus of many scholars. NMF theory is applied to spectral unmixing in the literature [14]. However, this method simply performs mathematical operations and lacks clear geographical significance [14]. To apply NMF theory to spectral unmixing, different scholars add different constraints to the standard non-negative matrix factorization objective function, making the mathematical model more in line with the actual geographical significance. They optimized the corresponding mathematical model solving method and achieved a certain spectral unmixing effect [15][16][17][18][19][20][21]. MVC-NMF [18] regards the minimum monomorphic volume formed by endmembers as a constraint. L 1/2 -NMF [20] with sparseness constraints is proposed to obtain the best sparse solution of endmember abundances.
Hyperspectral images are characterized by large amounts of data, high dimensions, and high band correlation. The different bands of hyperspectral images, especially the adjacent bands, are highly correlated, resulting in a certain degree of information redundancy [22]. Especially for hyperspectral classification, the number of bands is not positively correlated with the accuracy. On the contrary, when the number of bands reaches a certain limit, the overall classification accuracy will decrease, resulting in the so-called Hughes phenomenon [23]. It makes sense to reduce the dimensions of hyperspectral images. Manifold learning [24] is a nonlinear method of dimension reduction. Manifold learning (e.g., Laplacian Eigenmap [25]) is to find the low-dimensional manifold in the high-dimensional space and to find the corresponding embedded mapping. Researchers have shown that the image data cannot fill up uniformly the high-dimensional Euclidean space [25]. The image data can be viewed as data sampled from a low dimensional manifold embedded in a higher-dimensional space. These image data smoothly change along the geodesic of the data manifold [25]. All above-mentioned spectral unmixing algorithms only consider the Euclidean structure of the hyperspectral data space. In fact, hyperspectral data are more likely lie on low-dimensional manifold [26]. Inspired by manifold learning, we transform the hyperspectral image data projection into a low-dimensional space to model its intrinsic structure, thereby realizing the hyperspectral image dimension reduction to facilitate spectral unmixing. It is known that hyperspectral image not only contains abundant spectral information of the ground objects, but also includes spatial distribution of the ground objects. Most of the above mentioned methods treat hyperspectral images as a collection of spectral vectors and neglect the possible spatial correlations between pixels. Yet, in the literature [27], a prior of spatial correlations between the different pixels of the hyperspectral image is utilized in spectral unmixing algorithm which leads to improve spectral unmixing accuracy. Weighted non-negative matrix factorization designs appropriate weights integrating the spatial information in local neighborhood to enhance spectral unmixing [28]. Inspired by this, the spatial information of the hyperspectral image is preserved when modeling the intrinsic structure of the hyperspectral image. Endmember spectral variability is an inevitable phenomenon in hyperspectral imaging and is a source of error in spectral unmixing accuracy [29]. A hierarchical weighted sparsity unmixing (HWSU) method improves spectral unmixing accuracy by decreasing the influence of the endmember spectral variability [30]. In modeling the intrinsic structure of hyperspectral image, it is necessary to consider reducing the influence of spectral endmember variability to improve the accuracy of hyperspectral unmixing. Meanwhile, we attempt to establish the close link between hyperspectral image and abundance matrix and construct the graph regularization to preserve intrinsic structure invariant in hyperspectral unmixing. In addition, sparse constraint that reduces the risk of getting stuck in local minima in non-convex minimization computations will be introduced into the proposed method in the paper, which enhances the sparsity of endmember abundances.
Finally, a novel algorithm which can preserve intrinsic structure invariant in hyperspectral unmixing named PISINMF is proposed for hyperspectral unmixing. PISINMF improves two aspects to improve spectral unmixing accuracy. Firstly, L 1/2 regularizer with regularization parameter defined in the literature [31] is adopted as sparsity constraint for improving the sparsity of endmember abundances. Secondly, the intrinsic structure of hyperspectral image is modeled, which preserves spectral information and spatial information of hyperspectral images. The above mentioned intrinsic structure, as a priori knowledge, is constructed with graph regularization that makes the intrinsic structure of the decomposed abundance information consistent with the intrinsic structure of the true abundance information. PISINMF is experimented on synthetic hyperspectral data and real hyperspectral data to verify its validity.
The rest of the paper is organized as follows: the second part introduces the proposed method of PISNMF, the third part introduces the experimental results and analysis of synthetic data and real data, the fourth part introduces discussion and the fifth part introduces the conclusions.

The Proposed PISINMF Method
Mixed pixels are typically modeled using a linear mixture model or a nonlinear mixture model. As a rule of thumb, the linear mixturemodel is associated with mixtures for which the pixel components are homogeneous surfaces in spatially segregated patterns. On the contrary, the nonlinear mixture model takes the intimate association or interaction with more than one component into account [30]. The linear mixture model is simpler than the nonlinear model, and the linear model's calculation results meet the accuracy requirements. The linear mixture model can explain the formation of hyperspectral image and conform to the actual statistical laws. In this paper, we will introduce the proposed algorithm model based on the linear mixture model. The schematic overview of the proposed method is shown in Figure 2. introduce the proposed algorithm model based on the linear mixture model. The schematic overview of the proposed method is shown in Figure 2.

Linear Mixture Model
The linear mixture model is based on the following three assumptions. Firstly, a finite number of endmembers within each IFOV linearly contribute pixel spectral signals according to their coverage percentage (abundance). Secondly, the ground objects are homogeneous surfaces in spatially segregated patterns. Thirdly, the electromagnetic energy of neighboring ground objects within each IFOV does not affect each other [32][33][34].
Based on the above assumptions, the linear mixture model of the l-band hyperspectral image containing n pixels with m endmembers can be expressed as follows: where = [ , … , ] ∈ ℝ × represents the hyperspectral image ( represents the ith pixel regarded as l-band spectral vector, l denotes band number, n denotes number of hyperspectral image pixels); = [ , … , ] ∈ ℝ × represents the endmember matrix ( represents the ith endmember signature); = [ , … , ] ∈ ℝ × denotes abundance matrix; denotes an error matrix meaning the noise of hyperspectral image. In general, is a known hyperspectral image. Endmember matrix and abundance matrix are the solution targets. denotes the proportion of the area occupied by the ith endmember in the jth pixel. According to the linear mixture model assumption, the abundance matrix needs to satisfy requirements that each element within it is non-negative and the sum over the columns of are unity. obeys the following constraints expressed in Equation (2): ≥ 0, ∀ ,

Modeling Intrinsic Structure of Hyperspectral Images
The proposed method transforms the hyperspectral image data projection into a lowdimensional space to model its intrinsic structure preserving spectral information and spatial information of hyperspectral images, thereby realizing the low-dimensional representation of the hyperspectral image to facilitate spectral unmixing.
The first law of geography is expressed as "Everything is related to everything else, but near things are more related than distant things" [35]. With that in mind, we divide the hyperspectral image into multiple local windows, considering only the relation between the central pixel and the surrounding pixels in the local window, ignoring the pixels outside the local window. Figure 3 shows

Linear Mixture Model
The linear mixture model is based on the following three assumptions. Firstly, a finite number of endmembers within each IFOV linearly contribute pixel spectral signals according to their coverage percentage (abundance). Secondly, the ground objects are homogeneous surfaces in spatially segregated patterns. Thirdly, the electromagnetic energy of neighboring ground objects within each IFOV does not affect each other [32][33][34].
Based on the above assumptions, the linear mixture model of the l-band hyperspectral image containing n pixels with m endmembers can be expressed as follows: where X = [x 1 , . . . , x n ] ∈ R l×n represents the hyperspectral image (x i represents the ith pixel regarded as l-band spectral vector, l denotes band number, n denotes number of hyperspectral image pixels); G = [g 1 , . . . , g m ] ∈ R l×m represents the endmember matrix (g i represents the ith endmember signature); A = [a 1 , . . . , a n ] ∈ R m×n denotes abundance matrix; ε denotes an error matrix meaning the noise of hyperspectral image. In general, X is a known hyperspectral image. Endmember matrix G and abundance matrix A are the solution targets. A ij denotes the proportion of the area occupied by the ith endmember in the jth pixel. According to the linear mixture model assumption, the abundance matrix A needs to satisfy requirements that each element within it is non-negative and the sum over the columns of A are unity. A ij obeys the following constraints expressed in Equation (2):

Modeling Intrinsic Structure of Hyperspectral Images
The proposed method transforms the hyperspectral image data projection into a low-dimensional space to model its intrinsic structure preserving spectral information and spatial information of hyperspectral images, thereby realizing the low-dimensional representation of the hyperspectral image to facilitate spectral unmixing.
The first law of geography is expressed as "Everything is related to everything else, but near things are more related than distant things" [35]. With that in mind, we divide the hyperspectral image into multiple local windows, considering only the relation between the central pixel and the surrounding pixels in the local window, ignoring the pixels outside the local window. Figure 3 shows the local window of hyperspectral image. Assume that the hyperspectral image with n pixels, equals to r × s pixels, is arranged on r × s grids. The spatial coordinate of the top left corner grid is (0, 0). The grid spatial coordinate is increased from left to right, from top to bottom until the coordinate of the bottom right corner grid is (r,s). In Figure 3, a black grid and red grids constitute a local window in which the black grid represents central pixel and red grids represent surrounding pixels. The size of the local window can be expressed by the product of the number of grids at the outermost edge of the local window. For example, the size of the local window in Figure 3 is 5 × 5. x i and x j are the hyperspectral image pixels which represent spectral signals. Let us consider the case in which pixel x i is the central pixel of a local window. If the pixel x j is a surrounding pixel of pixel x i in the local window, we define j ∈ N(i). If the pixel x j does not belong to surrounding pixels of pixel x i in the local window, we define j / ∈ N(i). the local window of hyperspectral image. Assume that the hyperspectral image with n pixels, equals to r × s pixels, is arranged on r × s grids. The spatial coordinate of the top left corner grid is (0, 0). The grid spatial coordinate is increased from left to right, from top to bottom until the coordinate of the bottom right corner grid is (r,s). In Figure 3, a black grid and red grids constitute a local window in which the black grid represents central pixel and red grids represent surrounding pixels. The size of the local window can be expressed by the product of the number of grids at the outermost edge of the local window. For example, the size of the local window in Figure 3 is 5 × 5. and are the hyperspectral image pixels which represent spectral signals. Let us consider the case in which pixel is the central pixel of a local window. If the pixel is a surrounding pixel of pixel in the local window, we define ∈ ( ). If the pixel does not belong to surrounding pixels of pixel in the local window, we define ∉ ( ). Then we model the intrinsic structure of the local window of the hyperspectral image. and are connected by an edge if the pixel is a surrounding pixel of pixel in the local window. The edges can be weighted by the heat kernel [25]. The heat kernel is expressed as follows: where σ is a scaling parameter of the heat kernel. , are the ith and jth column vectors of the hyperspectral image . It is known from Equation (3) that when the pixel is in the local window, it is connected to the central pixel , and the strength of the connection can be weighed by the heat kernel. When the pixel is outside the local window, its connection to the central pixel does not exist. This is consistent with the first law of geography. The σ value adapted to different intrinsic structures of local windows is given by the following equation: where h is the number of surrounding pixels of the central pixel in the local window. A hyperspectral image not only contains abundant spectral information of ground objects, but also includes spatial correlations between each pixel. Utilization of spatial correlations between pixels will be incorporated in modeling the intrinsic structure of hyperspectral image. Inspired by the above principle, the closer the surrounding pixel is to the spatial distance of the central pixel , the more the surrounding pixel contributes to the central pixel . It is also obvious that the closer the spectral similarity between the surrounding pixel and the central pixel , the more the Then we model the intrinsic structure of the local window of the hyperspectral image. x i and x j are connected by an edge if the pixel x j is a surrounding pixel of pixel x i in the local window. The edges can be weighted by the heat kernel [25]. The heat kernel is expressed as follows: where σ is a scaling parameter of the heat kernel. x i ,x j are the ith and jth column vectors of the hyperspectral image X. It is known from Equation (3) that when the pixel x j is in the local window, it is connected to the central pixel x i , and the strength of the connection can be weighed by the heat kernel. When the pixel x j is outside the local window, its connection to the central pixel does not exist. This is consistent with the first law of geography. The σ value adapted to different intrinsic structures of local windows is given by the following equation: where h is the number of surrounding pixels of the central pixel x i in the local window. A hyperspectral image not only contains abundant spectral information of ground objects, but also includes spatial correlations between each pixel. Utilization of spatial correlations between pixels will be incorporated in modeling the intrinsic structure of hyperspectral image. Inspired by the above principle, the closer the surrounding pixel x j is to the spatial distance of the central pixel x i , the more the surrounding pixel x j contributes to the central pixel x i . It is also obvious that the closer the spectral similarity between the surrounding pixel x j and the central pixel x i , the more the surrounding pixel x j contributes to the central pixel x i . Assume that (p, q) and (b, c) are the spatial coordinates of the central pixel x i and the surrounding pixel x j , respectively. Then x i and x j also can be expressed as x(p, q) and x(b, c). Accounting for the effects of spectral information and spatial information, how much the surrounding pixel x j contributes to the central pixel x i can be reflected by Equations (5)- (7): where < x(p, q), x(b, c)> denotes the inner product of the two spectra, and ||·|| denotes the vector magnitude. µ value characterizes spatial distance difference between the central pixel x i and the surrounding pixel x j . The smaller the value of µ, the closer the surrounding pixel x j is located with respect to the central pixel x i and the more contribution pixel x j has to the central pixel x i . ν represents the spectral angle distance between x i and x j . Spectral angle distance reflects the difference in the geometric characteristics of the two spectral vectors. If the ν value is smaller, the more similar the geometric feature of the surrounding pixel x j is to the central pixel x i and the more contribution the pixel x j has to the central pixel x i . M i,j is a simplified representation of the degree of the contribution. Equation (5) is convenient to calculate and can reflect the effect of the surrounding pixel x j on the central pixel x i . Accounting for the effect of the surrounding pixel x j on the central pixel x i , we promote Equations (3)- (8): If the value of W i,j is larger, the surrounding pixel x j is more closely related to the central pixel x i . Compared with Equation (3), Equation (8) better reveals spectral geometric differences, spatial distance information, and spectral similarity of pixel pairs in the local window.
The spectral information of the pixel reflects the physical and chemical characteristics of the ground objects in the hyperspectral image. However, the spectral curve of the same ground object will change under different environments, which is called endmember spectral variability. Figure 4 shows the spectral curves of trees and rocks. As can be seen from Figure 4, curve 1 of the blue line is the spectral curve of trees under weak sunlight, curve 2 of orange line is the spectral curve of trees under strong sunlight, and curve 3 of gray line is the spectral curve of rocks. Because of the change in the intensity of solar radiation, the spectral curve of the tree mutates. The spatial position of the three ground objects of Figure 4 is as shown in Figure 5. It is assumed that the ground object of the central pixel is tree1 whose spectral information is curve 1. The ground objects of the two surrounding pixels are tree2 and rock whose spectral curves are curve 2 and curve 3, respectively.
We use Equations (3) and (8) to project the relation between the central pixel and the surrounding pixels into the low-dimensional space. The resulting projection values are shown in Table 1. After projecting by Equation (3), projection value of the relation between tree1 and tree2 is relatively equal to projection value of the relation between tree1 and rock. According to the projection results of Equation (3), we think that the effect of the two surrounding pixels on the central pixel is nearly the same, and that the ground objects of the two surrounding pixels are close, which is obviously contrary to the actual situation. After projecting by Equation (8), projection value of the relation between tree1 and tree2 is much larger than projection value of the relation between tree1 and rock. According to the projection results of Equation (8), compared with the surrounding pixel including the curve 3, we determine that the surrounding pixel containing the curve 2 is closer to the central pixel, which is in accordance with the real situation.   Hyperspectral image can be segmented into homogeneous and transition areas [27,28,36]. Pixels in homogeneous areas have large values relatively close to each other [28]. Pixels in transition areas have relatively small values compared with homogeneous pixels [28]. Segmenting the homogeneous and transition areas of hyperspectral images is conductive to improve spectral unmixing accuracy [27,28]. For each central pixel with coordinate (p, q), the whole relations with its all surrounding pixels in the local window can be calculated in the Equation (9).
, can be used to calculate the strength of the central pixel connection with surrounding pixel . The whole relations between the central pixel with coordinate (p, q) and its all surrounding pixels in the local window can be denoted as ( , ) or ( ): Given hyperspectral image with r × s pixels, the whole relations of all local windows of  1  7  13  19  25  31  37  43  49  55  61  67  73  79  85  91  97  103  109  115  121  127  133  139 145 151

Reflectance Band Number
Spectral curve 1 of the tree under weak sunlight Spectral curve 2 of the tree under strong sunlight Spectral curve 3 of rock   Hyperspectral image can be segmented into homogeneous and transition areas [27,28,36]. Pixels in homogeneous areas have large values relatively close to each other [28]. Pixels in transition areas have relatively small values compared with homogeneous pixels [28]. Segmenting the homogeneous and transition areas of hyperspectral images is conductive to improve spectral unmixing accuracy [27,28]. For each central pixel with coordinate (p, q), the whole relations with its all surrounding pixels in the local window can be calculated in the Equation (9).
, can be used to calculate the strength of the central pixel connection with surrounding pixel . The whole relations between the central pixel with coordinate (p, q) and its all surrounding pixels in the local window can be denoted as ( , ) or ( ): Given hyperspectral image with r × s pixels, the whole relations of all local windows of hyperspectral image can be described as R matrix in the Equation (10). Figure 6 shows simulated

Projection Equation (3) Equation (8)
Projection value of the relation between tree1 and tree2 0.0079 0.1640 Projection value of the relation between tree1 and rock 0.0069 0.0220 Hyperspectral image can be segmented into homogeneous and transition areas [27,28,36]. Pixels in homogeneous areas have large values relatively close to each other [28]. Pixels in transition areas have relatively small values compared with homogeneous pixels [28]. Segmenting the homogeneous and transition areas of hyperspectral images is conductive to improve spectral unmixing accuracy [27,28]. For each central pixel x i with coordinate (p, q), the whole relations with its all surrounding pixels in the local window can be calculated in the Equation (9). W i,j can be used to calculate the strength of the central pixel x i connection with surrounding pixel x j . The whole relations between the central pixel x i with coordinate (p, q) and its all surrounding pixels in the local window can be denoted as ϕ(p, q) or ϕ(i): Given hyperspectral image with r × s pixels, the whole relations of all local windows of hyperspectral image can be described as R matrix in the Equation (10). Figure 6 shows simulated hyperspectral image and whole relations of all local windows of simulated spectral image according to Equation (10). We see that Figure 6b inherits the spatial distribution characteristics of the simulated hyperspectral image. In Figure 6b, yellow represents a larger value and blue represents a smaller value. According to the definition of homogeneous and transition areas, the areas of dark yellow and light yellow are homogeneous areas while the blue areas are transition areas. Therefore, R matrix can segment the homogeneous and transition areas of hyperspectral image: Based on the above analysis, we choose Equation (8) as the projection equation, which will project the hyperspectral image into the low dimensional space. We determine each pixel of the hyperspectral image as the central pixel to create a local window with size of 5 × 5. Then we use the projection Equation (8) to project the relation between the central pixel and the surrounding pixels into the low-dimensional space to model the intrinsic structure of hyperspectral images. value. According to the definition of homogeneous and transition areas, the areas of dark yellow and light yellow are homogeneous areas while the blue areas are transition areas. Therefore, R matrix can segment the homogeneous and transition areas of hyperspectral image: Based on the above analysis, we choose Equation (8) as the projection equation, which will project the hyperspectral image into the low dimensional space. We determine each pixel of the hyperspectral image as the central pixel to create a local window with size of 5 × 5. Then we use the projection Equation (8) to project the relation between the central pixel and the surrounding pixels into the low-dimensional space to model the intrinsic structure of hyperspectral images.

PISINMF Algorithm Model
Spectral unmixing is designed to extract endmember spectrum and corresponding abundance from hyperspectral image data, in accordance to the linear mixture model expressed by Equation (1). NMF has been introduced into spectral unmixing, which aims at obtaining endmember matrix and abundance matrix to approximately represent the given non-negative matrix . In order to measure the degree of approximation, the loss function based on the square of the Euclidean distance between and is expressed as: where the operator ‖. ‖ represents the Frobenius norm. Unfortunately, because of the non-convexity of NMF, the algorithm has a large number of local minima. In order to reduce the influence of the non-convexity of NMF on spectral unmixing accuracy, sparsity constraints are introduced into standard NMF [14,20]. In [20], L1/2 regularizer has been proven

PISINMF Algorithm Model
Spectral unmixing is designed to extract endmember spectrum and corresponding abundance from hyperspectral image data, in accordance to the linear mixture model expressed by Equation (1). NMF has been introduced into spectral unmixing, which aims at obtaining endmember matrix G and abundance matrix A to approximately represent the given non-negative matrix X. In order to measure the degree of approximation, the loss function based on the square of the Euclidean distance between X and GA is expressed as: min where the operator . F represents the Frobenius norm. Unfortunately, because of the non-convexity of NMF, the algorithm has a large number of local minima. In order to reduce the influence of the non-convexity of NMF on spectral unmixing accuracy, sparsity constraints are introduced into standard NMF [14,20]. In [20], L 1/2 regularizer has been proven to provide the best sparse solution. Therefore, L 1/2 regularizer is adopted as sparsity constraint for improving the sparsity of endmember abundances. Thus, NMF with sparsity constraints is written as: where λ is the regularization parameter which weights the contribution of A 1/2 . A 1/2 is defined in the Equation (13): Motivated by a temperature schedule in the simulated annealing technique, the regularization parameter λ is defined in the Equation (14) to avoid getting stuck in local minima. The regularization parameter λ shows its effectiveness in [31]. Thus, it will be adopted as the regularization parameter that controls the impact of the sparsity measure function A 1/2 : where α 0 and τ are constants to regularize the impact of sparsity constraints and t is the iteration number in the process of optimization [31].
Given the hyperspectral data {x i } n i=1 , the abundance data {a i } n i=1 , and the endmember matrix G, the following relation exists in Equation (15): x i ∈ R l is an individual pixel of hyperspectral image X. a i ∈ R m is abundance data corresponding to endmembers. From the perspective of dimensionality reduction, the endmember matrix can be regarded as a new base matrix in the m-dimension space. The hyperspectral image is represented as a linear combination of the new base matrix. a i can be regarded as the representation of x i in m-dimension space. Equation (15) can be further promoted to Equation (16): We learn from Equation (16) that the more similar x i and x j are, the more similar the abundance of given pixel x i is to the abundance of given pixel x j . Therefore, a close link between x i and a i reveals a close link between hyperspectral image and decomposed abundance matrix. Based on the above analysis, once x i and x j are close, the abundance a i of pixel x i and the abundance a j of pixel x j in the m-dimension space are close too. Inspired by the above analysis, a graph regularizer is introduced in PISINMF to keep local structure invariant between hyperspectral image and decomposed abundance matrix. Graph regularizer is written as: where Tr(.) represents the trace of matrix, (·) T represents the transpose of matrix, W is weight matrix in which W i,j is element, D is a diagonal matrix and D ii = ∑ j∈N(i) To make PISINMF have the ability to preserve the intrinsic structure invariant, the regularization term of Equation (17) obtained in the previous section is incorporated into the Equation (12). The cost function of PISINMF is written as: where µ ≥ 0 is the regularization parameter of the graph regularizer.

The Update Rules for PISINMF
The projection gradient learning [14] method following the standard gradient learning is adopted to obtain the iterative rule for PISINMF. To make G and A non-negative, we use the function max{0, x} to set the negative components to zero while leaving the non-negative components unchanged. Based on the iterative rule, the non-negative matrix X is decomposed to obtain G and A. If the result of the k-th iteration is G (k) and A (k) , the iterative rule is written as: where α (k) and β (k) denote the learning step, ∇ G f (G, A) and ∇ A f (G, A) are the first-order derivatives of the function f (G, A) expressed in Equations (21) and (22): If α and β are equal to some small positive numbers, the equations are equivalent to conventional gradient descent. When setting α = G/ GAA T and β = A/G T GA, the iterative rule of G and A could be turned into the multiplicative update rule [27]. The multiplicative rules are written as: where . * and ./ represent the multiplication of elements and the division of elements within the matrix, respectively.

Implementation Issue
The multiplicative rules of PISINMF algorithm model do not consider the sum over the columns of abundance to be unity. To make the results of PISINMF algorithm model more accurate, X and G are replaced by X and G in Equation (25): where 1 n (1 m ) is a n (m)-dimensional column vector of all 1 s. δ in Equation (25) is the weight value and can control the influence of the sum over the columns of abundance to be unity on the objective function. In the implementation, a relatively small value of δ (i.e., δ ≤ 40) will cause the sum of each column of A to be much smaller than one. On the contrary, a larger value of δ (i.e., δ ≥ 60) will cause the sum of each column of A to be closer to one, but it reduces the convergence rate. So δ is selected as 50 to meet the needs of precision and efficiency in PISINMF model. In general, the initial value of the endmember matrix G can be calculated by the endmember extraction algorithm or randomly chosen data. In PISINMF algorithm model, we use the VCA algorithm or other endmember extraction algorithms to set the initial value of endmember matrix G.
The initial value of the abundance matrix A can be calculated using the least squares method expressed as following Equation (26): To make the initial value of the abundance matrix A non-negative, we force the matrix A to be non-negative according to Equation (27): The PISINMF algorithm model sets two stop criteria in the iterative optimization process. One of the stop criteria is the maximum number of iterations, another stop criteria is that a threshold τ should be specified for the Equation (28):

The Procedure of PISINMF
The procedure of PISINMF is summarized as follows: Step 1.
Determine the endmember number m; initialize the endmember matrix G by VCA algorithm or other endmember extraction algorithms for synthetic hyperspectral data and real hyperspectral data; initialize the abundance matrix A using Equations (26) and (27); Step 2.
Replace matrices G and X with matrices G and X according to Equation (25); Step 4.
Replace matrices G and X with matrices G and X; Step 6.
Repeat step 2-step 5 until any one of the stop criteria is satisfied.

Experimental Results and Analysis
The paper uses synthetic hyperspectral data and real hyperspectral images to verify and analyze the algorithm model. The experiment is to verify the effectiveness of the PISINMF algorithm and to compare the performance with classical algorithms.
Spectral angle distance (SAD) is often used to calculate the degree of approximation between the estimated endmember spectrum and the true endmember spectrum. SAD is defined as follows: where g i is the estimated ith endmember signature andĝ i is the ith endmember signature. The smaller calculated SAD value, the closer the extracted spectral endmember is to the true spectrum. The root mean square error (RMSE) of the entire image measures the similarity between the estimated abundance and the true abundance. RMSE is written as: where a i is a column vector of the estimated abundance A andâ i is a column vector of the true abundance. If the calculated RMSE value is smaller, the accuracy of the decomposed abundance matrix can be considered to be closer to the true abundance. The parameters of the PISINMF algorithm model are as follows: (1) the maximum number of iterations is set to 1000 in all the synthetic hyperspectral data and real hyperspectral data experiments; (2) regularization parameter µ in PISINMF algorithm model is experimentally selected between 0.005 n/m 2 and 0.05 n/m 2 for all the synthetic hyperspectral data and real hyperspectral data experiments, with which PISINMF algorithm model can reach a satisfactory performance; (3) threshold τ should be specified as 0.001; (4) α 0 and τ are selected as 0.1 and 25, respectively; (5) The size of the local window is selected as 5 × 5 in PISINMF.

Synthetic Hyperspectral Data
Five endmember signatures (ammonioalunite, calcite, kaolinite, jarosite, muscovite) have been extracted from the USGS Spectral Library [37]. As shown in the Figure 7, the endmember signatures have 420 spectral bands. The five endmember signatures are mixed to form corresponding abundance matrix according to the Dirichlet distribution, ensuring that abundance is non-negative and the sum of each column of the abundance matrix is unity. Mixing coefficients are used in order to make some pixels mixed in higher degree. If the abundance of a pixel is larger than mixing coefficient, the pixel is replaced with a mixture made up of all endmembers of equal abundances. Different experimental environmental conditions, such as different signal-to-noise ratios (SNR), different abundance mixing coefficients, and different spatial dimensions are set to test the unmixing ability of the PISINMF algorithm. In the same environment the unmixing ability of the PISINMF algorithm and other algorithms is compared. The results of the comparison are evaluated using SAD and RMSE standards. To evaluate anti-noise ability, zero-mean white Gaussian noise is added to the synthetic hyperspectral data. SNR is defined as: where X and ε represent the observation and noise of pixels, respectively. E[·] denotes the expectation operator. standards. To evaluate anti-noise ability, zero-mean white Gaussian noise is added to the synthetic hyperspectral data. SNR is defined as: where and represent the observation and noise of pixels, respectively. E[·] denotes the expectation operator. Firstly, the synthetic hyperspectral data experiments verify that the PISINMF algorithm has the ability to preserve intrinsic structure invariant. The size of the synthetic hyperspectral data is 30 × 30, the abundance mixing coefficient is 0.8, and the signal-to-noise ratio is 30 db, respectively. PISINMF, L1/2-NMF, VCA are used to obtain the estimated abundance from the synthetic hyperspectral data. Then, the estimated abundance and the true abundance are projected onto the low-dimensional space using the Equation (8). In order to compare the ability of three methods (PISINMF, L1/2-NMF, VCA) Firstly, the synthetic hyperspectral data experiments verify that the PISINMF algorithm has the ability to preserve intrinsic structure invariant. The size of the synthetic hyperspectral data is 30 × 30, the abundance mixing coefficient is 0.8, and the signal-to-noise ratio is 30 db, respectively. PISINMF, L 1/2 -NMF, VCA are used to obtain the estimated abundance from the synthetic hyperspectral data. Then, the estimated abundance and the true abundance are projected onto the low-dimensional space using the Equation (8). In order to compare the ability of three methods (PISINMF, L 1/2 -NMF, VCA) to preserve the intrinsic structure, Equation (32) is used to compare the difference between the projection value of the estimated abundance and the projection value of the true abundance. If the error value is smaller, the ability to preserve the intrinsic structure is stronger: where P i stands for the projection value of the estimated abundance andP i stands for the projection value of the true abundance.
The error values of the three methods are recorded in Table 2. The difference between the intrinsic structure of the true abundance and the intrinsic structure of the estimated abundance is shown in Figure 8. To more visually show the difference, we accumulate the error between the central pixel of Figure 8 and its surrounding pixels in the local window. Sum of difference values based on the central pixels in the local window is shown in Figure 9. In Table 2, NORMp of PISINMF is smaller than the other two methods (L 1/2 -NMF, VCA). As can be seen from Figures 8 and 9, compared to the other two methods (L 1/2 -NMF, VCA), the intrinsic structure of the estimated abundance obtained by PISINMF is closer to the intrinsic structure of true abundance.  structure of the estimated abundance (a) The difference between the intrinsic structure of the true abundance and the estimated abundance of VCA; (b) The difference between the intrinsic structure of the true abundance and the estimated abundance of L12-NMF; (c) The difference between the intrinsic structure of the true abundance and the estimated abundance of PISINMF.  Central pixels of hyperspectral image VCA L12NMF PISINMF  The synthetic hyperspectral data experiments verify the anti-noise ability of the PISINMF algorithm model. We test the anti-noise ability of PISINMF, L 1/2 -NMF and VCA under different SNR conditions. The number of pixels in the synthetic hyperspectral data is 2401, the abundance mixing coefficient is 0.8, and the signal-to-noise ratios are 15, 30, 45, 60, ∞ db, respectively. Performance is evaluated using SAD and RMSE standards. The accuracy of the extracted endmember spectra is evaluated using the average value of SAD, and the accuracy of the obtained abundance is evaluated by RMSE. Table 3 provides SAD values for each method and Table 4 provides RMSE values for each method under different SNR conditions. Figure 10 shows the plots of the experimental results with different SNR. By comparison, PISINMF has smaller RMSE and SAD values than the other two algorithms (L 1/2 -NMF, VCA) under different SNR conditions. Particularly when SNR = 15 dB, the SAD and RMSE values of PISINMF are obviously superior to that of L 1/2 -NMF and VCA. coefficient is 0.8, and the signal-to-noise ratios are 15, 30, 45, 60, ∞ db, respectively. Performance is evaluated using SAD and RMSE standards. The accuracy of the extracted endmember spectra is evaluated using the average value of SAD, and the accuracy of the obtained abundance is evaluated by RMSE. Table 3 provides SAD values for each method and Table 4 provides RMSE values for each method under different SNR conditions. Figure 10 shows     Experiments test the PISINMF algorithm's unmixing ability under different abundance mixing coefficients. Therefore, the synthetic hyperspectral dataset has a signal-to-noise ratio of 35 db, a number of pixels of 2401, and abundance mixing coefficients of 0.6, 0.7, 0.8, and 0.9, respectively. Table 5 provides SAD values for each method and Table 6 provides RMSE values for each method under different abundance mixing coefficients conditions. Figure 11 shows the plots of the experimental results with different abundance mixing coefficients. As can be seen from Figure 11, SAD value and RMSE value of the three methods show a downward trend as abundance mixing coefficients increase. By comparison, RMSE value and SAD value of the PISINMF model algorithm are smaller than those of the other two algorithms (L 1/2 -NMF, VCA) under different abundance mixing coefficients.  Experiments test the algorithm's unmixing ability under different spatial dimensions conditions. Spatial dimension here is referred to the number of mixed pixels. The synthetic hyperspectral dataset has a signal-to-noise ratio of 35 db, an abundance mixing coefficient of 0.8, and the number of pixels are set to 900, 2500, 4900, and 8100, respectively. The accuracy of endmember spectra is evaluated using the mean value of SAD and the accuracy of the abundance is evaluated using RMSE. Table 7 provides SAD values for each method and Table 8 provides RMSE values for each method under different spatial dimensions conditions. Figure 12 shows the plots of the experimental results with different numbers of pixels. In Figure 12, we can see that the SAD value and RMSE value of PISINMF show a relatively stable trend with the increase of the number of pixels, indicating that the number of pixels has limited influence on the precision of spectral unmixing. By comparison, PISINMF has  Experiments test the algorithm's unmixing ability under different spatial dimensions conditions. Spatial dimension here is referred to the number of mixed pixels. The synthetic hyperspectral dataset has a signal-to-noise ratio of 35 db, an abundance mixing coefficient of 0.8, and the number of pixels are set to 900, 2500, 4900, and 8100, respectively. The accuracy of endmember spectra is evaluated using the mean value of SAD and the accuracy of the abundance is evaluated using RMSE. Table 7 provides SAD values for each method and Table 8 provides RMSE values for each method under different spatial dimensions conditions. Figure 12 shows the plots of the experimental results with different numbers of pixels. In Figure 12, we can see that the SAD value and RMSE value of PISINMF show a relatively stable trend with the increase of the number of pixels, indicating that the number of pixels has limited influence on the precision of spectral unmixing. By comparison, PISINMF has smaller RMSE and SAD values than the other two algorithms (L 1/2 -NMF, VCA) under the conditions of different numbers of mixed pixels, indicating that the PISINMF algorithm is suitable for different number of pixels.

Real Hyperspectral Data
In this section, there are two real hyperspectral data used to verify the algorithm. The first real hyperspectral dataset is Samson hyperspectral image which covers the spectral range of 400 nm-900 nm with a band width of 3.2 nm. The hyperspectral dataset has been available on the internet [38]. This data is owned by Oregon State University and is provided by WeoGeo. The hyperspectral dataset was acquired by the SAMSON instrument, a push-broom, visible to near IR, hyperspectral sensor. This dataset has been atmospherically corrected using TAFKAA, a hyperspectral atmospheric correction algorithm. This data is in units of remote sensing reflectance. As the original image with 952 × 952 pixels is too large, a subimage with 95 × 95 pixels has been extracted from the original image [39]. A region of 95 × 95 pixels is shown in Figure 13.
By visual observation, the image is mainly covered by water, tree and rock. So the number of the endmembers is selected as 3. The reference endmember spectra are obtained according to the literature [30]. Some pixels corresponding to the three ground objects are randomly selected in Figure  13 by visual observation. The number of pixels selected for each ground object is 30. We calculate the

Real Hyperspectral Data
In this section, there are two real hyperspectral data used to verify the algorithm. The first real hyperspectral dataset is Samson hyperspectral image which covers the spectral range of 400 nm-900 nm with a band width of 3.2 nm. The hyperspectral dataset has been available on the internet [38]. This data is owned by Oregon State University and is provided by WeoGeo. The hyperspectral dataset was acquired by the SAMSON instrument, a push-broom, visible to near IR, hyperspectral sensor. This dataset has been atmospherically corrected using TAFKAA, a hyperspectral atmospheric correction algorithm. This data is in units of remote sensing reflectance. As the original image with 952 × 952 pixels is too large, a subimage with 95 × 95 pixels has been extracted from the original image [39]. A region of 95 × 95 pixels is shown in Figure 13.
By visual observation, the image is mainly covered by water, tree and rock. So the number of the endmembers is selected as 3. The reference endmember spectra are obtained according to the literature [30]. Some pixels corresponding to the three ground objects are randomly selected in Figure 13 by visual observation. The number of pixels selected for each ground object is 30. We calculate the average value of pixels corresponding to each ground object as the reference endmember signature. Table 9 provides SAD values for each method (PISINMF, L 1/2 -NMF, MVC-NMF, MVES), and the best result is denoted by bold font. As can be seen from Table 9, PISINMF has the highest number of the best-performance cases. The abundance maps for L 1/2 -NMF, MVC-NMF, MVES and PISINMF are shown in Figure 14.     Figure 15 displays a comparison of the endmember spectra of the PISINMF algorithm with the reference spectra. The color map value of the abundance map is shown in Figure 14. The color of the abundance map corresponds to the value 0-1. The closer the color is to the blue, the closer the abundance value is to 0. This indicates that there is no such ground object distribution in the area. The closer the color is to the yellow, the closer the abundance value is to 1, indicating that more ground objects are distributed in the area. Compared with the abundance maps of other methods, we can see that the abundance map of PISINMF is closer to the distribution of real ground objects. The water, rock and tree abundance maps of the PISINMF algorithm are basically consistent with the original map. However, in the water abundance maps of the MVES and MVCNMF algorithms, some areas of the water abundance map are misjudged. Although the abundance maps of L 1/2 -NMF are similar to PISINMF, the abundance maximum value of PISINMF is relatively close to 1 while the abundance maximum value of L 1/2 -NMF is about 0.7. According to the meaning of the color map of abundance map, the abundance information of PISINMF is closer to the real situation by visual observation.
The section will analyze the second real hyperspectral data about the Cuprite mineral area in the western part of Nevada, USA, which is a mineralogical site that has been established as a reference site for hyperspectral and other remote sensing instruments [40][41][42]. This image was obtained by an Airborne Visible-Near/Infrared Imaging Spectrometer (AVIRIS) sensor on 19 June 1997. Cuprite data has been widely used to validate spectral unmixing algorithm. The real spectral library of ground objects in this area is available online [37]. Experiments will use the PISINMF algorithm to extract endmembers and abundance maps corresponding to endmembers from the image. The experiment uses the USGS Spectral Library as reference standard and uses SAD to measure the accuracy of the extracted endmember spectra. The subimage of 250 × 190 pixels is taken from the real hyperspectral data for experimentation. The subimage has been atmospherically corrected [6]. Figure 16 shows the 80th band of the subimage. In experiments, the low SNR bands and the water vapor absorption bands (1-6, 104-113, 148-167, and 219-224) have been removed [20]. The number of endmembers in the selected region is estimated to be 12 using the VD method [43] with false alarm probability P f = 10 −5 . The endmember signatures extracted by the PISINMF algorithm and other algorithms are compared with the USGS library, and spectral similarities are quantified using the SAD standard. In the results, kaolin is divided into two endmembers due to the endmember spectral variability.      Figure 17. Table 10 provides SAD values for each method (PISINMF, L 1/2 -NMF, MVC-NMF, MVES), and the best result is denoted by bold font. As can be seen from Table 10, the PISINMF algorithm has the highest number of endmember minimum SAD values, and the PISINMF algorithm has the smallest endmember SAD mean value. So the PISINMF algorithm is superior to other algorithms (L 1/2 -NMF, MVC-NMF and MVES).
water, rock and tree abundance maps of the PISINMF algorithm are basically consistent with the original map. However, in the water abundance maps of the MVES and MVCNMF algorithms, some areas of the water abundance map are misjudged. Although the abundance maps of L1/2-NMF are similar to PISINMF, the abundance maximum value of PISINMF is relatively close to 1 while the abundance maximum value of L1/2-NMF is about 0.7. According to the meaning of the color map of abundance map, the abundance information of PISINMF is closer to the real situation by visual observation. The section will analyze the second real hyperspectral data about the Cuprite mineral area in the western part of Nevada, USA, which is a mineralogical site that has been established as a reference site for hyperspectral and other remote sensing instruments [40][41][42]. This image was obtained by an Airborne Visible-Near/Infrared Imaging Spectrometer (AVIRIS) sensor on 19 June 1997. Cuprite data has been widely used to validate spectral unmixing algorithm. The real spectral library of ground objects in this area is available online [37]. Experiments will use the PISINMF algorithm to extract endmembers and abundance maps corresponding to endmembers from the image. The experiment uses the USGS Spectral Library as reference standard and uses SAD to measure the accuracy of the extracted endmember spectra. The subimage of 250 × 190 pixels is taken from the real hyperspectral data for experimentation. The subimage has been atmospherically corrected [6]. Figure 16 shows the 80th band of the subimage. In experiments, the low SNR bands and the water vapor absorption bands (1-6, 104-113, 148-167, and 219-224) have been removed [20]. The number of endmembers in the selected region is estimated to be 12 using the VD method [43] with false alarm probability = 10 . The endmember signatures extracted by the PISINMF algorithm and other algorithms are compared with the USGS library, and spectral similarities are quantified using the SAD standard. In the results, kaolin is divided into two endmembers due to the endmember spectral variability.   Figure 17. Table 10 provides SAD values for each method (PISINMF, L1/2-NMF, MVC-NMF, MVES), and the best result is denoted by bold font. As can be seen from Table  10, the PISINMF algorithm has the highest number of endmember minimum SAD values, and the PISINMF algorithm has the smallest endmember SAD mean value. So the PISINMF algorithm is superior to other algorithms (L1/2-NMF, MVC-NMF and MVES).   The endmember signatures extracted by the PISINMF algorithm and other algorithms are compared with the USGS library, and spectral similarities are quantified using the SAD standard. In the results, kaolin is divided into two endmembers due to the endmember spectral variability.   Figure 17. Table 10 provides SAD values for each method (PISINMF, L1/2-NMF, MVC-NMF, MVES), and the best result is denoted by bold font. As can be seen from Table  10, the PISINMF algorithm has the highest number of endmember minimum SAD values, and the PISINMF algorithm has the smallest endmember SAD mean value. So the PISINMF algorithm is superior to other algorithms (L1/2-NMF, MVC-NMF and MVES).

Discussion
In this study, we investigate the validity of preserving intrinsic structure invariant in hyperspectral unmixing to improve spectral unmixing accuracy. The experimental results of synthetic data and real data prove that the proposed PISINMF algorithm is superior to the typical algorithms (i.e., MVC-NMF, L1/2-NMF, MVES, VCA). However, some issues still need to be resolved or improved for further research.
First, since the true ground object distribution is unknown, the intrinsic structure of the true ground object distribution cannot be modeled. However, we can make full use of hyperspectral imagery as a priori knowledge and model its intrinsic structure. Equation (8) which utilizes spatial information and spectral information of hyperspectral image is chosen as projection equation to model the intrinsic structure of hyperspectral image. If the spatial distance between the surrounding pixel and the central pixel is farther, the weaker the relation between them is. The relation between them is mapped by Equation (8), and the projection value is small. If the surrounding pixels are closer to the central pixel, the stronger the relation between them is. The relation between them is mapped by Equation (8), and the projection value is large. The result value of Equation (8) can reflect the distance information between the surrounding pixels and the central pixel to some extent. Meanwhile, R matrix based on Equation (8) can segment the homogeneous and transition areas of hyperspectral images. Segmentation image inherits the spatial distribution characteristics of the original hyperspectral image. Moreover, compared with Equation (3), Equation (8) utilizing spectral information better reveal the real situation of the distribution of ground objects, which has been explained in the previous examples. As mentioned earlier, graph regularizer which establishes a close link between hyperspectral image and decomposed abundance matrix is introduced in PISINMF algorithm model to keep intrinsic structure invariant between hyperspectral image and decomposed abundance matrix. Synthetic hyperspectral data experiments prove that graph regularizer can promote the intrinsic structure of the estimated abundance obtained by PISINMF closer to the intrinsic structure of true abundance.
Second, in the PISINMF algorithm model, how to choose the size of the local window is considered. Different sizes of local windows affect the spectral unmixing accuracy. In Figure 18, as the size of the local window becomes larger, the spectral unmixing accuracy is improved. But choosing a large size of the local window consumes more computation. Meanwhile, a large size of the local window means that the local window contains surrounding pixels that are farther away from the central pixel. According to Equation (8), if the spatial distance of the surrounding pixel from the central pixel is further, its effect on the central pixel is smaller. So there is no need to choose a large size of the local window. It is reasonable to select the size of the local window as 5 × 5 in PISINMF algorithm model.

Discussion
In this study, we investigate the validity of preserving intrinsic structure invariant in hyperspectral unmixing to improve spectral unmixing accuracy. The experimental results of synthetic data and real data prove that the proposed PISINMF algorithm is superior to the typical algorithms (i.e., MVC-NMF, L 1/2 -NMF, MVES, VCA). However, some issues still need to be resolved or improved for further research.
First, since the true ground object distribution is unknown, the intrinsic structure of the true ground object distribution cannot be modeled. However, we can make full use of hyperspectral imagery as a priori knowledge and model its intrinsic structure. Equation (8) which utilizes spatial information and spectral information of hyperspectral image is chosen as projection equation to model the intrinsic structure of hyperspectral image. If the spatial distance between the surrounding pixel and the central pixel is farther, the weaker the relation between them is. The relation between them is mapped by Equation (8), and the projection value is small. If the surrounding pixels are closer to the central pixel, the stronger the relation between them is. The relation between them is mapped by Equation (8), and the projection value is large. The result value of Equation (8) can reflect the distance information between the surrounding pixels and the central pixel to some extent. Meanwhile, R matrix based on Equation (8) can segment the homogeneous and transition areas of hyperspectral images. Segmentation image inherits the spatial distribution characteristics of the original hyperspectral image. Moreover, compared with Equation (3), Equation (8) utilizing spectral information better reveal the real situation of the distribution of ground objects, which has been explained in the previous examples.
As mentioned earlier, graph regularizer which establishes a close link between hyperspectral image and decomposed abundance matrix is introduced in PISINMF algorithm model to keep intrinsic structure invariant between hyperspectral image and decomposed abundance matrix. Synthetic hyperspectral data experiments prove that graph regularizer can promote the intrinsic structure of the estimated abundance obtained by PISINMF closer to the intrinsic structure of true abundance.
Second, in the PISINMF algorithm model, how to choose the size of the local window is considered. Different sizes of local windows affect the spectral unmixing accuracy. In Figure 18, as the size of the local window becomes larger, the spectral unmixing accuracy is improved. But choosing a large size of the local window consumes more computation. Meanwhile, a large size of the local window means that the local window contains surrounding pixels that are farther away from the central pixel. According to Equation (8), if the spatial distance of the surrounding pixel from the central pixel is further, its effect on the central pixel is smaller. So there is no need to choose a large size of the local window. It is reasonable to select the size of the local window as 5 × 5 in PISINMF algorithm model. Third, L1/2 regularizer as sparsity constraints has been introduced into the PISINMF algorithm model to improve the sparsity of endmember abundances. How to select the regularization parameter to control the impact of the sparsity constraints is an open theoretical issue. In [44], the regularization parameter is dependent on the sparsity criteria, which is widely adopted in other spectral unmixing methods [20]. The optimization method mentioned in [20] can only guarantee that the sparse solution converges to the local minimum. However, due to the non-convexity of the non-negative matrix function, there are multiple local minima. To avoid getting stuck in local minima, we refer to the research [31] to define the regularization parameter in the paper. Additionally, the experimental results show that it is superior to traditional sparsity criteria.

Conclusions
The PISINMF model, which can preserve intrinsic structure invariant in hyperspectral unmixing, is proposed in the paper. In the PISINMF model, a novel projection equation which utilizes spatial information and spectral information of hyperspectral image is adopted to model the intrinsic structure of hyperspectral image data. Compared with the heat kernel, a novel projection equation can better reveal the real situation of the distribution of ground objects. Graph regularizer establishes a close link between hyperspectral image and abundance matrix and is introduced in PISINMF algorithm model to keep intrinsic structure invariant. Compared with VCA and L1/2-NMF, the experiments of synthetic hyperspectral data show that the intrinsic structure of the estimated abundance obtained by PISINMF is closer to the intrinsic structure of true abundance. Besides, sparse constraints that reduces the risk of getting stuck in local minima in non-convex minimization computations is introduced into the PISINMF model, which enhances the sparsity of endmember abundances. The PISINMF model is compared with several classical hyperspectral unmixing models, including VCA, L1/2-NMF, MVC-NMF and MVES. The experimental results of synthetic  Third, L 1/2 regularizer as sparsity constraints has been introduced into the PISINMF algorithm model to improve the sparsity of endmember abundances. How to select the regularization parameter λ to control the impact of the sparsity constraints is an open theoretical issue. In [44], the regularization parameter λ is dependent on the sparsity criteria, which is widely adopted in other spectral unmixing methods [20]. The optimization method mentioned in [20] can only guarantee that the sparse solution converges to the local minimum. However, due to the non-convexity of the non-negative matrix function, there are multiple local minima. To avoid getting stuck in local minima, we refer to the research [31] to define the regularization parameter λ in the paper. Additionally, the experimental results show that it is superior to traditional sparsity criteria.

Conclusions
The PISINMF model, which can preserve intrinsic structure invariant in hyperspectral unmixing, is proposed in the paper. In the PISINMF model, a novel projection equation which utilizes spatial information and spectral information of hyperspectral image is adopted to model the intrinsic structure of hyperspectral image data. Compared with the heat kernel, a novel projection equation can better reveal the real situation of the distribution of ground objects. Graph regularizer establishes a close link between hyperspectral image and abundance matrix and is introduced in PISINMF algorithm model to keep intrinsic structure invariant. Compared with VCA and L 1/2 -NMF, the experiments of synthetic hyperspectral data show that the intrinsic structure of the estimated abundance obtained by PISINMF is closer to the intrinsic structure of true abundance. Besides, sparse constraints that reduces the risk of getting stuck in local minima in non-convex minimization computations is introduced into the PISINMF model, which enhances the sparsity of endmember abundances. The PISINMF model is compared with several classical hyperspectral unmixing models, including VCA, L 1/2 -NMF, MVC-NMF and MVES. The experimental results of synthetic hyperspectral data with different SNR levels, mixing coefficients and pixel numbers prove that the PISINMF algorithm model is superior to other classical methods using SAD and RMSE standards. Two real hyperspectral datasets are experimented to verify the effectiveness of the PISINMF model. Experimental results on two real hyperspectral datasets demonstrate that the PISINMF model outperforms other several classical algorithms. Specifically, the overall SAD accuracy of the PISINMF model is approximately 9-22% higher than that of the L 1/2 -NMF model.