Research on an Urban Building Area Extraction Method with High-Resolution PolSAR Imaging Based on Adaptive Neighborhood Selection Neighborhood Preserving Embedding

Abstract: Feature extraction of urban areas is one of the most important directions of polarimetric synthetic aperture radar (PolSAR) applications. High-resolution PolSAR images are characterized by high dimensionality and nonlinearity. Therefore, to find intrinsic features for target recognition, a building area extraction method for PolSAR images based on the Adaptive Neighborhood Selection Neighborhood Preserving Embedding (ANSNPE) algorithm is proposed. First, 52 features are extracted by using the gray level co-occurrence matrix (GLCM) and five polarization decomposition methods. The features are grouped into three feature sets of 20, 36, and 52 dimensions. Next, the ANSNPE algorithm is applied to the training samples, and the resulting projection matrix is applied to the test image to extract the new features. Lastly, a support vector machine (SVM) classifier and post-processing are used to extract the building area, and the accuracy is evaluated. Comparative experiments are conducted using RADARSAT-2 data, and the results show that the ANSNPE algorithm effectively extracts the building area and has better generalization ability: the projection matrix obtained from the training data can be applied directly to new samples, and the building area extraction accuracy is above 80%. The combination of polarization and texture features provides a wealth of information that is more conducive to the extraction of building areas.


Introduction
Rapid urbanization and the expansion of cities reflect the impact of human activities on the natural environment. Research on urban land use using remote sensing can reveal the relationship among economic development, human activity, and the natural environment [1]. The all-weather detection capability of Synthetic Aperture Radar (SAR) compensates for the shortcomings of optical remote sensing [2]. SAR has become an important means of remote sensing information extraction and plays an indispensable role in the field of earth observation. Traditionally, SAR image information is extracted by using differences in the backscatter intensity of targets [3], but this approach struggles when different objects share the same spectrum. With improvements in SAR image resolution, image detail has become more apparent, and the texture features of building areas, now more abundant, are applied to information extraction from high-resolution SAR images. Zhao, Gao, and Kuang [4] used the variation function to calculate the [...] and facial clustering, image indexing, and image classification [32-46]. Bao et al. presented the supervised NPE for feature extraction, using class labels to define a new distance for finding the k nearest neighbors [43]. Watanabe proposed a variant of the NPE algorithm that concerns the definition of the optimization problem [19]. Because data are unevenly distributed, the appropriate neighborhood varies from one sample point to another, so the NPE algorithm with a fixed k value has limitations.
To solve the above problems, an adaptive neighborhood selection method is introduced into the NPE algorithm, and a building extraction method for SAR images, Adaptive Neighborhood Selection Neighborhood Preserving Embedding (ANSNPE), is proposed. The ANSNPE algorithm is applied to polarimetric SAR feature extraction, the SVM algorithm is used to classify the extracted features, and different extraction algorithms are compared [27,47]. Section 2 introduces PolSAR image features. Section 3 presents the ANSNPE algorithm and the extraction framework. Section 4 reports the experiments and results, and Section 5 gives the conclusion and future work.

PolSAR Image Features
PolSAR data describe the polarization characteristics of ground targets [48]. The rich characteristics of PolSAR data can be used to extract buildings. In this paper, the features of the PolSAR image are divided into three categories: features based on the backscattering characteristics of the original image, texture features based on a statistical method, and features based on polarimetric target decomposition.

Backscattering Characteristics
In radar images, the echo intensity of an object determines its gray level. The backscattering coefficient of the SAR image carries important information about the radar echo. Therefore, the backscattering coefficients of the four polarization channels are extracted as the gray information of SAR images, as shown in Table 1.

Texture Features
As the resolution of SAR images improves, their spatial and texture information becomes more abundant. Texture is important image information and is widely used in remote sensing applications. In this paper, texture features are extracted using the classic gray level co-occurrence matrix (GLCM) statistical method. The principle is to calculate, within a given window, the probability that a pair of pixels separated by a distance d in a given direction takes a given pair of gray levels, and then to generate the co-occurrence matrix from these probabilities. To describe texture features more intuitively using the GLCM, Haralick et al. [49] derived second-order statistics from the co-occurrence matrix, and the typical texture parameters are shown in Table 2. Here, i and j denote the row and column of the pixel, d denotes the spatial distance, θ denotes the angle, K denotes the window size, σ denotes the variance, and P denotes the probability density.
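The GLCM computation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `glcm` and `glcm_features` are hypothetical, the window is assumed to be already quantized to integer gray levels, and the four statistics shown are common Haralick-style parameters chosen as an assumption, since the contents of Table 2 are not reproduced here.

```python
import numpy as np

def glcm(window, levels, dy, dx):
    """Normalized co-occurrence matrix P[i, j] for pixel pairs offset by (dy, dx).

    `window` must hold integer gray levels in the range [0, levels).
    """
    window = np.asarray(window)
    rows, cols = window.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[window[r, c], window[r2, c2]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def glcm_features(P):
    """A few texture statistics of a normalized GLCM (assumed selection)."""
    i, j = np.indices(P.shape)
    nonzero = P[P > 0]
    return {
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "energy": float(np.sum(P ** 2)),
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
        "homogeneity": float(np.sum(P / (1.0 + (i - j) ** 2))),
    }

# Horizontal neighbor pairs (distance 1, angle 0 degrees) in a 3x3 window
window = [[0, 0, 1],
          [0, 0, 1],
          [2, 2, 2]]
P = glcm(window, levels=3, dy=0, dx=1)
features = glcm_features(P)
```

In practice the window would slide over the whole image and the statistics would be stored per pixel, one layer per texture parameter.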

Polarization Characteristics
Polarization information is unique to polarimetric synthetic aperture radar. The polarization target decomposition technique helps to reveal the scattering information of ground targets by using the scattering matrix, and polarization characteristics can be obtained using polarization target decomposition theory.
Polarization decomposition is mainly divided into four categories [50]: the first is the two-component decomposition method based on Kennaugh's matrix K (as in Huynen, Holm and Barnes, Yang); the second is the method of decomposing the covariance matrix C3 or coherency matrix T3 based on a scattering model (as in Freeman and Durden, Yamaguchi, Dong); the third is eigenvector or eigenvalue analysis based on the covariance matrix C3 or the coherency matrix T3 (as in Cloude, Holm, van Zyl); and the fourth is the decomposition method based on the coherent scattering matrix S (as in Krogager, Pauli, etc.). In this paper, five polarization decomposition algorithms are used, as shown in Table 3. The Freeman-Durden decomposition, based on a physical model of the radar scattering echo, yields three basic scattering components. The Yamaguchi decomposition indicates the power of three kinds of scattering mechanisms: volume scattering, surface scattering, and double-bounce scattering. The Cloude decomposition provides the entropy, average scattering angle, and anisotropy features. The Pauli decomposition obtains target generation factors based on the T3 matrix. The Krogager decomposition expresses the scattering matrix S as the sum of three coherent components with specific physical meanings, corresponding to sphere, dihedral, and helix scattering.
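Two of the decompositions named above have compact textbook forms, sketched below for a single pixel. This is an illustration under stated assumptions rather than the processing chain used in the paper: the function names are hypothetical, reciprocity (S_HV = S_VH) is assumed for the Pauli components, and the Cloude parameters follow the standard eigenvalue definitions of entropy H, anisotropy A, and mean alpha angle.

```python
import numpy as np

def pauli_powers(s_hh, s_hv, s_vv):
    """Powers of the three Pauli components of a scattering matrix (reciprocal case)."""
    a = (s_hh + s_vv) / np.sqrt(2)   # odd-bounce (surface-like) component
    b = (s_hh - s_vv) / np.sqrt(2)   # even-bounce (dihedral-like) component
    c = np.sqrt(2) * s_hv            # cross-polarized component
    return np.abs(a) ** 2, np.abs(b) ** 2, np.abs(c) ** 2

def cloude_h_a_alpha(T3):
    """Entropy, anisotropy, and mean alpha angle (degrees) of a 3x3 coherency matrix."""
    eigvals, eigvecs = np.linalg.eigh(T3)          # eigenvalues in ascending order
    eigvals = np.clip(eigvals[::-1], 0.0, None)    # descending, negatives clipped
    eigvecs = eigvecs[:, ::-1]
    p = eigvals / eigvals.sum()                    # pseudo-probabilities
    nz = p[p > 0]
    entropy = float(-np.sum(nz * np.log(nz)) / np.log(3))
    denom = eigvals[1] + eigvals[2]
    anisotropy = float((eigvals[1] - eigvals[2]) / denom) if denom > 0 else 0.0
    alphas = np.arccos(np.clip(np.abs(eigvecs[0, :]), 0.0, 1.0))
    mean_alpha = float(np.degrees(np.sum(p * alphas)))
    return entropy, anisotropy, mean_alpha

# A pixel dominated by odd-bounce scattering: S_HH = S_VV, no cross-polarization
surface, dihedral, cross = pauli_powers(1.0 + 0j, 0.0 + 0j, 1.0 + 0j)
```

The three Pauli powers sum to the total power |S_HH|^2 + 2|S_HV|^2 + |S_VV|^2, which gives a quick sanity check on real data.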

ANSNPE Algorithm
The Neighborhood Preserving Embedding (NPE) algorithm is a linear approximation of the LLE algorithm that aims to preserve the local manifold structure of the data. The premise is that, within a local neighborhood, a point can be represented linearly by the points around it. The objective is for the weight coefficients of the linear representation of adjacent sample points in the original data space to remain consistent in the projected space [14]. Assume that the training samples are represented by a high-dimensional feature set $X = [X_1, \ldots, X_n] \in \mathbb{R}^{m \times n}$, where m is the number of features (the space dimension) and n is the number of samples. The intrinsic characteristic of the samples is a low-dimensional manifold structure embedded in the m-dimensional space. The low-dimensional output features are represented as $Y = [Y_1, \ldots, Y_n] \in \mathbb{R}^{d \times n}$. The major steps are as follows: the k nearest neighbors of each sample Xi are found, and Xi is reconstructed affinely from these neighborhood points; to minimize the reconstruction error, the optimized objective function is designed as Equation (1).
In the NPE algorithm, any sample point is represented by a linear reconstruction from its k neighboring points, and each of those neighbors can in turn be reconstructed linearly from the other k-1 points. If the value of k is selected reasonably [51], the linear reconstruction error will be small. However, if the value of k is not reasonable, the linear reconstruction error will be larger. In practical applications, the distribution density of the data generally varies, and the number of neighbor points should vary accordingly [52]. In the NPE algorithm, a large reconstruction error arises easily when the same fixed value of k is used for every point.
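For orientation, the standard NPE procedure can be sketched in a few lines of NumPy. This follows the textbook formulation (affine reconstruction weights that sum to one, followed by a generalized eigenproblem) and is not taken from the paper's own equations; the function names, the regularization constants, and the choice to store samples as rows (the transpose of the m × n convention in the text) are assumptions made for the example.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of every row of X (self excluded)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)
    return [np.argsort(row)[:k] for row in d2]

def reconstruction_weights(X, neighbors, reg=1e-3):
    """Affine reconstruction weights; each row of W sums to one."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i, idx in enumerate(neighbors):
        Z = X[idx] - X[i]                        # neighbors centered on sample i
        G = Z @ Z.T                              # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(len(idx))  # keep G invertible
        w = np.linalg.solve(G, np.ones(len(idx)))
        W[i, idx] = w / w.sum()
    return W

def npe_projection(X, W, d, reg=1e-6):
    """Projection matrix (m x d) from the d smallest generalized eigenvalues."""
    n, m = X.shape
    I_W = np.eye(n) - W
    A = X.T @ (I_W.T @ I_W) @ X                  # reconstruction-error term
    B = X.T @ X + reg * np.eye(m)                # scale constraint, regularized
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    C = L_inv @ A @ L_inv.T                      # reduce to a symmetric eigenproblem
    C = (C + C.T) / 2
    _, V = np.linalg.eigh(C)                     # ascending eigenvalues
    return L_inv.T @ V[:, :d]

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))                     # 40 samples, 5 features (synthetic)
W = reconstruction_weights(X, knn_indices(X, k=6))
P = npe_projection(X, W, d=2)
Y = X @ P                                        # low-dimensional features, 40 x 2
```

Because the projection matrix is linear, new samples are embedded by the same product `X_new @ P`, which is the property the method relies on when the matrix learned from training samples is applied to a test image.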
Therefore, an adaptive neighborhood selection neighborhood preserving embedding (ANSNPE) algorithm is proposed by introducing the adaptive neighborhood selection method. The algorithm is shown in Figure 1, and the major steps are as follows. First, the initial neighbor parameter k, the minimum neighbor parameter kmin, the maximum neighbor parameter kmax, and the small event selection probability p are set, and the initial k nearest neighbors of each sample Xi (Xi=[xij], j=1,…,k) are found. Second, the neighbors are selected adaptively. The mean Euclidean distance Di and the mean manifold distance Dm of the sample point Xi are calculated, and the parameter ki of sample Xi is obtained from Di and Dm (Equations (2), (3), and (4)). If ki < k, Di is larger and the neighbor data of Xi are sparse; the (1-p)(k-ki) [53] neighbors with the largest Euclidean distances are then eliminated from the neighborhood. If ki > k, Di is smaller and the data are denser; the existing neighborhood of Xi is retained, and the (1-p)(ki-k) remaining points with the smallest Euclidean distances are added to it.
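The adaptive selection step can be illustrated with the sketch below. Equations (2) to (4) are not reproduced in this text, so the rule used here is an assumption: Dm is approximated by the average of Di over all samples, and ki is taken as k·Dm/Di clipped to [kmin, kmax]. The function name `adaptive_neighbors` is hypothetical, and the defaults mirror the parameter values reported in the experiments (k = 15, kmin = 1, kmax = 30, p = 0.3).

```python
import numpy as np

def adaptive_neighbors(X, k=15, k_min=1, k_max=30, p=0.3):
    """Per-sample neighbor lists whose size adapts to the local density.

    ASSUMPTION: k_i = k * D_m / D_i, with D_m the mean of all D_i. The paper's
    Equations (2)-(4) may define these quantities differently.
    """
    n = X.shape[0]
    dist = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))
    np.fill_diagonal(dist, np.inf)               # a sample is not its own neighbor
    order = np.argsort(dist, axis=1)             # nearest neighbors first
    k = min(k, n - 1)
    D_i = np.take_along_axis(dist, order[:, :k], axis=1).mean(axis=1)
    D_m = D_i.mean()
    neighbors = []
    for i in range(n):
        ratio = D_m / max(float(D_i[i]), 1e-12)
        k_i = int(np.clip(int(round(k * ratio)), k_min, k_max))
        change = int(round((1 - p) * abs(k_i - k)))
        # Sparse region (k_i < k): drop the farthest neighbors.
        # Dense region (k_i > k): add the next-nearest points.
        size = k - change if k_i < k else k + change
        size = int(np.clip(size, k_min, min(k_max, n - 1)))
        neighbors.append(order[i, :size])
    return neighbors

# Two 4x4 grids: a tightly packed cluster and a widely spaced one
grid = np.array([[i, j] for i in range(4) for j in range(4)], dtype=float)
X = np.vstack([0.01 * grid, grid + 100.0])
nbrs = adaptive_neighbors(X, k=5, k_min=1, k_max=10, p=0.3)
```

In this toy example the samples in the dense grid end up with more than the initial five neighbors and those in the sparse grid with fewer, which is the qualitative behavior the text describes.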

Extraction Framework
The three categories of features and their combinations can be written as three feature sets, F1={fi}i=1,…,20, F2={fi}i=1,…,36, and F3={fi}i=1,…,52, and the procedure is shown in Figure 2. ANSNPE is applied to the three feature sets to extract new features. The new features are then used as the input of the SVM classifier to obtain preliminary building area extraction results. Finally, the final extracted building area is obtained by post-processing the preliminary results.
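The framework above (project, classify, post-process) can be sketched as a small function. Several details are assumptions made for illustration: the function names are hypothetical, the classifier is any object with `fit` and `predict` methods (an SVM such as scikit-learn's `SVC` would fit this interface), label 1 is taken to mean "building", and a majority filter stands in for the post-processing step, which the text does not specify.

```python
import numpy as np

def majority_filter(mask, window=3):
    """Assumed post-processing: keep a pixel only if most of its window is building."""
    pad = window // 2
    padded = np.pad(mask.astype(int), pad, mode="edge")
    votes = np.zeros(mask.shape, dtype=int)
    for dy in range(window):
        for dx in range(window):
            votes += padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return votes * 2 > window * window

def extract_building_mask(train_features, train_labels, image_features,
                          projection, classifier, window=3):
    """Return (initial, final) building masks for a rows x cols x m feature image."""
    rows, cols, m = image_features.shape
    # Train on the projected (low-dimensional) training features
    classifier.fit(train_features @ projection, train_labels)
    # Apply the same projection matrix to every pixel of the test image
    pixels = image_features.reshape(rows * cols, m) @ projection
    initial = np.asarray(classifier.predict(pixels)).reshape(rows, cols).astype(bool)
    return initial, majority_filter(initial, window)
```

A call would look like `extract_building_mask(X_train, y_train, feature_image, P, SVC(kernel="rbf"))`, where `P` is a projection matrix learned on the training samples; returning both masks makes it possible to report initial and final accuracy separately.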

Experiments and Results
This section presents four experiments. The first examines the selection of the parameter d by comparing the extraction accuracy for various values of d. The second, on building extraction, compares the proposed method with the original NPE dimensionality reduction method, with linear dimensionality reduction by principal component analysis (PCA), and with no dimensionality reduction. The four approaches are applied to each of the three feature sets, and the overall accuracy (OA) is used to evaluate the performance of the different methods. The third is an applicability analysis of ANSNPE, which examines the influence of training sample selection. The last uses GF3 data to demonstrate the applicability of the proposed method to different data sources.

Data
The RADARSAT-2 and GF3 images of Suzhou, acquired in 2017, are subsets of C-band PolSAR data. Detailed information on RADARSAT-2 and GF3 is listed in Table 4. Figure 3a shows the amplitude image of RADARSAT-2, and Figure 3b shows the corresponding Google Earth image. The size of the image is 800×800.

Discussion of the Parameter d
Estimating the intrinsic dimensionality is an open problem, and there is no established approach for confirming it (Tu et al., 2010). In this paper, the parameter d is determined through experiments. Figure 4 shows the extraction accuracy under various choices of d. In Figure 4, blue refers to the detection rate (DR), red refers to the overall accuracy (OA), and orange refers to the false alarm rate (FAR). The range of d is from 2 to 20 for F1 and F2 and from 2 to 20 for F3. As shown in Figure 4, for the F1 dataset, OA performs best when d is between 2 and 7, and FAR remains stable once d reaches 4. Moreover, the processing time increases with d. Consequently, the parameter d for F1 is set as 4. For the F2 dataset, d = 4 gives the best performance. For the F3 dataset, d = 8 gives the best performance. Therefore, d is set as 4, 4, and 8 in the experiments for F1, F2, and F3, respectively.
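The three indicators plotted against d can be computed from a predicted mask and a reference mask as sketched below. The paper does not state its formulas, so the definitions here are assumptions based on common usage: DR is the fraction of reference building pixels that are detected, FAR is the fraction of non-building pixels labeled as building, and OA is the fraction of correctly labeled pixels. The function name is hypothetical.

```python
import numpy as np

def extraction_metrics(predicted, reference):
    """DR, FAR, and OA for binary building masks (definitions assumed, see lead-in)."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.sum(predicted & reference))      # building correctly extracted
    fp = int(np.sum(predicted & ~reference))     # non-building extracted as building
    fn = int(np.sum(~predicted & reference))     # building missed
    tn = int(np.sum(~predicted & ~reference))    # non-building correctly rejected
    dr = tp / (tp + fn) if tp + fn else 0.0
    far = fp / (fp + tn) if fp + tn else 0.0
    oa = (tp + tn) / predicted.size
    return {"DR": dr, "FAR": far, "OA": oa}

metrics = extraction_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Selecting d then amounts to repeating the projection and classification for each candidate value and comparing these three numbers, as in the experiment above.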

Experiments of Building Extraction
In this section, the RADARSAT-2 image is used to evaluate the proposed algorithm. A total of 6400 training samples are selected from a subset of the RADARSAT-2 image to extract features and obtain the projection matrix. For the comparative experiments, the other extraction methods use the same training samples. For the ANSNPE algorithm, the parameter k is set as 15, kmin as 1, kmax as 30, and p as 0.3. Figures 5-7 show the final building area extraction results of all methods. The extraction results are compared with the true values obtained from the visual interpretation of optical images. Table 5 gives the initial detection rate (IDR), the initial overall accuracy (IOA), the final detection rate (DR), and the overall accuracy (OA) of the extracted results.
Table 5 shows that the precision of building area extraction by the ANSNPE algorithm is higher than that of building area extraction directly from the original features.

Figure 5. Classification results of three feature sets (panels: ANSNPE+SVM, NPE+SVM, PCA+SVM, and SVM for F1, F2, and F3). Yellow shows areas incorrectly extracted as building areas; green shows the properly extracted building areas; red shows the building areas that were not extracted; and gray shows the non-built areas.
Table 5. Extraction accuracy of the three feature sets.
For the F1 dataset, building area extraction by the ANSNPE algorithm performs much better than that of the other algorithms. The NPE algorithm does not extract the building area. About 40 percent of the building area extracted by the other two algorithms is actually water. This illustrates that the three algorithms fail to find and preserve the intrinsic pattern structure of the SAR image. In the experiment on the F2 dataset, the result of ANSNPE is better. More of the building area, especially the low building area, is not extracted by the NPE and PCA algorithms. For the F3 dataset, the same phenomenon appears. The OA of SVM without feature extraction is higher than that of NPE and PCA; however, it is lower than that of the proposed ANSNPE algorithm.

Applicability Analysis
Because of the generalization ability of the ANSNPE algorithm, the selection of training samples may affect the results on test samples by generating features that cannot distinguish the building area. Therefore, depending on the type of ground objects, five training samples with different combinations of building areas, vegetation, and water are selected for study, as shown in Figure 6. The projection matrix is applied to the test image, and the results are shown in Figure 7 and Table 6. The best results are in bold. The average accuracy of F1+ANSNPE+SVM is 71.42%, and the standard deviation is 12.41; the average accuracy of F2+ANSNPE+SVM is 78.82%, and the standard deviation is 5.11; and the average accuracy of F3+ANSNPE+SVM is 78.05%, and the standard deviation is 2.67.
Figure 7. Classification results of the three feature sets. Yellow shows areas incorrectly extracted as building areas; green shows the properly extracted building areas; red shows the building areas that were not extracted; and gray shows the non-built areas.
There are some differences among the OA values of F1+ANSNPE+SVM, F2+ANSNPE+SVM, and F3+ANSNPE+SVM. The OA of F1 is lower because some water is falsely detected as building area. F2 and F3 have similar OA values, but the DR of F3 is more than 95%, which is 8% higher than that of F2. Some building areas are not detected with F2. Undetected building areas are mainly concentrated where buildings are relatively low and the texture features are not obvious. In the three experiments, the error is mainly caused by roads adjacent to building areas: the top displacement of high buildings extends their brightness onto the road, and the classifier then mistakenly identifies the road as a building area.
In the applicability analysis, different training samples are applied to the test image, and the accuracy of building area extraction differs accordingly. The average OA of F1 is lower and fluctuates greatly; the feature set composed only of polarization features yields low accuracy because the information it provides is not sufficient. In the F2 experiments, the OA is lowest when the training sample is train 1. The reason is that the buildings in train 1 are low and their texture features are not obvious. When the resulting projection matrix is applied to the high-dimensional feature set composed of texture features, it cannot extract good features for the building area. The average OA of F3, however, is not much different from that of F2, and its fluctuation is smaller.

GF3 Data
The proposed method is applied to the GF3 data. Figure 8 shows the backscattering image of GF3. Figure 9 and Table 7 give the extraction results. In Figure 9, F3 performs better than F1 and F2. As can be seen from Table 7, the highest OA obtained by the proposed method with F3 and F1 is 88.32%. The highest DR is 74.14%, which is about 4% and 8% higher than that of F1 and F2, respectively.

As shown in Figure 8, the top displacement of buildings appears in the horizontal and vertical polarization images, which influences the results of F1: non-building areas are extracted as building areas. However, there are no falsely extracted building areas at the same place in the results of F2. In addition, F3 can extract building areas that are missed with F1 and F2. For GF3, it is therefore better to use all features, including the polarization and texture features, to extract the building area.

Conclusion
PolSAR images are characterized by high dimensionality and nonlinearity. Embedding high-dimensional SAR data in a low-dimensional space and describing the intrinsic geometric structure of SAR data can improve the accuracy of SAR information extraction. Therefore, it is of important theoretical and practical value to study methods of building area extraction from high-resolution PolSAR images. This paper analyzes the principle of the NPE algorithm and proposes a building area extraction method using high-resolution PolSAR images based on the ANSNPE algorithm. First, the gray level co-occurrence matrix and various polarization decomposition methods are used to construct high-dimensional feature sets of 20, 36, and 52 dimensions. Next, a low-dimensional projection matrix is obtained using the ANSNPE algorithm, and the dimensionality of the high-dimensional features is reduced. Lastly, the SVM classification method is used to extract the building area, and the detection rate and the overall accuracy are calculated. The comparative experiments show that the accuracy of building area extraction based on the ANSNPE algorithm is over 80%, which is higher than that obtained when using the original high-dimensional features. The applicability analysis shows that different training samples affect the accuracy of building area extraction and that polarization decomposition features provide rich information, complementary to texture features, that is beneficial to the extraction of building areas. Experiments on GF3 data verify that the proposed method is also applicable and that it is better to use all features when extracting building areas with GF3. However, some building areas are still not extracted. The next step will be to study how to reduce extraction errors and to consider applying an improved ANSNPE algorithm to larger areas.
Moreover, our aim is also to extract buildings more accurately and more rapidly.