Integrating Aerial LiDAR and Very-High-Resolution Images for Urban Functional Zone Mapping

Abstract: This study presents a new approach for Urban Functional Zone (UFZ) mapping that integrates two-dimensional (2D) Urban Structure Parameters (USPs), three-dimensional (3D) USPs, and the spatial patterns of land covers. The approach consists of two steps. First, we extracted various features, i.e., spectral, textural, and geometrical features and 3D USPs, from very-high-resolution (VHR) images and light detection and ranging (LiDAR) point clouds; multi-classifiers (MLCs), i.e., Random Forest, K-Nearest Neighbor, and Linear Discriminant Analysis, were then used to perform land cover mapping with the optimized features. Second, based on the land cover classification results, we extracted 2D and 3D USPs for the different land covers and used the MLCs to classify UFZs. Results for the northern part of Brooklyn, New York, USA, show that the approach yielded excellent UFZ mapping accuracy, with an overall accuracy of 91.9%. Moreover, we demonstrate that 3D USPs considerably improve the classification accuracies of UFZs and land covers, by 6.4% and 3.0%, respectively.


Introduction
Urban Functional Zones (UFZs; acronyms used throughout the manuscript are listed in Supplementary Material Table S1) refer to the different functional divisions of urban land, e.g., commercial, residential, industrial, and park zones [1]. Different UFZs often feature different architectural environments and are composed of various land covers. However, previous studies have paid much more attention to land cover mapping than to large-scale UFZ classification [1]. As basic spatial units in cities, UFZs are vital for urban planners and managers conducting urban-related applications, e.g., the investigation of land surface temperatures, landscape patterns, urban planning, and urban ecological modeling [2][3][4][5]. Therefore, the detection of UFZs is a basis for urban management and provides a better understanding of urban spatial structures [6,7].
Very-High-Resolution (VHR) images represent urban surfaces in fine spatial detail, capturing subtle differences in spectral and textural records, and can thus be utilized for UFZ mapping [8][9][10]. Many authors have used VHR data to perform UFZ classification [11][12][13]. Zhang et al. [11] proposed a Hierarchical Semantic Cognition (HSC) method that establishes four semantic layers, i.e., visual features, object categories, spatial patterns of objects, and zone functions, and used their hierarchical relations to identify UFZs; they found that the HSC method yields good UFZ mapping accuracy (an overall accuracy of 90.8%). Further, Zhang et al. [1] developed a top-down feedback method, Inverse Hierarchical Semantic Cognition (IHSC), to optimize the initial HSC results, and found that the IHSC increased the Overall Accuracy (OA) from 84.0% to 90.5%. Recently, authors have utilized Point of Interest (POI) data for UFZ mapping. For instance, Hu et al. [12] generated parcel information from road networks and integrated Landsat 8 Operational Land Imager images and POI data to classify parcels into eight functional zones (Level I, e.g., residential, commercial, industrial, and institutional areas) and 16 land covers (Level II); the OA of the Level I classification was 81.04%. In addition, Zhou et al. [13] proposed a Super Object-Convolutional Neural Network (SO-CNN) method for UFZ classification. They used POI data to identify four UFZs, i.e., commercial office, urban green, industrial warehouse, and residential zones, in Hangzhou, China, and found that the classification results were refined, with an OA of 91.1%. However, previous studies did not explore the impacts of three-dimensional (3D) urban structure parameters (USPs), e.g., building height (BH) and sky view factor (SVF), on UFZ detection.
It is noteworthy that 3D USPs play distinctive roles in describing urban layouts and constructions [14]. For example, an investigation of the northern part of Brooklyn, New York City, USA, shows that industrial zones are usually located on open ground surfaces, resulting in higher SVF values than in residential and commercial zones (Figure 1a). Generally, industrial zones feature low-rise, large buildings and thus often have low BHs (Figure 1b). In addition, different Street Aspect Ratios (SAR) and Floor Area Ratios (FAR) were observed among the functional zones (Figure 1c,d). Hence, it is important to consider 3D USPs for UFZ mapping. Recently, Light Detection And Ranging (LiDAR) technology has risen in prominence, as it provides a fast and straightforward way to acquire the height information of underlying surfaces [15]. Thus, LiDAR technology can be regarded as a feasible approach for extracting 3D USPs [14]. Note that it is challenging to extract UFZs directly from VHR images, because spectral, textural, and geometrical features are only effective for segmenting objects, not for identifying UFZs [11]. As one of the essential elements of UFZs, the composition and configuration of land covers exert significant influence on measuring and analyzing UFZs [16]. Thus, it is important to perform land cover classification before the subsequent UFZ mapping. In this study, a new approach that integrates multiple machine learning algorithms and 3D USPs is introduced for UFZ mapping. The objectives of the study are:

• To integrate multi-machine learning algorithms and various features, primarily 3D USPs, for enhancing land cover mapping;
• To perform UFZ mapping by coupling 3D USPs and multi-classifiers (MLCs);
• To evaluate the influence of 3D USPs on the classifications of both land covers and UFZs.

Study Area
Our study area lies in the northern part of Brooklyn, New York City, USA (Figure 2), with an area of 6.12 km². The eastern and northern parts of the area are along the East River. The area has 8779 buildings, 8146 parcels, and 493 blocks. In addition, the area includes four typical UFZs, i.e., commercial, residential, industrial, and park zones [10].


Data
The primary data used in the study includes VHR images, LiDAR point clouds, road networks, and land-lot information.

• VHR images
The high-resolution orthophotos of the study area were acquired from the New York City Office of Information Technology Services [17] (Figure 2b). The images consist of four bands (blue, green, red, and near-infrared) with a 0.3 m (1.0 ft) spatial resolution, which provides rich spectral information for the classification of land covers and UFZs.

•
LiDAR point clouds
The point cloud data were acquired in May 2017 and collected using Cessna 402C and Cessna Caravan 208B aircraft equipped with Leica ALS80 and Riegl VQ-880-G laser systems. The data were released by the New York City Department of Information Technology and Telecommunications (NYCDITT) [18]. Furthermore, in order to generate an accurate Digital Surface Model (DSM), we eliminated noise points (i.e., outliers and isolated points) using the "StatisticalOutlierRemoval" filter of Point Cloud Library 1.6, and a voxel grid filter was then applied to reduce redundant points. After filtering, the density of the point clouds is about 8.0 points/m² [14].
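The two filtering steps can be sketched in Python with NumPy. This is a simplified stand-in for the Point Cloud Library filters the study actually used (the neighbor search below is brute-force O(N²), suitable only for small clouds); the function names and parameter defaults are illustrative, not the PCL API.

```python
import numpy as np

def statistical_outlier_removal(points, k=8, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbors
    exceeds (global mean + std_ratio * global std) of those means."""
    diff = points[:, None, :] - points[None, :, :]      # pairwise vectors
    dists = np.linalg.norm(diff, axis=2)                # pairwise distances
    np.fill_diagonal(dists, np.inf)                     # ignore self-distance
    knn = np.sort(dists, axis=1)[:, :k]                 # k smallest distances
    mean_d = knn.mean(axis=1)                           # per-point mean k-NN distance
    keep = mean_d <= mean_d.mean() + std_ratio * mean_d.std()
    return points[keep]

def voxel_grid_filter(points, voxel_size=0.5):
    """Keep one representative point (the centroid) per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    counts = np.bincount(inverse).astype(float)
    out = np.zeros((inverse.max() + 1, points.shape[1]))
    for dim in range(points.shape[1]):
        out[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
    return out
```

In practice the two filters are chained: outlier removal first, then voxel downsampling, which is what produces the roughly uniform ~8 points/m² density reported above.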

• Road networks and land-lot information
The road networks were obtained from OpenStreetMap (OSM) in 2017. We used the networks to delineate block boundaries, and each block was regarded as a basic unit for UFZ mapping [19,20]. In addition, the land-lot data, released by NYCDITT, provide basic information on land functions; they could thus be used to label the blocks' functional attributes and provide the ground reference. Finally, 493 zones were generated from the road networks and land-lot data.

Figure 3 shows the workflow of the UFZ classification, which includes two key steps:
(1) Land cover mapping: a method ensembling multiple machine learning algorithms and 3D USPs was proposed for enhancing land cover mapping. In detail, we extracted the spectral, textural, and geometrical features and 3D USPs of the objects determined by multi-resolution image segmentation; feature optimization was then performed using the mean decrease impurity method; finally, we selected the best of three machine learning classifiers, i.e., Random Forest (RF), K-Nearest Neighbor (KNN), and Linear Discriminant Analysis (LDA), to label the objects.
(2) UFZ mapping: a new method integrating two-dimensional (2D) USPs, 3D USPs, and MLCs was utilized for UFZ mapping. In detail, we extracted the 2D USPs from the land cover mapping results and the 3D USPs from the LiDAR point clouds; the Nearest Neighbor Index (NNI) was then used to identify three spatial patterns of land covers, i.e., random distribution, aggregation, and uniform distribution; finally, we chose the best-performing classifier to conduct the UFZ classification.

Multi-Feature Extraction
To avoid the "salt and pepper" phenomenon in land cover classification, multi-resolution segmentation was first used to segment the VHR images into multi-scale objects, performed using eCognition software. In particular, the Estimation of Scale Parameter (ESP) tool [21][22][23][24][25][26] was used to determine the appropriate scale of the land cover objects. Second, we extracted four categories of features, i.e., spectral, textural, and geometrical features and 3D USPs, for the subsequent land cover labeling (Table 1). The spectral features included the spectral bands (i.e., red, blue, green, and near-infrared), Normalized Difference Vegetation Index (NDVI), Ratio Vegetation Index (RVI), Difference Vegetation Index (DVI), Normalized Difference Water Index (NDWI), Mean i, Brightness, Ratio, Mean diff. to neighbor (Mean. diff.), and Standard Deviation (Std. Dev.). The textural features were revealed by different indices from the Gray-Level Co-occurrence Matrix (GLCM), i.e., angular second moment, variance, contrast, entropy, energy, correlation, inverse differential moment, dissimilarity, and homogeneity. The geometrical features revealed the geometrical characteristics of objects, i.e., area, border length, length/width, compactness, asymmetry, border index, density, elliptic fit, main direction, and shape index. The spectral, textural, and geometrical features are widely used in object-based image research [27,28]. In particular, we selected three features as 3D USPs, namely the Digital Surface Model (DSM), Sky View Factor (SVF), and flatness, all of which were extracted from the LiDAR point clouds.
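The band-ratio spectral indices listed above follow standard definitions and can be computed per object (or per pixel) in a few lines of NumPy. The function below is a minimal sketch, assuming the four bands are available as float arrays; the small epsilon guarding division by zero is an implementation choice, not part of the index definitions.

```python
import numpy as np

def spectral_indices(red, green, nir):
    """Compute NDVI, RVI, DVI, and NDWI from float band arrays."""
    eps = 1e-10                                  # guard against division by zero
    ndvi = (nir - red) / (nir + red + eps)       # vegetation vigor, in [-1, 1]
    rvi = nir / (red + eps)                      # ratio vegetation index
    dvi = nir - red                              # difference vegetation index
    ndwi = (green - nir) / (green + nir + eps)   # highlights open water
    return ndvi, rvi, dvi, ndwi
```

For object-based classification, these arrays would then be averaged within each segmented object to yield one feature value per object.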

The definitions of the remaining geometrical features and the 3D USPs in Table 1 are as follows:
• Asymmetry: the ratio of the short axis to the long axis of an approximate ellipse of the image object.
• Border Index: the ratio of the perimeter of the image object to the perimeter of the object's Minimum Bounding Rectangle (MBR).
• Density: the ratio of the area to the radius of the image object.
• Elliptic Fit: the fitting degree of an ellipse fit.
• Main Direction: the eigenvectors of the covariance matrix of the image object.
• Shape Index: the ratio of the perimeter to four times the side length.
• Digital Surface Model (DSM, Figure 2c): produced using an interpolation algorithm (i.e., a binning approach) with all points [39].
• Sky View Factor (SVF, Figure 2d): the visible degree of the sky at ground level; its value varies from 0 (the sky is not visible) to 1 (the sky is completely visible) [40].
• Flatness (details in Supplementary Material Figure S1): derived from the DSM; refers to the flatness of the non-ground points, which were generated using the "lasground" filter operation in LAStools [14,41].

Based on the spectral responses of features in the VHR images and field investigation, six land covers were identified: buildings, trees, grasses, soil lands, impervious grounds, and water bodies. Table 2 provides the details of the samples used for the different land covers; the training samples were randomly selected using the "model_selection" tool in the sklearn package. In total, 8/10 of the samples were randomly selected for training, and the remainder were used for accuracy assessment.

Feature Optimization
Feature optimization provides a better understanding of feature importance and is crucial for improving classification accuracy. We used the Gini Index (GI) (i.e., Mean Decrease Impurity) to measure the importance of each variable. The GI was calculated from the structure of the RF classifier, representing the average degree of error reduction by each feature, and can be defined as [42,43]:

GI(P) = 1 − Σ_k P_k²

where GI(P) is the GI value, k indexes the classes, and P_k is the probability that a sample belongs to the kth class. Generally, a higher GI value means the corresponding variable exerts more influence on the classification. Details of the feature optimization steps can be found in [44].
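The Gini impurity at a tree node can be written directly from the formula above. The sketch below is a minimal stand-in: in practice, the per-feature mean decrease impurity would come from an implementation such as scikit-learn's `RandomForestClassifier.feature_importances_`, which averages the impurity reduction attributable to each feature over all splits and trees.

```python
import numpy as np

def gini_index(labels):
    """GI(P) = 1 - sum_k P_k^2 for the class labels in one node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()        # empirical class probabilities P_k
    return 1.0 - np.sum(p ** 2)
```

A pure node (one class) gives GI = 0, while a uniform mix over K classes gives the maximum GI of 1 − 1/K, which is why a large impurity drop at a split signals an informative feature.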
(1) The RF classifier consists of multiple decision trees and uses the different trees to train on the samples and predict results. Each tree yields its own prediction, and RF aggregates the votes across the trees to produce the final result. The RF model can therefore significantly improve classification results compared with a single decision tree. In addition, RF performs well in the presence of outliers and noise and can effectively avoid overfitting [3,49].
(2) The KNN classifier weighs the votes of an instance's neighbors when classifying a new instance. It labels objects according to these weights and is more suitable than other classifiers when the class distributions overlap in the sample set [50].
(3) The LDA classifier projects the training samples onto a line so that projections of the same class lie as close together as possible while projections of different classes lie as far apart as possible. The classifier assumes that all the data follow a normal distribution and can reduce the dimensionality of the original data. It computes the probability density of each class, and the classification result depends on the maximum probability over the categories [51,52].
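The three-classifier comparison can be sketched with scikit-learn. The data below are synthetic stand-ins for the study's object-feature table (the real features and samples are not available here); the dataset shape and hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in: 600 objects x 24 optimized features, 4 land cover classes.
X, y = make_classification(n_samples=600, n_features=24, n_informative=10,
                           n_classes=4, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

classifiers = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LDA": LinearDiscriminantAnalysis(),
}
# Fit each classifier and record its overall accuracy on the held-out set.
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
best = max(scores, key=scores.get)   # keep the best-performing classifier
```

The same fit-and-score loop applies at both stages of the workflow: once for labeling land cover objects and once for labeling UFZ blocks.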

Classification Post-Processing and Accuracy Evaluation
Classification post-processing is a pivotal step for optimizing the classification result [53,54]. We applied post-processing rules to the objects (Table 3). In addition, we used a confusion matrix to evaluate the accuracy of the land cover classification [55]. Three indices, Overall Accuracy (OA), Producer's Accuracy (PA), and User's Accuracy (UA), were utilized to quantify the accuracy of the classification results.
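The three accuracy indices follow directly from the confusion matrix. A minimal sketch, assuming the matrix rows are reference classes and the columns are predicted classes:

```python
import numpy as np

def accuracy_metrics(cm):
    """OA, per-class PA (recall), and per-class UA (precision) from a
    confusion matrix with reference classes in rows, predictions in columns."""
    cm = np.asarray(cm, dtype=float)
    oa = np.trace(cm) / cm.sum()        # correctly labeled / all samples
    pa = np.diag(cm) / cm.sum(axis=1)   # producer's accuracy per class
    ua = np.diag(cm) / cm.sum(axis=0)   # user's accuracy per class
    return oa, pa, ua
```

Note the row/column convention matters: transposing the matrix swaps PA and UA.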

Feature Extraction
We utilized three feature categories, 2D USPs, 3D USPs, and the spatial patterns of land covers, in the subsequent UFZ classification (Table 4). The 2D USPs describe the landscape composition of the different UFZs and comprise building coverage (BC), tree coverage (TC), grass coverage (GC), soil coverage (SC), impervious surface coverage at ground level (ISC_G), and water coverage (WC). We extracted the 3D USPs by integrating the land cover classification results and the LiDAR point clouds. For example, we obtained building labels from the land cover classification results and then calculated the average building height using the height information from the LiDAR data [56]. The 3D USPs include sky view factor (SVF), building height (BH), street aspect ratio (SAR), and floor area ratio (FAR).
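The coverage-type 2D USPs (BC, TC, GC, SC, ISC_G, WC) reduce to per-block area fractions of each land cover class. A minimal sketch, assuming the classified land cover labels of one block are available as an array (the function name and class encoding are illustrative):

```python
import numpy as np

def coverage_fractions(labels, classes):
    """Per-block coverage fraction of each land cover class, given the
    classified label array (e.g., a label raster clipped to one block)."""
    flat = np.asarray(labels).ravel()
    total = flat.size
    return {c: float(np.sum(flat == c)) / total for c in classes}
```

By construction the fractions over all present classes sum to 1, so each block yields one coverage value per land cover as a UFZ feature.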
Previous studies have demonstrated that the NNI can measure the spatial patterns of land covers and thus aid UFZ mapping [57][58][59]. To avoid biases arising when different UFZs share the same landscape composition or 3D USPs, we introduced the Nearest Neighbor Index (NNI) to improve the UFZ classification using the different spatial patterns of land covers [60]. We considered three typical spatial patterns of land covers, i.e., random distribution, aggregation, and uniform distribution. The NNI can be defined as:

NNI = d̄_min / E(d_min), with E(d_min) = 1 / (2√(n/A))

where d_min is the distance between a specific land cover object (e.g., a building) and its nearest object of the same class, d̄_min is the average of d_min within a block, and E(d_min) is the expectation of d_min under complete spatial randomness, calculated from the area of the block (A) and the number of objects (n). Random distribution, aggregation, and uniform distribution are identified when NNI = 1, NNI < 1, and NNI > 1, respectively.
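The NNI computation above (the Clark–Evans statistic) can be sketched directly. This is a simplified brute-force version for small object counts; centroid extraction from the classified objects is assumed to have already happened.

```python
import numpy as np

def nearest_neighbor_index(points, block_area):
    """Clark-Evans NNI: observed mean nearest-neighbor distance divided by
    its expectation under complete spatial randomness, E = 1/(2*sqrt(n/A))."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    dists = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dists, np.inf)                 # ignore self-distance
    d_min_mean = dists.min(axis=1).mean()           # observed mean of d_min
    expected = 1.0 / (2.0 * np.sqrt(n / block_area))
    return d_min_mean / expected
```

Objects packed into one corner of a block yield NNI < 1 (aggregation), while a regular grid yields NNI > 1 (uniform distribution), matching the three patterns defined above.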

Experiment Design
We performed the feature optimization using the GI method and selected the best combination of features for UFZ mapping (details can be found in Section 3.2.2). To further analyze the influence of 3D USPs on UFZ mapping, we designed seven experiments with different variable combinations based on the feature optimization results (Table 5), with the aim of recognizing the most significant feature category for UFZ mapping. Exps. a, b, and c used only the 2D USPs, 3D USPs, and spatial pattern features, respectively, to examine the ability of each single category in UFZ mapping. Exp. d combined the 2D and 3D USPs to identify their combined effect on UFZ mapping. Exp. e used the 2D USPs and spatial pattern features as input variables, and Exp. f combined the 3D USPs and spatial pattern features. Exps. f and g differed in that Exp. f did not contain the 2D USPs, whereas Exp. g included all the categories. In addition, 307 zones were selected as training samples and used for classification, and the others were reserved for accuracy assessment.

Figure 4 shows the importance ranking of the 44 variables for land cover classification. DSM reached the highest variable importance (GI = 0.057), and SVF also yielded a high GI value (0.042), indicating that 3D USPs exert considerable influence on land cover classification. The GI value of NDVI was 0.050, suggesting that vegetation coverage is vital for land cover mapping. The spectral features performed well in the classification (all their GI values were above 0.020); in contrast, the GI values of the textural features were relatively low (most below 0.02). The geometrical features, e.g., asymmetry, area, and compactness, had limited influence on the classification. In summary, the variable importance ranked from high to low was 3D USPs > spectral features > geometrical features > textural features.

We further tested the OA values associated with different numbers of input variables (Figure 5). The OA increased and then stabilized as the number of input variables rose. The highest OA (87.4%) was observed when 24 input variables were used; thus, we selected these 24 variables to perform the subsequent land cover mapping (details of the 24 optimal variables can be found in Supplementary Material Table S2).
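The importance-ranked forward selection behind Figure 5 can be sketched with scikit-learn. The data below are a synthetic stand-in for the study's samples (sizes, tree counts, and the cross-validation setup are illustrative assumptions, not the paper's exact protocol):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the object-feature table.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           random_state=0)

# Rank features by mean decrease impurity (GI) from a fitted forest.
rf = RandomForestClassifier(n_estimators=60, random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]

# Grow the feature set in importance order and track cross-validated OA.
oa_curve = []
for k in range(1, len(order) + 1):
    cols = order[:k]
    oa = cross_val_score(
        RandomForestClassifier(n_estimators=60, random_state=0),
        X[:, cols], y, cv=3).mean()
    oa_curve.append(oa)

best_k = int(np.argmax(oa_curve)) + 1   # number of features to keep
```

The resulting `oa_curve` typically rises quickly and then plateaus, which is the behavior the study reports when selecting the 24-variable combination.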
Remote Sens. 2021, 13, x FOR PEER REVIEW

Table 6 shows the land cover classification accuracy using the three classifiers, i.e., RF, KNN, and LDA. The RF classifier had the highest accuracy, with an OA value of 87.4% (details of its confusion matrix in Supplementary Material Table S3). In contrast, the lowest accuracy was obtained with the LDA classifier (OA = 74.0%), which failed to distinguish trees from grasses and buildings from ground-level impervious surfaces (Supplementary Material Table S4). The KNN classifier's OA value was 77.4%, and it confused trees with grasses (Supplementary Material Table S5).

Figure 6 shows the details of the land cover classification using the different MLCs. The RF classifier better captured small trees (location g in Figure 6b) and effectively identified building boundaries (location h in Figure 6b). In contrast, the KNN classifier could not depict the building boundaries clearly and misclassified the shapes of trees (location g in Figure 6c). The LDA classifier failed to capture small trees, and some building boundaries were poorly detected (location h in Figure 6d).

Table 7 compares the land-cover classification using both 2D and 3D USPs against that using only 2D USPs (confusion matrices of each classifier in Supplementary Material Tables S6-S8). The accuracies of all classifiers increased after adding 3D USPs: the OA values of the RF, KNN, and LDA classifiers increased by 3.1%, 2.3%, and 3.5%, respectively. Moreover, after including 3D USPs, the PA values of all land covers increased, though to different degrees. Trees yielded the largest gain, with an average PA increase of 6.7%, likely because the height information helps distinguish trees from grasses. The 3D USPs also increased the PA value of buildings (by 3.3%), possibly because the SVF can distinguish building roofs (featuring a high SVF) from impervious grounds (featuring a low SVF). In summary, 3D USPs significantly increased the accuracy of land cover mapping, consistent with previous studies [14,64].

Figure 7 compares the spatial details of the land-cover classification using 2D-3D USPs against those using only 2D USPs. The classification results using 2D-3D USPs were better. As shown in Figure 7a, with only 2D USPs the trees in the blue circle were mistakenly classified as impervious ground and the building boundaries were not captured (red rectangle); both were captured after adding 3D USPs (Figure 7b), indicating that 3D USPs can help distinguish trees from impervious grounds and delineate building boundaries. In addition, with only 2D USPs some impervious grounds were mistakenly recognized as buildings (red circle in Figure 7e) and some trees were poorly captured (blue circle in Figure 7e). In contrast, Figure 7f shows that the impervious grounds and trees were correctly classified using both 2D and 3D USPs.

Figure 8 demonstrates the variable importance ranking of the 16 variables in UFZ mapping. The most critical variable was SVF, which reached the highest GI value of 0.138. SAR also had a high GI value (0.131) and performed well, indicating the importance of 3D USPs in UFZ mapping (note that all the GI values of the 3D USPs were higher than 0.06). In addition, TC reached the second-highest GI value (0.134), suggesting that tree coverage plays an essential role in UFZ classification. The other 2D USPs (i.e., BC, ISC_G, and GC) were also crucial for UFZ mapping, as their GI values exceeded 0.06. In contrast, the spatial patterns of land covers ranked relatively low, with all their GI values below 0.060.

Figure 8. Variable importance ranking for UFZ mapping revealed by the RF algorithm.

Results of Feature Optimization
We also separately tested the variable importance of each category of features. Figure 9a–c show the results of the variable importance of 2D USPs, 3D USPs, and spatial patterns. The most crucial variable in 2D USPs was TC (Figure 9a), which reached a high GI value of 0.32, indicating that tree coverage plays an essential role in UFZ classification. Besides, building coverage yielded a high GI value of 0.25. However, SC and WC showed relatively low variable importance, with GI values of less than 0.02. Figure 9b shows that SVF (GI value = 0.29) and SAR (GI value = 0.28) yielded good performance in UFZ mapping. Notably, all the GI values of 3D USPs exceeded 0.20, suggesting that 3D USPs are indispensable for UFZ mapping. Figure 9c shows that the top two important spatial pattern features were TNNI (GI value = 0.30) and BNNI (GI value = 0.29). In addition, the variable importance ranking of spatial patterns from high to low was TNNI > BNNI > INNI > GNNI > SNNI > WNNI.

Figure 10 shows the changes in the OA value with different numbers of input variables. As shown, the OA value increased rapidly when the number of variables changed from 1 to 4, and it reached the highest accuracy of 91.9% when the number of input variables was 14. Therefore, the 14 variables of the optimal combination were selected to perform the UFZ mapping (the details of the 14 optimal variables can be found in Supplementary Material Table S9).

Results of UFZ Mapping

Table 8 shows the accuracy of UFZ classification revealed by the different classifiers (details concerning the confusion matrices of the three classifiers are shown in Supplementary Material Tables S10–S12). The RF classifier generated the most accurate UFZ mapping, with an OA value of 91.9%. For example, for commercial zones, the highest accuracy was found with the RF classifier (PA value = 88.9%), whereas lower accuracies were observed with the KNN (PA value = 64.4%) and LDA (PA value = 80.0%) algorithms. In summary, the RF classifier produced more accurate results and showed greater advantages in identifying UFZs than the KNN and LDA classifiers. The results of UFZ classification revealed by the RF, KNN, and LDA classifiers are shown in Figure 11a–c. We selected three sub-regions to show their differences in spatial classification details (Figure 11d–f). Figure 11d gives an example of a residential site. We found that the site could be well recognized using the RF classifier; however, it was wrongly classified as an industrial zone using the KNN or LDA classifiers. Likewise, Figure 11e gives an example of a commercial site: the RF classifier accurately classified the place, yet the KNN classifier mistakenly classified it as a residential zone and the LDA classifier wrongly recognized it as an industrial zone. Moreover, Figure 11f shows an industrial zone with irregularly distributed buildings and some large trucks. The RF classifier recognized the site well, but it was incorrectly classified as a residential zone using the KNN or LDA classifiers. Thus, based on the above analyses, the RF classifier was more suitable for UFZ classification.

Figure 12 shows the UFZ classification accuracies of experiments using different feature combinations (details are shown in Supplementary Material Table S13). As shown, firstly, Exp. g yielded the highest accuracy, with an OA value of 91.9%, suggesting the advantages of integrating 2D USPs, 3D USPs, and spatial pattern features of land covers. In contrast, Exp. c produced the worst classification results (OA value = 67.7%), suggesting the disadvantages of using only spatial patterns of land covers for UFZ mapping. Secondly, using combinations of different categories was generally better than using a single category (e.g., Exps. a vs. d and Exps. b vs. e). However, the OA value of Exp. a was higher than that of Exp. f, suggesting that 2D USPs are indispensable for UFZ classification.
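The variable-optimization procedure behind Figure 10 (rank candidates by GI, then re-evaluate OA as the top-k variables are added one by one) can be sketched as follows; the synthetic data and subset sizes are placeholders, not the study's 44-feature table:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: two informative variables, the rest noise.
rng = np.random.default_rng(1)
X = rng.random((300, 8))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# Step 1: rank all candidate variables by Gini importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]  # descending GI

# Step 2: grow the subset one top-ranked variable at a time and
# record the cross-validated overall accuracy (cf. Figure 10).
oa_curve = []
for k in range(1, X.shape[1] + 1):
    subset = X[:, order[:k]]
    oa = cross_val_score(
        RandomForestClassifier(n_estimators=200, random_state=0),
        subset, y, cv=5).mean()
    oa_curve.append(oa)

# Step 3: keep the subset size that maximizes OA.
best_k = int(np.argmax(oa_curve)) + 1
print(f"best subset size: {best_k}")
```

Using cross-validated OA for the curve, rather than training accuracy, keeps the subset choice from overfitting the selection step.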

To explore the impacts of 3D USPs on UFZ mapping, we compared two pairs of experiments (i.e., Exps. a vs. d and Exps. e vs. g) in Figure 13. As shown, the experiments with both 2D USPs and 3D USPs produced more accurate classification results than those with only 2D USPs. For example, block a in Figure 13 is an industrial zone with low-rise and large buildings; however, it was incorrectly classified as a park in Exp. a. Similarly, residential zone b was incorrectly identified as a commercial zone in Exp. a. Notably, blocks a and b were correctly classified in Exp. d by using 3D USPs. The possible reason is that SAR and FAR can help with UFZ mapping: SARs in industrial zones are higher than those in parks (Figure 1c), and FARs in residential zones are higher than those in commercial zones (Figure 1d). In addition, blocks c and d were correctly identified as industrial and commercial zones, respectively, in Exp. g, whereas they were wrongly labeled as residential zones in Exp. e, which did not contain 3D USPs. Our results verified the importance of 3D USPs for UFZ classification.
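The OA and producer's accuracy (PA) values used throughout the classifier comparison derive from per-classifier confusion matrices. A small illustration of both metrics; the label counts below are made up for demonstration and are not the study's confusion matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical reference vs. predicted UFZ labels for one classifier.
classes = ["commercial", "residential", "industrial", "park"]
y_true = (["commercial"] * 45 + ["residential"] * 60
          + ["industrial"] * 30 + ["park"] * 15)
y_pred = (["commercial"] * 40 + ["residential"] * 5    # 40/45 commercial correct
          + ["residential"] * 55 + ["industrial"] * 5  # 55/60 residential correct
          + ["industrial"] * 28 + ["park"] * 2         # 28/30 industrial correct
          + ["park"] * 15)                             # all parks correct

# Rows follow the reference labels, columns the predictions.
cm = confusion_matrix(y_true, y_pred, labels=classes)

# Overall accuracy: correctly classified samples over all samples.
oa = np.trace(cm) / cm.sum()
# Producer's accuracy per class: diagonal over row (reference) totals.
pa = np.diag(cm) / cm.sum(axis=1)

print(f"OA = {oa:.3f}")
for name, v in zip(classes, pa):
    print(f"PA({name}) = {v:.3f}")
```

PA is the complement of omission error, so a low PA for commercial zones (as observed for KNN) means many reference commercial blocks were assigned to other classes.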

Discussion
Previous studies used only 2D features, e.g., spectral, textural, and geometrical features, to perform UFZ mapping [11][12][13], whereas our approach first considers the potential of 3D USPs, i.e., BH, SVF, FAR, and SAR, for UFZ mapping. Introducing 3D USPs is important for UFZ mapping since different UFZs exhibit essentially different 3D structures (Figure 1). Our results verified that 3D USPs could considerably improve the OA values of UFZ and land cover mapping by 6.4% and 3.0%, respectively (Table 8 and Figure 12).
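As an illustration of how block-level 3D USPs can be derived, the sketch below computes BH and FAR for one hypothetical block from building footprints and LiDAR-derived heights. The storey-height approximation and all numbers are assumptions for demonstration, not the paper's exact definitions:

```python
import numpy as np

# Hypothetical per-building records for one block: footprint area (m^2)
# and LiDAR-derived building height (m). Values are illustrative only.
footprint_area = np.array([400.0, 650.0, 300.0])
building_height = np.array([12.0, 30.0, 9.0])
block_area = 10_000.0   # block (zone) area in m^2
storey_height = 3.0     # assumed average storey height

# BH: mean building height within the block.
bh = building_height.mean()

# FAR: total floor area / block area, approximating the floor count as
# height / storey_height (a common simplification).
floors = np.maximum(1, np.round(building_height / storey_height))
far = (footprint_area * floors).sum() / block_area

print(f"BH = {bh:.1f} m, FAR = {far:.2f}")
```

Computed this way, tall residential towers raise FAR sharply even on small footprints, which is consistent with the observation above that FARs in residential zones exceed those in commercial zones of this study area.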
To better illustrate the advantages of our approach, we further compared it with relevant literature in terms of data sources and evaluated results, i.e., OA values (Table 9). First, our approach obtained more accurate results than most of the considered strategies (OA value = 91.9%). Second, the main idea of UFZ mapping is to fully use the differences in spectral features [12], textural features [64], the spatial patterns of objects [1,11], and 3D USPs among various UFZs. Note that different strategies and indicators may incur extra costs: our approach needs VHR images and LiDAR point clouds, so the 3D data entail additional expenses. Yet, nowadays, accurate 3D data are increasingly accessible, as low-cost LiDAR and terrestrial LiDAR systems are becoming more affordable [14]. Based on the above discussion, our approach is suitable for accurate UFZ mapping due to its better accuracy and affordable costs.

As stated earlier, this study achieved accurate UFZ classification by considering 3D USPs. However, several limitations need to be noted. First, this study highlights the distinctive role of 3D USPs in UFZ mapping; however, cities composed of complex and varied landscapes may require more 3D variables, e.g., 3D spatial patterns, to obtain accurate UFZ classification results. Second, UFZs are relevant to socioeconomic activities, and people usually conduct different activities in different UFZs. Open social data related to human activity (e.g., POIs, public transport data, and mobile phone positioning data) are therefore valuable for UFZ mapping [11,12,64]. Future studies should explore additional available features and integrate open social data for more accurate UFZ mapping.

Conclusions
In this study, we proposed a new approach for UFZ mapping by integrating 2D USPs, 3D USPs, and the spatial patterns of land covers. The approach was verified in Brooklyn, New York City, USA, and we evaluated the influence of 3D USPs on the classifications of land covers and UFZs. The main conclusions are as follows.
Our results show that the approach yielded excellent UFZ mapping, with an overall accuracy of 91.9%. The RF classifier produced the highest accuracies for both land cover and UFZ classifications. In addition, 3D USPs considerably improved the classification accuracy of land cover (increasing the average OA value by 3.0%) and aided UFZ recognition (increasing the OA value by 6.4%). Moreover, we verified that DSM was the most critical of the 44 variables in land cover mapping, with a GI value of 0.057, and that SVF was the most important variable for UFZ classification, with a GI value of 0.138. Our research provides a new perspective for UFZ mapping and highlights that 3D USPs should be considered in future studies that perform UFZ mapping.