Building Extraction from Airborne LiDAR Data Based on Min-Cut and Improved Post-Processing

Abstract: Building extraction from LiDAR data has been an active research area, but it remains difficult to discriminate between buildings and vegetation in complex urban scenes. A building extraction method from LiDAR data based on minimum cut (min-cut) and improved post-processing is proposed. To discriminate building points on intersecting roof planes from vegetation, a point feature based on the variance of normal vectors estimated via the low-rank subspace clustering (LRSC) technique is proposed, and non-ground points are separated into two subsets based on min-cut after filtering. Then, the building extraction results are refined via improved post-processing using restricted region growing and the constraints of height, maximum intersection angle and consistency. The maximum intersection angle constraint removes large non-building point clusters with narrow width, such as greenbelts along streets. Contextual information and a consistency constraint are both used to eliminate inhomogeneity. Experiments on seven datasets, including five provided by the International Society for Photogrammetry and Remote Sensing (ISPRS), one with high-density point data and one with dense buildings, verify that most buildings, even those with curved roofs, are successfully extracted by the proposed method, with over 94.1% completeness and at least 89.8% correctness at the per-area level. In addition, the proposed point feature significantly outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes. Hence, the extracted building points can be used in various applications.


Introduction
Automatic building extraction from remote sensing data is a prerequisite for applications such as three-dimensional (3D) building reconstruction, urban planning, disaster assessment, and updating digital maps and geographic information system (GIS) databases [1][2][3][4]. Since its emergence, airborne Light Detection and Ranging (LiDAR) has provided an alternative way to extract buildings, owing to its high-density and high-accuracy point data.
In the field of LiDAR, building extraction refers to separating building points from other points, which can also be termed building detection [1,5]. Due to the complexity of building structures and urban scenes, extracting buildings from LiDAR data remains a challenge, and many studies have addressed the problem. However, the size of the pixels and the geometrical properties greatly influence the final results [39]. The energy minimization approach, such as the graph cuts algorithm, is a global solution that formulates building extraction as an optimization problem [12,15]. In [12], a graph was constructed using the pixels of a generated Digital Surface Model (DSM) image, and features of both points and grids were used to construct the energy function; pixels were then labelled by minimizing the energy function. In [41], the graph was constructed using voxels after voxelization of the point cloud. Although satisfying results can be obtained by [12,41], the data structure is changed and accuracy may decrease in the process of rasterization or voxelization, because not all points within the same pixel or voxel belong to a building, and post-processing steps are needed to remove zigzags along building boundaries [12]. In [15], the graph cuts algorithm was used to optimize the classification results of a JointBoost classifier, because contextual information between points was not considered by the classifier. However, the results of the JointBoost classifier were used as initial foreground and background seed points to define the energy function. This is impractical for automatic building extraction, because no building seeds are available before extraction.
Based on the above statements, current studies on building extraction still face two problems [24,27]: (1) Large numbers of point features are used to extract buildings in most existing algorithms [13,24,37]. Too many features increase the computational cost of calculating point features and reduce algorithmic efficiency. Moreover, not all features are suitable for building extraction, and some may decrease accuracy [27]. (2) Due to the variety of roof types and the complexity of scenes, contextual information is useful for classification [12,24]. However, contextual information is ignored by most methods, leading to inhomogeneous results and an increased post-processing workload.
Therefore, it is of great importance to continue building extraction research using airborne LiDAR data. The main objectives of this study are (1) to design an effective point feature to discriminate building points on intersecting roof planes from vegetation, and to evaluate its performance against an existing point feature under different scenes and parameter settings; (2) to realize building extraction with only a few point features, avoiding excessive time spent on feature calculation; and (3) to introduce contextual information into the algorithm to reduce inhomogeneity and improve the accuracy of the extraction results.
The main contributions of this work can be summarized as follows: (1) A point feature based on the variance of accurate normal vectors estimated via the low-rank subspace clustering (LRSC) technique is proposed to discriminate building points on intersecting roof planes from vegetation. The proposed feature significantly outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes. (2) The proposed maximum intersection angle constraint effectively removes large non-building point clusters with narrow width, such as greenbelt points along streets, overcoming the difficulty that area-based methods face in setting an area threshold. (3) Contextual information and a consistency constraint are both used to eliminate inhomogeneity, which benefits building extraction. (4) Unlike most previous building extraction methods, only two point features are used in the proposed method, which decreases the computational cost of calculating point features and improves algorithmic efficiency.

Methodology
The proposed method includes three main steps: (1) outlier removal and filtering; (2) point feature calculation, graph construction and cut; and (3) improved post-processing. The algorithm workflow is shown in Figure 1.

Outliers Removal and Filtering
The original point cloud data provided by an airborne LiDAR system contain outliers introduced during data collection for various reasons, such as the multipath effect. It is necessary to remove outliers to alleviate their effects on LiDAR data processing. The "StatisticalOutlierRemoval" tool implemented in the Point Cloud Library (PCL) is applied to the original point cloud [42]. In this algorithm, the mean distance from each point to all of its neighbors is calculated, and points whose mean distance lies outside a range defined by the global distance mean and standard deviation are removed [43]. After that, the denoised points are classified into ground and non-ground subsets by progressive TIN densification (PTD) [44]. PTD has been widely used in both academia and engineering applications due to its accuracy and efficiency, and it has been embedded in commercial software such as TerraSolid and LiDAR_Suite [44,45].
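The outlier removal step can be sketched as follows. The function name `remove_statistical_outliers`, the brute-force distance computation and the `std_ratio` parameter are illustrative simplifications of the PCL tool (which uses a KD-tree and exposes analogous `MeanK`/`StddevMulThresh` parameters), not the paper's implementation:

```python
import numpy as np

def remove_statistical_outliers(points, k=8, std_ratio=1.0):
    """Remove points whose mean k-NN distance lies outside
    [global_mean - std_ratio*std, global_mean + std_ratio*std]."""
    # Pairwise distances (brute force; a KD-tree would be used in practice).
    diff = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diff, axis=2)
    # Mean distance to the k nearest neighbours (excluding the point itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    mean_d = knn.mean(axis=1)
    mu, sigma = mean_d.mean(), mean_d.std()
    keep = np.abs(mean_d - mu) <= std_ratio * sigma
    return points[keep], keep
```

A point far from the global distribution of mean neighbor distances is flagged and dropped, while points inside the band are retained.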

Point Features Calculation and Normalization
After filtering, points are classified as ground and non-ground points. Non-ground points are generated from laser echoes off buildings, vegetation and other man-made or natural objects (e.g., vehicles, wires) [15,37,40], and they are the input for the subsequent building extraction process. In general, buildings are considered to be composed of planar patches while vegetation is non-planar, meaning that building points have flat characteristics and vegetation points are rough in a local area. Therefore, two features based on these characteristics are used in the proposed method.

Curvature Feature (σ)

Let $P = \{p_1, p_2, \ldots, p_n\}$ denote the non-ground points, and let $N_i = \{p_j \mid p_j \in P,\ p_j \in kNN(p_i)\}$ represent the set of $k$-nearest neighbors of $p_i$. The covariance matrix $C_i$ is constructed from $p_i$ and its neighborhood $N_i$, defined as follows:

$$C_i = \frac{1}{k} \sum_{p_j \in N_i} (p_j - \bar{p})(p_j - \bar{p})^{T}$$

where $\bar{p} = \frac{1}{k} \sum_{p_j \in N_i} p_j$ is the centroid of all points in $N_i$, and $k$ is the number of points in $N_i$. After that, the three eigenvalues $\lambda_1, \lambda_2, \lambda_3$ ($\lambda_1 \le \lambda_2 \le \lambda_3$) of $C_i$ are calculated via eigendecomposition. The curvature feature $\sigma$ is calculated from the three eigenvalues as follows [27]:

$$\sigma = \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3}$$

The curvature feature $\sigma$ describes the flatness of a surface and is widely used for plane extraction [27,37,46]. Although the number of neighbors $k$ influences $\sigma$, the influence is minimal [47]. Empirically, $k$ is set to 15 to calculate $\sigma$, by reference to [12].
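The per-point curvature computation can be sketched as follows; the function name `curvature` and the brute-force neighbor search are illustrative (a KD-tree would be used in practice):

```python
import numpy as np

def curvature(points, k=15):
    """Per-point curvature lambda1/(lambda1+lambda2+lambda3) computed from
    the covariance matrix of each point's k nearest neighbours."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    sigma = np.empty(n)
    for i in range(n):
        nbr = points[np.argsort(d[i])[:k]]   # the point itself + k-1 neighbours
        c = np.cov(nbr.T, bias=True)         # 3x3 covariance matrix
        w = np.sort(np.linalg.eigvalsh(c))   # lambda1 <= lambda2 <= lambda3
        sigma[i] = w[0] / w.sum()
    return sigma
```

For a perfectly planar neighborhood the smallest eigenvalue is zero, so $\sigma \approx 0$; rough (vegetation-like) neighborhoods yield clearly larger values.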

Variance of LRSC-Based Normal Vector Feature (V)
The curvature of points on intersecting building roof planes is much larger than that of points on flat roofs, so the curvature feature alone suggests that these building points belong to vegetation [12,37]. To discriminate buildings from vegetation, a feature based on the variance of normal vectors calculated via principal component analysis (PCA) was proposed in [12]. However, normal vectors estimated via PCA without modification are inaccurate for points on intersecting building roof planes, because their neighborhoods contain points from different planes [48]. Figure 2a,b illustrate PCA-based normal vectors of a synthetic cube and of building roofs [49]; it can be seen that normal vectors of points on intersecting planes are inaccurate. As a result, features based on these normal vectors, such as the variance of normal vector direction [12] and the normal vector angle distribution histogram [50], are also inaccurate. Considering this problem, an accurate normal vector estimation method via low-rank subspace clustering (LRSC) [48] is introduced. This normal vector estimation technique has already been used for automatic building roof segmentation from LiDAR data, with satisfying experimental results [51]. The algorithm is composed of three main steps. First, points around sharp and smooth regions are identified by covariance analysis of their neighborhoods, and their initial normal vectors are estimated via PCA. Second, the normal vectors of a point's neighbors are used as prior knowledge to construct a guiding matrix. Third, neighborhoods are segmented into several isotropic sub-neighborhoods by LRSC with the guiding matrix, and a consistent sub-neighborhood is used to estimate each point's final normal vector. Figure 2c,d illustrate the LRSC-based normal vectors of a cube and building roof. Compared with the PCA-based normal vectors, the LRSC-based normal vectors of points in sharp regions are more accurate and reasonable.
Therefore, the feature $V$, based on the variance of LRSC-based normal vectors, is proposed to discriminate buildings from vegetation. Its calculation includes the following sub-steps:
Step 1: Normal vectors of all points are calculated via the LRSC technique, and the angle $\alpha$ between each normal vector and the vertical direction ($\vec{z} = (0, 0, 1)$ in 3D space) is computed. It should be noted that the normal vector of a point may point in the opposite direction; therefore, when $\alpha$ is larger than $\pi/2$, the corresponding angle is set to $\pi - \alpha$.
Step 2: For a point P and the angles $A = \{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ of its neighbors $N = \{p_1, p_2, \ldots, p_k\}$, divide the range $[0, \pi/2]$ into $m$ equal bins to construct an $m$-dimensional histogram. The number of angles falling within each bin is taken as the value of that bin in the histogram.
Step 3: Calculate the variance of the histogram as $V$ for P with the following formula:

$$V = \frac{1}{m} \sum_{i=1}^{m} \left( h_i - \frac{k}{m} \right)^2$$

where $k$ is the number of neighbors and $h_i$ is the number of angles falling within the $i$-th bin.
Step 4: Repeat Steps 2 and 3 until $V$ is calculated for all points. According to [12], the parameter $k$ is empirically set to 60 for each point. The number of bins $m$ has little impact in the range of 5 to 10 and is set to 6 in the proposed method. Figure 3a,b illustrate the feature of [12] and ours, respectively, in Area 1 of Vaihingen [49], with points rendered by the value of normalized $V$. Compared with [12], the proposed $V$ of points on intersecting building roof planes with large slopes is almost consistent with that of points on smooth roof planes (green rectangles in Figure 3c), and the difference in the proposed $V$ between buildings and vegetation is more obvious (yellow rectangles in Figure 3c). In addition, discriminating some small complex buildings composed of multiple small planar patches from vegetation is a challenging task for the feature of [12] (blue rectangles in Figure 3c).
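Steps 2 and 3 for a single point can be sketched as follows; the exact histogram mean in the original formula is garbled, so the uniform expected count $k/m$ is assumed here, and the function name is illustrative:

```python
import numpy as np

def normal_angle_variance(alphas, m=6):
    """Variance-of-histogram feature V for one point, given the angles
    between its neighbours' normals and the vertical direction."""
    # Fold angles into [0, pi/2] (Step 1's handling of flipped normals).
    alphas = np.where(alphas > np.pi / 2, np.pi - alphas, alphas)
    # m-bin histogram over [0, pi/2] (Step 2).
    h, _ = np.histogram(alphas, bins=m, range=(0.0, np.pi / 2))
    k = len(alphas)
    # Variance of the bin counts around the uniform count k/m (Step 3).
    return np.mean((h - k / m) ** 2)
```

A flat roof concentrates all angles in one bin and yields a large $V$ spread across few bins, whereas vegetation spreads angles across many bins and yields a small $V$; for example, 60 identical angles with $m = 6$ give $V = 500$, while 60 evenly spread angles give $V = 0$.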

Feature Normalization
Considering that the ranges of values of the two features are significantly different, a normalization step is needed. A logistic function is employed to normalize the two features, defined as:

$$f(x) = \frac{1}{1 + e^{-s(x - t)}}$$

where $t$ is the feature threshold and $s$ controls the steepness of the logistic curve. In practice, building extraction results are significantly influenced by $t$ and minimally influenced by $s$ [12]. Therefore, $s$ is set to $-35$ and $2.0$ for $\sigma$ and $V$, respectively, according to [12], whereas the specific values of $t$ for $\sigma$ and $V$ will be analyzed and discussed in later sections.
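The normalization above is a standard logistic mapping and can be sketched as follows (the function name is illustrative):

```python
import numpy as np

def logistic_normalize(x, t, s):
    """Map a raw feature value to (0, 1); t is the feature threshold and
    s controls the steepness (and direction) of the logistic curve."""
    return 1.0 / (1.0 + np.exp(-s * (np.asarray(x, dtype=float) - t)))
```

At $x = t$ the output is exactly 0.5; a negative $s$ (as used for $\sigma$) flips the curve so that small raw values map to large normalized values.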

Graph Construction and Cut
Point segmentation can be viewed as a labeling problem, which is to assign a label from a set of labels to each point by minimizing an objective function [52,53]. For building extraction, the labeling problem is to assign a building or non-building label to each non-ground point. Generally, a typical representation of the objective function is an energy function with two terms: a data term and a smooth term. Among the optimization methods for minimizing such energy functions, the graph cuts approach [54] based on minimum cut (min-cut) shows good performance and is commonly used in many applications, such as image segmentation [55][56][57] and point cloud segmentation [12,15,41]. Thus, the graph cuts algorithm is adopted to minimize the energy function.
In [54], the graph is composed of sets of nodes and edges. It should be noted that there are two special terminal nodes, called the source and the sink, which represent the "foreground" and "background" labels. In the proposed method, each non-ground point denotes a node. The energy function $E(L)$ is defined as follows:

$$E(L) = \sum_{p \in P} D_p(l_p) + \sum_{(p,q) \in N} V_{p,q}(l_p, l_q)$$

where the first term $\sum_{p \in P} D_p(l_p)$ is the data term and $D_p(l_p)$ is the penalty for assigning label $l_p$ to node $p$; the value of $D_p(l_p)$ measures how well label $l_p$ fits node $p$. The second term $\sum_{(p,q) \in N} V_{p,q}(l_p, l_q)$ is the smooth term, and $V_{p,q}(l_p, l_q)$ is interpreted as a penalty for discontinuity between nodes $p$ and $q$.
Generally, if $p$ and $q$ are similar, $V_{p,q}$ is large, which means $p$ and $q$ more likely belong to the same object. The data penalty $D_p$ is calculated from the two normalized features as follows:

$$D_p = w_\sigma \hat{\sigma}_p + w_V \hat{V}_p$$

where $w_\sigma$ and $w_V$ are the weights of $\hat{\sigma}$ and $\hat{V}$, respectively, and they satisfy $w_\sigma + w_V = 1$.
The smooth penalty $V_{p,q}(l_p, l_q)$ is calculated as follows:

$$V_{p,q}(l_p, l_q) = \begin{cases} \exp\left(-\left\|\hat{\sigma}_p - \hat{\sigma}_q\right\| - \left\|\hat{V}_p - \hat{V}_q\right\|\right), & d(p,q) \le d_{th} \\ 0, & d(p,q) > d_{th} \end{cases}$$

where $\hat{\sigma}_p$ and $\hat{\sigma}_q$ are the $\hat{\sigma}$ of $p$ and $q$, and $\hat{V}_p$ and $\hat{V}_q$ are the $\hat{V}$ of $p$ and $q$. $d(p,q)$ is the Euclidean distance between points $p$ and $q$, $\|\cdot\|$ is a norm distance metric (the $L_1$ norm is adopted), and $d_{th}$ is the distance threshold between points, set to twice the average point spacing. When the graph is constructed, it is cut based on min-cut and each node is given a label. Building points are then extracted according to the given labels.
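A minimal sketch of the two penalty terms, assuming a weighted sum of the normalized features for the data term (with the complementary cost for the other label) and an exponential of the L1 feature difference for the smooth term; since the original formulas are only partially legible, both forms and the function names are assumptions consistent with the description above:

```python
import numpy as np

def data_penalty(sigma_n, v_n, w_sigma=0.5, w_v=0.5):
    """Data penalties for the two labels from the normalized features;
    returns (cost of 'building', cost of 'non-building')."""
    d_building = w_sigma * sigma_n + w_v * v_n
    return d_building, 1.0 - d_building

def smooth_penalty(f_p, f_q, d_pq, d_th):
    """Discontinuity penalty between neighbouring points p and q based on
    the L1 distance of their normalized feature vectors; zero beyond d_th."""
    if d_pq > d_th:
        return 0.0
    return float(np.exp(-np.abs(np.asarray(f_p) - np.asarray(f_q)).sum()))
```

Identical feature vectors yield the maximum smooth penalty of 1.0, so cutting between similar neighbors is expensive, which encourages label consistency within one roof.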

Improved Post-Processing
Although most building points are successfully extracted from the non-ground points, some non-building points are wrongly classified as buildings (e.g., vehicles with smooth surfaces and flat overpasses) and some building points are omitted. To solve these problems, improved post-processing is adopted to refine the building extraction results.

Height Constraint
In general, a building should be sufficiently high. Thus, a height threshold $h_{th}$ is set, and a point is removed if the absolute height difference between it and its nearest ground point is less than $h_{th}$. Under this constraint, some or possibly all points of vehicles with smooth surfaces can be excluded due to their low height. $h_{th}$ is set according to average human height, e.g., 1.5 m [23].

Restricted Region Growing
It should be noted that some buildings are located on slopes, so only some of their points satisfy the height constraint. Figure 4a illustrates the profile of a building located on a slope; some of its points are classified as non-building points after applying the height constraint. Thus, restricted region growing based on the height constraint is conducted to recover the omitted building points. In the process of restricted region growing, non-building points are reclassified as buildings if the absolute height difference between them and their nearest building points is less than a predefined threshold $t_h$. $t_h$ is set to 0.1 m according to [37], and the building extraction results after restricted region growing are shown in Figure 4b. It should be noted that ground, building and non-building points are rendered in blue, red, and white, respectively, in subsequent sections.
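The growing step can be sketched as follows; the adjacency structure `nbrs` (a mapping from each point index to its neighbor indices) and the function name are illustrative, not the paper's implementation:

```python
import numpy as np
from collections import deque

def restricted_region_growing(pts, labels, nbrs, t_h=0.1):
    """Grow the 'building' label (1) into neighbouring non-building points (0)
    whose absolute height difference to a building neighbour is below t_h."""
    labels = labels.copy()
    queue = deque(np.flatnonzero(labels == 1))   # seed with current buildings
    while queue:
        i = queue.popleft()
        for j in nbrs[i]:
            if labels[j] == 0 and abs(pts[j, 2] - pts[i, 2]) < t_h:
                labels[j] = 1        # reclassify and keep growing from here
                queue.append(j)
    return labels
```

Because growth proceeds in small height increments, a building stepping down a slope is recovered point by point, while points across a large height jump (e.g., true ground) are not absorbed.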

Maximum Intersection Angle Constraint
After the above-mentioned steps, some non-building points or clusters, belonging to vegetation with flat surfaces or to vehicles with smooth surfaces and small size, are still wrongly classified as buildings. Area-based strategies are commonly adopted to remove such non-building points by clustering, based on the assumption that buildings generally occupy a minimum area [12,23,27]. Although satisfying results can be obtained by setting proper area thresholds, these strategies fail to eliminate non-building points in some scenes. Figure 5 illustrates building extraction results in Area 3 of Vaihingen [49] after the above steps, from LiDAR data with an average point density of 3.2 points/m². Building points are separated into two clusters via Euclidean clustering, one located in the green rectangle and the other in the yellow rectangle. The small cluster in the yellow rectangle contains more than 110 points, which means it occupies approximately 35 m². In practice, however, many buildings have an area smaller than that and would be eliminated by area-based methods if the area threshold were set to 35 m². To solve this issue, the maximum intersection angle constraint is proposed. Figure 5c illustrates the concept of the intersection angle, which is defined by the current building point and its cylindrical neighbors searched from the non-building and ground points. Because buildings occupy a specific area and building façades act as barriers that prevent ground points from falling inside the building area, the maximum intersection angle of a real building point is larger than 90° at building corners and larger than 180° away from corners. In contrast, the maximum intersection angle of a vegetation point or other non-building point is less than 90°, because it is surrounded by ground and non-building points in all directions within the cylinder (yellow circle in Figure 5c) [58].
Figure 5d shows the calculation of the maximum intersection angle for a given current building point O and its cylindrical nearest neighbors. The calculation is composed of three main steps:
Step 1: Select an initial direction $\vec{v}$ and, for each nearest point, calculate its rotational angle $\alpha_i$ with respect to $\vec{v}$ using the x, y coordinates.
Step 2: Sort the above angles in ascending order; the intersection angle between adjacent rotational angles is calculated as follows:

$$\theta_i = \alpha_{i+1} - \alpha_i, \quad i = 1, 2, \ldots, n-1; \qquad \theta_n = 2\pi - \alpha_n + \alpha_1$$

Step 3: The maximum $\theta_i$ is selected as the maximum intersection angle of point O. For a current building point O, if its maximum intersection angle is larger than the pre-defined threshold, O is kept as a building point; otherwise, it is reclassified as a non-building point. If neither non-building points nor ground points fall within the cylinder of O, it is classified as building directly.
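The three steps above can be sketched as follows; the function name is illustrative, and `arctan2` against the x-axis stands in for the arbitrary initial direction $\vec{v}$:

```python
import numpy as np

def max_intersection_angle(center, others):
    """Maximum angular gap (radians) between the horizontal directions from
    `center` to the surrounding ground/non-building points `others`."""
    if len(others) == 0:
        return 2.0 * np.pi                     # empty cylinder -> building
    d = np.asarray(others)[:, :2] - np.asarray(center)[:2]
    ang = np.sort(np.arctan2(d[:, 1], d[:, 0]))   # Step 1 + sort (Step 2)
    gaps = np.diff(ang)                           # gaps between adjacent angles
    wrap = 2.0 * np.pi - ang[-1] + ang[0]         # gap across the wrap-around
    return float(max(gaps.max(initial=0.0), wrap))   # Step 3
```

A point surrounded on all sides yields a small maximum gap (below the 90° threshold), while a point whose façade blocks ground returns only on one side yields a gap well above 90° and is kept as building.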
The maximum intersection angle constraint involves two threshold parameters: the radius of the cylinder used to detect neighbors among ground and non-building points, and the angular threshold representing the minimum angle defined by the façade alignments at building corners. In the proposed method, the radius is empirically set to 2.5 m according to [12,37], and the angular threshold is set to 90° according to [58].
It should be noted that the width of a flat overpass generally ranges between 2.5 m and 3.5 m [59]; thus, flat overpasses can be excluded from the detected building points under the maximum intersection angle constraint with the above empirical thresholds.

Consistency Constraint
Although most buildings are extracted after the above three steps, some special building points are omitted and some non-building points are wrongly classified as buildings. Figure 6a illustrates a roof terrace in Vaihingen [49] with some attachments on it, such as tables, chairs, and small vegetation. Points in this area are obviously rough, while the building surface itself is flat, as shown in Figure 6c. As a result, some building points fail to be detected (hereinafter referred to as undetected building points). Figure 7 shows two trees with dense leaves, the top of one being flat; consequently, points falling in that region are wrongly classified as buildings (hereinafter referred to as false building points).
It should be noted that undetected building points are surrounded by building points, while false building points are surrounded by non-building points. Therefore, a consistency constraint is proposed to solve the problem. The points are organized into a 2D grid whose cell size is set to twice the average point spacing, according to [12,37]. Figure 8 illustrates the search paths in four directions: the current cell is rendered in red, a yellow arrow shows the search path in each direction, the last cell in each direction is colored magenta, and the other cells are rendered in green. For the cell containing the current point, the first non-empty cell in each of the four directions is found, yielding four non-empty cells. If these four cells contain non-building points and no ground points, the current point is relabelled as non-building; otherwise, it belongs to buildings. Figure 6d illustrates that the undetected building points are extracted, and Figure 7d shows that the false building points are removed, avoiding inhomogeneity in the building extraction results.
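The four-direction search can be sketched as follows; representing the grid as a dictionary mapping cell indices to the classes present in each non-empty cell is an assumption for illustration, as are the function name and the boundary guard:

```python
def consistency_check(cell_classes, i, j):
    """For grid cell (i, j), walk the four axis directions to the first
    non-empty cell each way; the cell is relabelled 'nonbuilding' only if
    those cells contain non-building points and no ground points.
    `cell_classes[(i, j)]` is a set drawn from {'ground','building','nonbuilding'}."""
    hits = []
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ci, cj = i + di, j + dj
        while (ci, cj) not in cell_classes:      # skip empty cells
            ci, cj = ci + di, cj + dj
            if abs(ci) > 1000 or abs(cj) > 1000:  # grid boundary guard
                break
        hits.append(cell_classes.get((ci, cj), set()))
    merged = set().union(*hits)
    if 'nonbuilding' in merged and 'ground' not in merged:
        return 'nonbuilding'
    return 'building'
```

A rough patch in the middle of a roof sees only building cells in all four directions and is kept, while a flat tree top ringed by non-building cells (and no ground) is rejected.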

Experimental Results and Analysis
To validate the proposed method, datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) in Vaihingen and Toronto [49], one dataset in New Zealand [60] and one dataset in the state of Utah [61] were used in the experiments. The results were displayed in LiDAR_Suite, airborne LiDAR data processing software developed by the authors' Research and Development (R&D) group.

Data Description
The ISPRS dataset is composed of five reference areas: Areas 1 to 3 in Vaihingen and Areas 4 and 5 in Toronto. The average point density of Areas 1 to 3 is approximately 4-7 points/m², and about 6 points/m² in Areas 4 and 5. It should be noted that the three Vaihingen test areas (Areas 1 to 3) are located in residential regions, while the two Toronto test areas (Areas 4 and 5) are within commercial zones, as shown in Figure 9.
Area 1: includes 37 buildings, mainly composed of dense historic buildings having rather complex shapes, along with roads and trees.
Area 2: includes 14 buildings, mainly composed of a few high regular residence buildings with horizontal roofs.
Area 3: includes 56 buildings, mainly composed of several detached buildings with simple structured roofs and vegetation along roads.
Area 4: includes 58 buildings, mainly composed of low and high-storey buildings with complex roof structures and rooftop attachments.
Area 5: includes 38 buildings, including a cluster of high-rise buildings with diverse roof structure, complex shapes and rooftop attachments.

Results and Analysis
Initial results of building extraction based on the graph cuts algorithm and the height constraint are shown in Figure 10. Most buildings are successfully extracted in Areas 1 to 5, and almost no inhomogeneity exists for individual buildings in Areas 1 to 3, where there are almost no irregular complex attachments on roofs, because the proposed method considers spatial neighborhood information, which helps to exploit contextual information and improve classification accuracy [41,62]. These satisfying initial results reduce the post-processing workload. However, small clusters or individual points belonging to vegetation or small objects still exist, as shown in the green rectangles in Figure 10b,c. Also, a few points inside the buildings fail to be extracted, as shown in the green rectangles in Figure 10a. For quantitative evaluation, four indicators are computed at the per-area and per-object levels: completeness CP (%), correctness CR (%), quality Q (%) and F1-score (%) [37,63-65]. Completeness is the percentage of correctly detected buildings out of the total number of reference buildings. Correctness is the percentage of correctly detected buildings out of the total number of detected buildings. Quality and F1-score provide compound performance metrics that balance completeness and correctness. The equations are as follows:

$$CP = \frac{TP}{TP + FN}, \quad CR = \frac{TP}{TP + FP}, \quad Q = \frac{TP}{TP + FN + FP}, \quad F_1 = \frac{2 \times CP \times CR}{CP + CR}$$

where $TP$ is the number of correctly detected buildings, $FN$ is the number of omitted buildings, and $FP$ is the number of wrongly detected buildings. The quantitative evaluation results of Areas 1 to 5 are listed in Table 1. To compare the performance of different methods on Areas 1 to 3, 20 methods [12,23,37,49] that solely use LiDAR data are selected, and the average results are listed in Table 2, where the ID "WHU_TQ" refers to the proposed method. For Areas 4 and 5, 11 methods [23,49] are selected and compared with the proposed method, as listed in Table 3.
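The four metrics can be computed directly from the counts above; a minimal sketch (the function name is illustrative):

```python
def evaluation_metrics(tp, fn, fp):
    """Completeness, correctness, quality and F1-score (all in %), from the
    counts of true positives, false negatives and false positives."""
    cp = tp / (tp + fn)              # completeness: detected / reference
    cr = tp / (tp + fp)              # correctness: detected / all detections
    q = tp / (tp + fn + fp)          # quality: penalizes both error types
    f1 = 2 * cp * cr / (cp + cr)     # harmonic mean of CP and CR
    return 100 * cp, 100 * cr, 100 * q, 100 * f1
```

For example, 90 correctly detected buildings with 10 omissions and 5 false detections give CP = 90.0%, CR ≈ 94.7%, Q ≈ 85.7% and F1 ≈ 92.3%.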
Figure 11 shows the extraction results at the pixel level against the reference ground truth data.
From Tables 1-3, we find that satisfying results at the per-area and per-object levels are obtained by the proposed method. The CP metric at the area level is important because it directly relates to the post-editing workload and subsequent building modeling. At this level, average CP values of 95.5% and 98.4% are obtained in Vaihingen and Toronto, respectively, outperforming the comparison methods in Tables 2 and 3, which means buildings are more easily recognized by the proposed method. In addition, the differences in Q between the five areas are minimal at the area level, which demonstrates the stability of the proposed method. Although the average Q metric of Areas 1 to 3 at the per-area level is lower than that of some methods, such as [12,23,37], those methods have some defects. Contextual information was not considered in [23,37], while existing research has demonstrated that contextual information benefits data processing [12,15,24]. As a result, the initial extraction results of [37] are not as good as ours due to the lack of contextual information, which increases the post-processing workload. In [23], some inhomogeneity occurred in the process of extracting complex buildings. Moreover, the number of point features used in the proposed method is smaller than in [12,23,37], which benefits building extraction by decreasing the computational cost of calculating point features and improving algorithmic efficiency. For the Area 4 and 5 datasets, the average Q metric of the best method slightly outperforms the proposed method at the area level, mainly because several non-building objects with flat surfaces and large size are wrongly classified as buildings. However, the CP of the proposed method is clearly higher than that of the other methods at the per-area level.
From Tables 2 and 3, it can be seen that the scores of the four metrics differ for the same method, indicating that the preferred method depends on which metric is emphasized. In fact, the choice of the best method should be application-oriented. For example, buildings with a floor area larger than 50 m² are more important in urban planning and reconstruction [1,5], so it is reasonable to choose [12] to extract buildings from the Area 1 to 3 datasets. The efficiency of a method needs to be considered in practice as well. From Figure 11, most buildings are correctly extracted by the proposed method. Although some vegetation with smooth surfaces is wrongly classified as buildings (green rectangle in Figure 10b), it is eliminated under the consistency constraint. Also, there are many long greenbelts with heights above 1.5 m, whose points are initially classified as buildings due to their flat surfaces (green rectangle in Figure 10c); these are eliminated using the maximum intersection angle constraint. However, some building attachments are omitted (hereinafter referred to as false negative errors) and some non-building points are wrongly classified as buildings (hereinafter referred to as false positive errors).
False negative errors can be explained by four main reasons: (1) Complex building structure. In Figure 11f, there is a skylight (about 3.0 m × 4.0 m) and a roof terrace (about 2.5 m × 3.0 m) containing some small attachments (apparently chairs and a bed). As a result, points in these regions are rough and it is hard to extract the buildings effectively, which is a common problem for many methods [12,23,37]. However, if the attachments are located on a large roof terrace, the roof terrace can still be extracted under the consistency constraint, as demonstrated in the green rectangle region in Figure 11g. This phenomenon also occurs in Area 5, shown in the green circle in Figure 11e. (2) Occlusion by vegetation. The blue rectangle in Figure 11g is a building roof that is partially occluded; as a result, only the non-occluded parts of the building are extracted successfully. When a small building is partially occluded by vegetation, the whole individual building may be omitted. (3) Buildings with low height. Generally, a building should be high enough for people to enter and exit. However, in Figure 11i, the maximal absolute height difference between building and ground is about 1.0 m, less than 1.5 m, and the heights of many building points are almost equal to the ground. In [12,41], points in Figure 11i were wrongly classified as ground for the same reason. In this case, the buildings fail to be extracted; in fact, without the height constraint, points in this region are extracted in our experiments. This phenomenon also occurs in Area 4, shown in the green rectangle in Figure 11d. (4) Missing data. Possibly due to the building roof material shown in Figure 11j, only partial point cloud data of the building were collected; thus, only parts of the buildings are extracted by the proposed method. A possible solution to this problem is to fuse LiDAR data and images.
Conversely, false positive errors occur for two main reasons: (1) Surrounding vegetation with height similar to the building and a smooth surface. Because the graph cuts algorithm considers neighborhood relationships based on feature differences, and the features are calculated using neighborhoods, building regions tend to spread into adjacent vegetation areas, as shown in Figure 11h. Although the problem can be alleviated by decreasing the weight of the smooth term, completeness then decreases due to the loss of neighborhood information. (2) Non-building objects similar to buildings. Figure 11k illustrates a non-building object with flat surfaces and large size; it is hard to discriminate such objects from buildings merely according to the commonly used constraints, such as the height, area, and maximum intersection angle constraints. As a result, it is wrongly classified as a building. This phenomenon also occurs in Area 5, in the green rectangle in Figure 11e. A possible solution to this problem is the fusion of LiDAR data with other data sources.

Data Description
Two datasets with different point densities are used, one captured in New Zealand and the other in Utah.
New Zealand dataset: includes two buildings with curved roofs and several large connected complex buildings.
Utah dataset: includes dense residential buildings with significantly different sizes, shapes and structures surrounded by vegetation.

Results and Analysis
The New Zealand and Utah datasets were classified into four classes (ground, vegetation, building, and others) using LiDAR_Suite followed by manual post-processing, and the resulting buildings are used as reference data in the experiments. Building extraction results at the per-pixel level for these two datasets are illustrated in Figure 13, and quantitative evaluation results at the per-area level are shown in Table 4.
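The per-area evaluation follows the standard completeness/correctness/quality definitions over true-positive, false-negative, and false-positive counts. A minimal sketch (the counts below are hypothetical, not taken from Table 4):

```python
def evaluate(tp, fn, fp):
    """Standard per-area building extraction metrics."""
    completeness = tp / (tp + fn)       # fraction of reference area recovered
    correctness  = tp / (tp + fp)       # fraction of extracted area that is correct
    quality      = tp / (tp + fn + fp)  # combined metric Q
    return completeness, correctness, quality

# Hypothetical pixel counts for illustration only:
comp, corr, q = evaluate(tp=900, fn=50, fp=60)
print(round(comp, 3), round(corr, 4), round(q, 3))  # 0.947 0.9375 0.891
```

Note that Q is always bounded above by both completeness and correctness, which is why it is used as the single summary figure in the comparisons below.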
In the New Zealand dataset, the large connected complex buildings and the buildings with curved roofs are successfully extracted, as shown in Figure 13a,b. It should be noted that Q (%) on the New Zealand dataset is 93.2%, significantly higher than the 88.3% on the Utah dataset. Possible reasons include: (1) the New Zealand dataset is high-density point data with 20 points/m², and more accurate details can be obtained as point density increases [47], which benefits building extraction; (2) its buildings are far from vegetation and only slightly occluded by it.
In the Utah dataset, although the sizes, shapes, and structures of the buildings differ significantly, most buildings are successfully extracted by the proposed method. However, three buildings (green rectangles in Figure 13c) fail to be extracted completely, for two main possible reasons: (1) Missing data. Possibly because of the building roof material, only part of the points of these three buildings were collected, as shown in Figure 12b. As a result, points in these local areas are rough, and some of them are classified as buildings while others are not. (2) The maximum intersection angle constraint. Some non-building and building points mix together after the graph cuts algorithm because of the missing data, and these building points are then eliminated by the maximum intersection angle constraint owing to the way it is computed. Despite these problems, buildings with complete data are extracted successfully, as shown in Figure 13c.

Discussion
The point features, the parameters of the logistic function used to normalize each feature, and the weight of each feature in the data term and the smoothness term all influence the final results. Therefore, this section discusses the proposed feature, the setting of the above parameters, and the running time of the proposed method.

Discussion of the Proposed Point Feature
In the proposed method, a point feature based on the variance of LRSC-based normal vectors (hereinafter the LRSC feature) is used to discriminate building points from vegetation. To evaluate its performance, it is compared with the analogous feature based on PCA-based normal vectors (hereinafter the PCA feature) on the Area 1 to 3 datasets. In the comparison, only the LRSC feature or only the PCA feature is used to extract buildings in the proposed method. The quality metric Q (%) at the per-area level is used to measure the accuracy of building extraction, as shown in Figure 14.
In Figure 14, the accuracy of the PCA feature changes significantly with the feature threshold compared with the LRSC feature in Areas 1 to 3, which indicates that the PCA feature is more sensitive to the threshold parameter. Moreover, the Q difference between the two features is larger in Area 1 than in Areas 2 and 3. This is because Area 1 contains rather complex historic buildings composed of irregular roof planes with different slopes, whereas the roofs of most buildings in Area 2 are horizontal and the buildings in Area 3 are simple and regular. In addition, the low-rank subspace clustering (LRSC) technique computes more accurate normal vectors than PCA [48]. This indicates that the LRSC feature performs much better than the PCA feature for building extraction in complex scenes.
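The comparison above rests on two ingredients: a per-point normal estimate and a scalar feature measuring the spread of normals in a neighborhood. The sketch below illustrates the PCA-based variant and a generic variance-of-normals feature; the paper's exact LRSC formulation and feature definition are not reproduced here, so all function names and the particular spread measure are illustrative assumptions.

```python
import numpy as np

def pca_normal(neighbors):
    """PCA normal: eigenvector of the 3x3 covariance matrix of the
    neighborhood associated with the smallest eigenvalue."""
    cov = np.cov(np.asarray(neighbors, dtype=float).T)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, 0]

def normal_variance(normals):
    """Spread of unit normals in a neighborhood: ~0 on a planar roof facet,
    larger on vegetation. Normals are first flipped to a common hemisphere
    so that sign ambiguity does not inflate the spread."""
    n = np.asarray(normals, dtype=float)
    n = np.where((n @ n[0]) < 0.0, -1.0, 1.0)[:, None] * n
    # 0 when all normals coincide; approaches 1 as they scatter.
    return 1.0 - np.linalg.norm(n.mean(axis=0))

# Planar patch: the estimated normal is (approximately) ±[0, 0, 1].
plane = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [2, 1, 0]], float)
n = pca_normal(plane)
```

Building roof points yield nearly identical neighborhood normals (feature near 0), while vegetation yields scattered normals (feature well above 0), which is exactly the separation the min-cut data term exploits.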

Discussion of Parameters Setting
The threshold parameter of the feature is set to 1.0 according to the results in Figure 14. The Q (%) at the per-area level is then used to study the optimal value of the logistic-function parameter on the Area 1 to 3 datasets. Note that only the proposed feature is used in the method here, and Q is shown in Figure 15a. Q is more sensitive to this parameter in Area 1 than in the other two areas, possibly because of the complex buildings, and Q reaches its maximum when the parameter is set to the proper value for each of these three areas. According to the average results, the highest Q is obtained when the parameter is set to 0.06. When analyzing the weight parameters of the two features, one weight is varied from 0 to 1 and the other is correspondingly set from 1 to 0. The metric Q at the per-area level is used to study the optimal weights, as shown in Figure 15b. The highest average Q is obtained with weights of 0.4 and 0.6, and accuracy is improved by combining the two features.
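The normalization and weighting described above can be sketched as follows. This is a hedged illustration: the standard logistic function is assumed, the exact parameterization used in the paper is not reproduced here, and the function names are hypothetical; only the 0.4/0.6 weight split comes from the parameter study.

```python
import math

def logistic_norm(x, x0, k):
    """Map a raw feature value into (0, 1) with a logistic function.
    x0 acts as the feature threshold (output 0.5 at x = x0);
    k controls the steepness of the transition."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def combined_data_term(f1, f2, w1=0.4, w2=0.6):
    """Weighted combination of two normalized features (w1 + w2 = 1);
    0.4/0.6 is the optimum reported for Figure 15b."""
    return w1 * f1 + w2 * f2
```

Because the logistic function saturates smoothly rather than clipping, values just below and just above the threshold receive nearby costs, which makes the data term less brittle than a hard cut at the threshold.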

Discussion of Running Time
Experiments on the ISPRS datasets (i.e., Areas 1 to 5) and the other two LiDAR datasets (i.e., the New Zealand and Utah datasets) were performed on a laptop with 16 GB RAM, an Intel Core i7-7700HQ @ 2.8 GHz CPU, and a Windows 10 64-bit operating system. The proposed method was implemented in C++ on the Visual Studio 2013 platform. It should be noted that the total running time of the proposed method consists of two parts: the time before post-processing (i.e., Steps 1 and 2) and the time of post-processing (i.e., Step 3), listed in Table 5.

Dataset        Before post-processing   Post-processing   Total
Area 1                  14                      23            37
Area 2                  18                      51            69
Area 3                  20                      61            81
Area 4                 357                    4831          5188
Area 5                 307                    4642          4949
New Zealand             70                     653           723
Utah                   127                    2239          2366

From Table 5, it can be seen that the time before post-processing is significantly less than the post-processing time, because only two point features need to be calculated and the graph cuts and PTD algorithms are efficient [44,45,54]. Therefore, satisfactory initial extraction results can be obtained in less time. Analysis shows that the maximum intersection angle constraint occupies most of the post-processing time, because large numbers of neighbors are searched to calculate angles and obtain the maximum intersection angle when eliminating non-building points in small clusters. An area-based method can eliminate these non-building points efficiently [12], but it is sensitive to the area threshold and, unlike the maximum intersection angle constraint, fails to eliminate large non-building point clusters with narrow width. A possible solution is to combine the area-based method with the maximum intersection angle constraint; how to combine the two methods and their potential influence on eliminating non-building points are worth further study. It should also be noted that the running time for Areas 4 and 5 far exceeds that for Areas 1 to 3, because they contain far more points and their scenes are more complex.

Conclusions
Due to the complexity of urban scenes, extracting buildings automatically is still a challenging task. In this paper, a building extraction method from airborne LiDAR data based on min-cut and improved post-processing is proposed. To discriminate building points on intersecting roof planes from vegetation, a point feature based on the variance of normal vectors estimated via low-rank subspace clustering (LRSC) is proposed, and non-ground points are separated into two subsets based on min-cut after filtering. Then, improved post-processing refines the building extraction results using restricted region growing and the constraints of height, the maximum intersection angle, and consistency. Omitted building points located on slopes are detected by the restricted region growing. The proposed maximum intersection angle constraint effectively removes large non-building point clusters with narrow width, such as greenbelts along streets, overcoming the difficulty area-based methods have in setting an area threshold. Contextual information and the consistency constraint are both used to eliminate inhomogeneity during building extraction. No manual operations are required except predefining some threshold values. Experiments on seven datasets verify that most buildings, even those with curved roofs, are successfully extracted by the proposed method. In terms of precision, for Areas 1 to 3 in Vaihingen the average Q metrics of the proposed method achieve promising results of 87.7%, 83.7%, and 99.1% at the per-area, per-object, and per-object (>50 m²) levels, respectively. For Areas 4 and 5 in Toronto, the average Q metric is 88.9% at the per-area level. The per-area Q metric of the proposed method reaches 93.2% on the high-density dataset with an average point density of 20 points/m². Moreover, the proposed point feature outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes.
In addition, only two point features are used in the proposed method, which decreases the computational cost of calculating point features and improves algorithmic efficiency.
However, some defects still exist in the proposed method. It remains a challenge to successfully extract buildings with complex skylights or roof terraces attached, owing to rough points. A feasible solution is to combine images or intensity data to obtain extra features [6,10], which deserves further study. In addition, several parameters are adopted in the proposed method, which reduces the full automation of building extraction. We will attempt to construct a self-adaptive building extraction algorithm in future work. Moreover, other normalization functions are available, such as min-max normalization and Z-score normalization [66], but only the logistic function is employed here to normalize features. How to introduce other normalization functions and their potential influence on building extraction are worth further research.