Building Point Detection from Vehicle-Borne LiDAR Data Based on Voxel Group and Horizontal Hollow Analysis

Information extraction and three-dimensional (3D) reconstruction of buildings using vehicle-borne laser scanning (VLS) systems are significant for many applications. Extracting the LiDAR points that belong to various types of buildings from VLS data in large-scale complex urban environments remains problematic. In this paper, a new technical framework for automatic and efficient building point extraction is proposed. It includes three main steps: (1) voxel group-based shape recognition; (2) category-oriented merging; and (3) building point identification by horizontal hollow ratio analysis. This article proposes the concept of a “voxel group” based on the voxelization of VLS points: each voxel group is composed of several voxels that belong to one single real-world object. The shapes of the point clouds in each voxel group are then recognized, and this shape information is used to merge voxel groups. This article also puts forward a characteristic of vehicle-borne LiDAR building points, called the “horizontal hollow ratio”, for efficient extraction. The experiments are analyzed from two aspects: (1) building-based evaluation for the overall experimental area; and (2) point-based evaluation for individual buildings using completeness and correctness. The experimental results indicate that the proposed framework is effective for extracting LiDAR points belonging to various types of buildings in large-scale complex urban environments.


Introduction
With the rise of the "smart city" concept, automatic information extraction and three-dimensional (3D) reconstruction of buildings in urban areas have become popular research topics in photogrammetry, remote sensing, computer vision, and related fields [1]. The results of these studies have been widely used in fields such as intelligent building [2], virtual roaming [3], ancient architecture conservation [4], disaster management [5], and others. In building extraction and 3D reconstruction, a laser scanning system has the unique advantage of providing directly measured 3D points [6]. According to their carrying platform, laser scanning systems can be divided into satellite-based laser scanning, airborne laser scanning (ALS), vehicle-borne laser scanning (VLS), and terrestrial laser scanning (TLS) systems [7]. Among these types, ALS has received considerable attention for applications in urban regions [8], and various methods have been proposed for 3D building modeling [9][10][11][12][13][14][15], change detection [16,17], and information extraction for roads [18,19], parking lots [20], buildings [21][22][23], and land cover [24]. Moreover, the fusion of ALS data and optical imagery expands the scope of application and compensates for some shortcomings of ALS data [25].
Compared to ALS, VLS can scan the facades of buildings and obtain denser point clouds with higher accuracy [18]. VLS is therefore suitable for 3D reconstruction of building facades. However, highly dense point clouds, with their huge data volume, also mean more redundancy and more occlusions. Strong variations of point density occur on building surfaces because of the limited viewpoint of the scanner [26,27]. Due to these factors, it is still a technical challenge to extract building points from VLS data accurately and quickly.
Some scholars employ a bottom-up approach for building information extraction from VLS data. Manandhar et al. [28], for example, segmented point clouds into separate scan lines to distinguish artificial from natural objects according to differences in the geometric properties or spatial distribution of the points on each scan line. The scan lines classified as artificial objects were then combined using prior knowledge to extract buildings. A few scholars have aimed to identify specific geometries in point clouds (typically flat or curved surfaces) as a way to extract buildings. Common methods include the random sample consensus (RANSAC) algorithm and the Hough transform [29][30][31][32]. Moosmann et al. [33] proposed a region-growing method based on a local curvature criterion to quickly divide points into planes. Munoz et al. [34] classified point clouds as pole, scatter, or facade using an associative Markov network. Pu et al. [18] proposed a knowledge-based feature recognition method to recognize planes and poles in point clouds as basic structures for extracting building facades or street lamps. However, the above methods did not consider that strong variations in point density cause significant changes in accuracy. To solve this problem, Demantké et al. [35] presented an approach for recognizing the shape around each point based on Principal Component Analysis (PCA) and adopted two radius-selection criteria, the entropy feature and the similarity index, for choosing the best neighborhood scale. Similarly, Yang et al. [36] introduced a procedure that includes a Support Vector Machine (SVM) and divides point clouds into three main classes: linear, planar, and spherical. On this basis, Yang tried to identify complex building facades and other objects with semantic knowledge [37].
In addition, some scholars have adopted a top-down approach to extracting building information from mobile LiDAR data. Li et al. [38], for example, projected point clouds onto a two-dimensional (2D) plane grid and then extracted the building contour according to the density and other information. Aijazi et al. [39] proposed a super-voxel-based approach to segment discrete point clouds. They additionally suggested using the link-chain method to merge super-voxels into individual objects; the point clouds are then classified as buildings, roads, pole-like objects, vehicles, or trees according to the direction of the surface normal, the reflection intensity, and geometric properties. Furthermore, Yang et al. [40] generated a geo-referenced feature image from point clouds and then adopted discrete discriminant analysis to segment the point clouds into separate objects for extracting buildings and trees. This method was likewise used to extract building footprints [1]. In all of these methods, high-precision building extraction depends on the point cloud segmentation results, which are difficult to obtain in extremely complex scenes.
Although many related studies have been reported, further work is still urgently needed on how to automatically and efficiently extract various types of building points [41], such as those of skyscrapers, low cottages, ordinary residences, or stadiums, especially in large-scale complex urban environments. Different types of urban areas, such as commercial and residential districts, require different semantic rules, parameters, and thresholds, which depend heavily on user experience. In addition, most of these methods involve every LiDAR point in the calculation, which increases the processing time.
To address these problems, this paper proposes an approach for automatically extracting various types of buildings from vehicle-borne laser scanning data acquired in large-scale complex urban environments (see Figure 1). First, a new structure, called the "voxel group", is applied to further organize voxels based on the voxelization of VLS points. The method processes voxel groups instead of discrete points to accelerate the calculation. A simple procedure is then applied to quickly recognize the shape of the point cloud in each voxel group. Second, category-oriented merging uses the shape information to merge voxel groups into high-precision point cloud clusters. Finally, a novel characteristic of vehicle-borne LiDAR building points, called the "horizontal hollow ratio", is used for efficient extraction of various forms of buildings.
Remote Sens. 2016, 8, 419

Voxelization
A voxel is represented by a standard cube that records the serial numbers of the LiDAR points it contains. Voxelization is a simple but efficient way to segment point clouds and has been widely used in forest science. For example, some scholars have segmented ALS or TLS data with voxels to analyze forest structure [42,43], describe the morphological structure of tree canopies [44], determine crown-base height [45], and separate point clouds into individual trees [46]. Wu et al. [3] used a voxel-based method to extract street-lining trees from VLS LiDAR data. Wang et al. [47] generated a DEM using a voxel-based method. Jwa et al. [48] automatically extracted power lines with a method based on voxelization. Owing to their simplicity and ease of representation, both visually and in data structures [49], voxel-based methods are well suited to model reconstruction from LiDAR data. Park et al. [50], Hosoi et al. [51], and Stoker [52] adopted voxel-based methods to build 3D models of individual trees; Cheng used voxel-based methods to reconstruct large multilayer interchange bridges [53] and urban power lines [54]; and some scholars have applied them to building-model reconstruction with good results [44][45][46][47].

Generating of Voxel Group
Voxel-based methods can transform the extraction of disorderly distributed discrete points into the filtering of voxels with topological relations [54]. This approach is suitable for vehicle-borne LiDAR points. However, when dealing with large-scale urban LiDAR data, the large number of voxels remains an obstacle to rapid processing. Therefore, a new structure called the "voxel group" is put forward to further organize voxels; each voxel group is composed of several voxels that are adjacent to each other and have the same geometric properties, as shown in Figure 2. Voxel group construction is based on the following two assumptions: (1) a series of voxels with the same horizontal coordinates and with adjacent elevations is more likely to belong to the same single real-world object, as depicted in Figure 2b; (2) a voxel and its adjacent voxels with small elevation differences are more likely to belong to the same object, as shown in Figure 3d. These two hypotheses are derived from the fact that an object's structure is continuous in the vertical direction, regardless of whether the object is artificial or natural. The process of establishing a voxel group is as follows.
Step 1: Building the 3D voxel grid system. Set an appropriate size $S$ to build a regular 3D voxel grid system. Each LiDAR point is assigned to a voxel according to its 3D coordinates. The minimum values of all LiDAR point coordinates, $(x_{min}, y_{min}, z_{min})$, define the origin of the 3D voxel grid system. For each LiDAR point, the row, column, and layer numbers $(i, j, k)$ of its corresponding voxel are recorded to construct a two-way index.
Step 2: Dividing voxels in each column (in the vertical direction). Voxels that share the same row and column $(i, j)$ lie along the same vertical line but may belong to different targets, such as pedestrians or vehicles below the canopy of trees along a street. Therefore, these voxels must be separated to ensure that each voxel group contains only one object's points, as shown in Figure 2c. Accordingly, the elevation difference between voxel $V_{(i,j,k)}$ and the voxel above it, $V_{(i,j,k+1)}$, is calculated:
$$D_{(k,k+1)} = EOP_{k+1} - EOP_{k}$$
where $EOP_{k+1}$ is the maximum LiDAR point elevation within voxel $V_{(i,j,k+1)}$, and $EOP_{k}$ is the maximum LiDAR point elevation within voxel $V_{(i,j,k)}$. A threshold $T_S$ is set; if $D_{(k,k+1)} \le T_S$, then $V_{(i,j,k+1)}$ joins the voxel column that includes $V_{(i,j,k)}$, $S = \{\ldots, v_{(i,j,n)}, \ldots, v_{(i,j,k)}\}$, to form a new voxel column, $S = \{\ldots, v_{(i,j,n)}, \ldots, v_{(i,j,k)}, v_{(i,j,k+1)}\}$.
Step 3: Merging voxel columns in the horizontal direction. The full λ-Schedule algorithm is used to merge the voxel columns in the horizontal direction. The full λ-Schedule algorithm [55] was first used to segment SAR images. Its segmentation principle is based on the Mumford-Shah energy equation, which weighs the difference in object attributes against the complexity of the object boundary [11]. The merging cost value $t_{i,j}$ of each pair of adjacent voxel columns $(S_i, S_j)$ follows the standard full λ-Schedule form:
$$t_{i,j} = \frac{\dfrac{S_i^A \cdot S_j^A}{S_i^A + S_j^A}\,\left\| S_i^E - S_j^E \right\|^2}{\ell\left(\partial\left(S_i, S_j\right)\right)}$$
where $(S_i, S_j)$ are two voxel columns that are adjacent in the horizontal direction. $S_i^A$ is the horizontal projection area of a voxel column, calculated from the defined length $l$, the width $w$, and the number of horizontal projection grids. $S_i^E$ is the elevation of the highest LiDAR point within the voxel column. $\ell(\partial(S_i, S_j))$ is the length of the shared boundary, which consists of two parts: the length in the vertical direction and the length in the horizontal direction. The details are as follows:
(i) Apply a simple connectivity-based region growing to all voxel columns in the horizontal direction to obtain several rough clusters $\{C_1, C_2, \ldots, C_n, \ldots\}$.
(ii) Compute the merging cost value of every pair of adjacent voxel columns within $C_n$ from Equation (6) and sort the pairs into a list.
(iii) Merge the pair $(S_i, S_j)$ with the smallest $t_{i,j}$ to form a new voxel column $S_{ij}$ and update the merging cost values.
(iv) Repeat steps (ii) and (iii) until $t_{i,j}$ exceeds the threshold $T_{End}$ or all the voxel columns within $C_n$ have merged into one group.
(v) Repeat steps (ii), (iii), and (iv) until all clusters are processed.
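Steps 1 and 2 above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `voxel_indices` and `split_columns`, the use of floor division for the voxel index, and the choice to compare each occupied voxel with the next occupied voxel above it are assumptions made for illustration.

```python
import numpy as np
from collections import defaultdict


def voxel_indices(points, size):
    """Map each point (x, y, z) to the (i, j, k) index of its voxel.

    The grid origin is the per-axis minimum of all coordinates (Step 1).
    Using floor((p - origin) / size) as the index is an assumption.
    """
    points = np.asarray(points, dtype=float)
    origin = points.min(axis=0)
    return np.floor((points - origin) / size).astype(int)


def split_columns(points, size, t_s):
    """Split the occupied voxels of every (i, j) column into vertical runs (Step 2).

    A voxel joins the run below it when the difference between the maximum
    point elevations (EOP) of the two voxels is at most t_s.
    Returns {(i, j): [[k, ...], [k, ...], ...]}.
    """
    points = np.asarray(points, dtype=float)
    ijk = voxel_indices(points, size).tolist()

    # Maximum point elevation (EOP) of every occupied voxel.
    eop = {}
    for (i, j, k), z in zip(map(tuple, ijk), points[:, 2]):
        eop[(i, j, k)] = max(z, eop.get((i, j, k), -np.inf))

    # Occupied layer numbers of every column.
    layers = defaultdict(list)
    for (i, j, k) in eop:
        layers[(i, j)].append(k)

    columns = {}
    for ij, ks in layers.items():
        ks.sort()
        runs = [[ks[0]]]
        for below, above in zip(ks, ks[1:]):
            d = eop[ij + (above,)] - eop[ij + (below,)]
            if d <= t_s:
                runs[-1].append(above)   # same object: extend the current run
            else:
                runs.append([above])     # large gap: start a new voxel column
        columns[ij] = runs
    return columns
```

For example, three points stacked at elevations 0.0, 0.75, and 3.0 m with a 0.5 m voxel and a 1.0 m threshold yield two voxel columns, one holding the lower two voxels and one holding the top voxel.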
Step 3 begins with a connectivity-based region growing because the computational complexity of the full λ-Schedule algorithm is $O(mn\log_2(mn))$ for a 2D image of $m \times n$ pixels [56]. For a 3D voxel grid system the computational complexity is higher, so the original 3D voxel grid system must be divided into pieces to reduce the number of voxel columns involved in one pass. Finally, all voxel columns are combined into a higher-level structure, the voxel group. The LiDAR points within each voxel group belong to the same single real-world object and have the same geometric properties or shape information.
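The merging loop of Step 3 can be sketched as follows for one rough cluster. This is a simplified illustration under stated assumptions, not the published algorithm: the names `merging_cost` and `greedy_merge` are hypothetical, the merged column takes the maximum elevation of its parts, shared boundaries are summed on merge, and the pair list is rescanned on every iteration instead of being kept sorted.

```python
def merging_cost(area_i, area_j, elev_i, elev_j, shared_boundary):
    """Full lambda-schedule style cost: small for similar, well-connected columns."""
    weight = area_i * area_j / (area_i + area_j)
    return weight * (elev_i - elev_j) ** 2 / shared_boundary


def greedy_merge(regions, boundaries, t_end):
    """Greedily merge adjacent voxel columns inside one rough cluster.

    regions:    {id: (projection_area, highest_elevation)}
    boundaries: {frozenset({id_a, id_b}): shared_boundary_length}
    t_end:      stop once the cheapest merge costs more than this threshold
    Returns the merged groups as sorted lists of the original ids.
    """
    regions = dict(regions)
    boundaries = dict(boundaries)
    groups = {rid: {rid} for rid in regions}

    def cost_of(key):
        a, b = tuple(key)
        return merging_cost(regions[a][0], regions[b][0],
                            regions[a][1], regions[b][1], boundaries[key])

    while boundaries:
        best = min(boundaries, key=cost_of)
        if cost_of(best) > t_end:
            break
        a, b = sorted(best)
        # Assumption: the merged column keeps the highest elevation of its parts.
        regions[a] = (regions[a][0] + regions[b][0],
                      max(regions[a][1], regions[b][1]))
        groups[a] |= groups.pop(b)
        del regions[b]
        del boundaries[best]
        # Re-attach the neighbours of b to a, summing shared boundary lengths.
        for key in [k for k in boundaries if b in k]:
            (other,) = key - {b}
            length = boundaries.pop(key)
            new_key = frozenset({a, other})
            boundaries[new_key] = boundaries.get(new_key, 0.0) + length
    return sorted(sorted(g) for g in groups.values())
```

With two neighbouring columns of nearly equal height and a third much lower one, only the first two merge when `t_end` is small, which mirrors the intent of the threshold $T_{End}$.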

Shape Recognition of Each Voxel Group
Demantké et al. [35] and Yang et al. [36] used a PCA-based method to identify the shapes of point clouds. They divided point clouds into three categories (linear, planar, and spherical) and showed that fusing point cloud shape information can efficiently segment mobile laser scanning point clouds of large-scale urban scenes into single objects. PCA is a common method for analyzing the spatial distribution of a neighborhood of points. It results in a set of positive eigenvalues $\lambda_1, \lambda_2, \lambda_3$ $(\lambda_1 > \lambda_2 > \lambda_3)$. Demantké et al. [35] then proposed dimensionality features to describe the linear ($a_{1d}$), planar ($a_{2d}$), and spherical ($a_{3d}$) behavior within $V_p^R$:
$$a_{1d} = \frac{\sqrt{\lambda_1} - \sqrt{\lambda_2}}{\sqrt{\lambda_1}}, \quad a_{2d} = \frac{\sqrt{\lambda_2} - \sqrt{\lambda_3}}{\sqrt{\lambda_1}}, \quad a_{3d} = \frac{\sqrt{\lambda_3}}{\sqrt{\lambda_1}}$$
where $V_p^R$ denotes the neighboring points of point $p$ within the neighborhood size $R$.
A proper neighborhood size is the key to good shape recognition results. Demantké et al. [35] proposed an entropy function of the dimensionality features derived from the eigenvalues at each point:
$$E_f(V_p^R) = -a_{1d}\ln(a_{1d}) - a_{2d}\ln(a_{2d}) - a_{3d}\ln(a_{3d})$$
When $E_f(V_p^R)$ reaches its minimum value, $V_p^R$ is most likely to belong to a single dimensionality feature:
$$d^{*}(V_p^R) = \arg\max_{n \in \{1,2,3\}} a_{nd}$$
To acquire the optimal neighborhood size of each point, an initial neighborhood size must be set; it is then gradually increased until the entropy function attains its minimum value. Yang et al. [36] fused the intensity of each point into the process of determining the best neighborhood size to improve the accuracy of the estimated shape features. Both methods require eigenvectors and eigenvalues to be recalculated for each point every time the neighborhood size changes; therefore, although they can achieve high accuracy, they are time-consuming and not suitable for handling large-scale urban LiDAR data.
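The PCA-based features discussed above can be computed in a few lines. The sketch below assumes the commonly cited definitions of Demantké et al. (features built from the square roots of the covariance eigenvalues); the function names and the small `eps` guard against `log(0)` are illustrative choices, not part of the original method description.

```python
import numpy as np


def dimensionality_features(points):
    """Return (a1d, a2d, a3d) for a neighbourhood given as an (N, 3) array."""
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts.T)                          # 3 x 3 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # eigenvalues, descending
    sigma = np.sqrt(np.clip(eigvals, 0.0, None)) # guard tiny negative values
    a1d = (sigma[0] - sigma[1]) / sigma[0]       # linear behaviour
    a2d = (sigma[1] - sigma[2]) / sigma[0]       # planar behaviour
    a3d = sigma[2] / sigma[0]                    # spherical (scattered) behaviour
    return float(a1d), float(a2d), float(a3d)


def entropy_feature(features, eps=1e-12):
    """Shannon entropy of the three features; low when one feature dominates."""
    a = np.clip(np.asarray(features, dtype=float), eps, 1.0)
    return float(-(a * np.log(a)).sum())


def shape_label(features):
    """Label the neighbourhood by its dominant dimensionality feature."""
    return ("linear", "planar", "spherical")[int(np.argmax(features))]
```

Points sampled along a straight line give features close to (1, 0, 0) and an entropy near zero, while points on a flat square grid are labelled planar.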
Unlike the work of Demantké et al. [35] and Yang et al. [36], this article proposes a simple and rapid approach that takes advantage of the voxel group concept to estimate shape features. The flowchart is shown in Figure 4. Each voxel group is taken as the unit whose shape is estimated. To speed up shape estimation, only a subset of the points in a voxel group is used instead of all of its points. Because each voxel group contains either the point cloud of one small real-world object or a local part of one large real-world object, the points within one voxel group should have the same geometric properties or shape features. The detailed procedure for identifying shape features is as follows.
Step 1: Finding the center voxel. The point density of each voxel within one voxel group is calculated to find the densest voxel $V_{md}$. The center coordinate of the $n$ points in this voxel is then calculated:
$$(\bar{X}, \bar{Y}, \bar{Z}) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i,\ \frac{1}{n}\sum_{i=1}^{n} y_i,\ \frac{1}{n}\sum_{i=1}^{n} z_i\right)$$
Step 2: Determining the variation range of the neighborhood size. Centering on $(\bar{X}, \bar{Y}, \bar{Z})$, the minimum neighborhood size $R_{min}$ is the radius that includes the minimal number $N_p$ of points required for PCA. With the increment set to $R_i$, the neighborhood size increases until the radius reaches the boundary of the voxel group. The variation range of the neighborhood size $[R_{min}, R_{max}]$ is thus obtained.
Step 3: Calculating the dimensionality features and the entropy feature. The dimensionality features $a_{1d}, a_{2d}, a_{3d}$ and the entropy feature $E_f(V_p^R)$ within $V_p^R$ ($R \in [R_{min}, R_{max}]$) are calculated by Equation (9). In this paper, $p$ denotes the center coordinate of the points in the selected voxel. The optimal neighborhood size is then obtained:
$$R_{opt} = \arg\min_{R \in [R_{min}, R_{max}]} E_f(V_p^R)$$
The dimensionality feature of $V_p^{R_{opt}}$ can then be identified by Equation (7). The voxel group and all the points within it are labeled with the same dimensionality feature.
By Steps 1-3 above, every voxel group and the point cloud within it is assigned to one of three shape categories: linear, planar, or spherical (see Figure 5d). The principal direction of the point cloud in a linear voxel group, the surface normal direction of the point cloud in a planar voxel group, and the coordinates of the center of the point cloud in a spherical voxel group can be further obtained.
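A compact sketch of the radius search in Steps 2 and 3 is given below. It assumes the entropy and feature definitions attributed to Demantké et al.; the function names, the default of 10 points for $N_p$, and the explicit `r_max` argument (standing in for the distance to the voxel group boundary) are hypothetical choices for illustration.

```python
import numpy as np

LABELS = ("linear", "planar", "spherical")


def shape_entropy(neighbors, eps=1e-12):
    """Return (entropy, index of dominant feature) for an (N, 3) neighbourhood."""
    cov = np.cov(np.asarray(neighbors, dtype=float).T)
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    a = np.array([sigma[0] - sigma[1], sigma[1] - sigma[2], sigma[2]])
    a = np.clip(a / max(sigma[0], eps), eps, 1.0)
    return float(-(a * np.log(a)).sum()), int(np.argmax(a))


def classify_voxel_group(points, center, r_max, r_step, n_min=10):
    """Pick the radius with minimum entropy and return (label, optimal radius).

    points: all points of the voxel group, shape (N, 3)
    center: mean coordinate of the points in the densest voxel
    r_max:  largest radius to try (assumed to reach the group boundary)
    """
    pts = np.asarray(points, dtype=float)
    if len(pts) < n_min:
        raise ValueError("not enough points for PCA")
    dist = np.linalg.norm(pts - np.asarray(center, dtype=float), axis=1)
    r = float(np.sort(dist)[n_min - 1])   # R_min: radius holding n_min points
    best = None
    while True:
        entropy, dominant = shape_entropy(pts[dist <= r])
        if best is None or entropy < best[0]:
            best = (entropy, r, dominant)
        if r >= r_max:
            break
        r = min(r + r_step, r_max)
    return LABELS[best[2]], best[1]
```

Because only one center per voxel group is evaluated, the eigen-decomposition runs a handful of times per group rather than several times per point, which is the source of the speed-up described above.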

Category-Oriented Merging
As noted, each voxel group and the point cloud within it are assigned to one shape category: linear, planar, or spherical. For building detection, the proposed method uses this shape information to merge discrete voxel groups into single real-world objects. The process includes two steps: removing ground points and category-oriented merging.

Removing Ground Points
When addressing outdoor 3D data, an estimate of the ground plane provides an important contextual cue [57]. Particularly in large-scale urban regions, removing ground points in advance can greatly reduce the amount of data and improve efficiency. A simple strategy is used to quickly filter ground points based on the established voxel groups and their recognized shapes. The steps of this strategy are outlined below.
Step 1: Extracting the potential voxel groups that contain ground points. For each planar voxel group whose surface normal vector makes an angle greater than 85° with the horizontal plane, the difference between the highest and lowest points is calculated:
$$D_{hmax} = P_{hmax} - P_{hmin} \quad (10)$$
where $P_{hmax}$ is the elevation of the highest point and $P_{hmin}$ is the elevation of the lowest point.
Step 2: Combining connected regions. If the elevation difference between two adjacent candidate ground voxel groups is less than 0.3 m, the two voxel groups are merged. This process is repeated, and the area of each final combined voxel group is then calculated from $N$, the number of voxels within the combined voxel group.
Step 3: Removing ground points. An area threshold of 10 m² is set to filter out combined voxel groups that are too small. The average elevation of every remaining candidate voxel group is then recorded, and outliers, which usually indicate elevated flat roofs, are rejected. All the points within the remaining candidate voxel groups are labeled as ground points and are removed before the next step.
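The two numeric tests in this strategy, the 85° normal angle and the 10 m² area threshold, can be sketched as below. The function names and the dictionary layout of a candidate group are hypothetical, and the elevation outlier rejection of Step 3 is left out because the text does not specify how outliers are detected.

```python
import math


def angle_to_horizontal_deg(normal):
    """Angle in degrees between a surface normal and the horizontal plane."""
    nx, ny, nz = normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    return math.degrees(math.asin(min(1.0, abs(nz) / norm)))


def is_ground_candidate(normal, min_angle_deg=85.0):
    """Step 1 test: a planar group is a candidate if its normal is near vertical."""
    return angle_to_horizontal_deg(normal) > min_angle_deg


def keep_large_groups(groups, min_area=10.0):
    """Step 3 test: drop combined groups whose area (m^2) is below the threshold.

    groups: list of dicts with an 'area' key (hypothetical layout).
    """
    return [g for g in groups if g["area"] >= min_area]
```

A perfectly horizontal patch has a vertical normal, so its angle to the horizontal plane is 90° and it passes the first test; a facade has a horizontal normal and is rejected.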

Category-Oriented Merging
A single real-world object may consist of more than one shape, such as a building composed of a planar roof and linear columns.Therefore, different rules are set for merging adjacent voxel groups containing non-ground points that have the same or different shape properties based on a region growing algorithm.
Several combinations arise according to the types of the candidate voxel groups: (1) two linear voxel groups; (2) two planar voxel groups; (3) two spherical voxel groups; (4) one linear and one planar voxel group; (5) one linear and one spherical voxel group. Generally, the components of artificial objects are approximately parallel or perpendicular to each other, such as palisades, a billboard and its stanchion, the cross arm and upright of a traffic sign, or the flashing and facade of a building. For combinations 1, 2, and 4, the difference between the principal directions of two linear voxel groups, the difference between the normal vectors of two planar voxel groups, or the angle between the principal direction of a linear voxel group and the normal vector of a planar voxel group is calculated first. Two parallel candidate voxel groups (difference smaller than 10°) must satisfy different judging conditions from two candidate voxel groups that are perpendicular to each other. The merging rules are shown in Table 1.
The merging result for the test area without ground points is shown in Figure 6. As can be seen from Figure 6, voxel groups of one single real-world object with the same or different shapes can merge together. For example, Figure 6b shows that the parallel linear voxel groups, the mutually perpendicular linear voxel groups, and the planar voxel groups that represent the components of one bus stop merge well. Figure 5d shows that two mutually perpendicular planar voxel groups that represent the corner of one building merge together. Figure 6e shows that one linear voxel group representing a tree trunk and several spherical voxel groups representing the tree canopy also merge well.
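The angular test that precedes the merging rules can be sketched as follows. Only the 10° tolerance for parallel groups comes from the text; reusing the same tolerance for the perpendicular case, the "oblique" label, and the function names are assumptions, and the per-case judging conditions of Table 1 are not reproduced here.

```python
import numpy as np


def angle_between_deg(u, v):
    """Unsigned angle in degrees between two undirected axes (0 to 90)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cosine = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cosine, 0.0, 1.0))))


def relation(u, v, tol_deg=10.0):
    """Classify two direction or normal vectors as parallel, perpendicular, or oblique."""
    angle = angle_between_deg(u, v)
    if angle < tol_deg:
        return "parallel"
    if angle > 90.0 - tol_deg:
        return "perpendicular"
    return "oblique"
```

The absolute value of the dot product makes the test independent of the sign of a principal direction or normal, which PCA does not fix.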

Horizontal Hollow Ratio-Based Building Point Identification
A laser scanner provides rich surface information about objects but cannot penetrate their surfaces. Because of this, Aljumaily et al. [58] found that each building appears as a deep hole when the DEM generated from ALS data is viewed from below, and they extracted buildings using this characteristic. This indicates that the defects of LiDAR data can be turned into advantages. VLS building points lie primarily on the facade and are lacking on the roof and the inner structures because of scanner angle constraints and because the laser cannot pass through building materials. Compared to other objects commonly found in urban environments, such as vehicles or street trees, the share of the area enclosed by the contour that the building point cloud projection actually occupies is relatively small. In other words, from the top view, building point clouds are more "hollow" than the point clouds of other objects, as shown in Figure 7.
As shown in Figure 7, each colored patch represents the extent of the convex hull generated from one segment's point cloud. It can be seen directly that the area occupied by a building's point cloud is much smaller than the area of its convex hull, whereas trees and cars show the opposite situation. For a more accurate analysis, the horizontal hollow ratio of every object is plotted in Figure 8.
Figure 8. Horizontal hollow ratios of buildings, cars, and trees in Figure 6 (one point represents one object in Figure 6). The X-axis (NACH Ratio) represents the normalized ratio between the area of each object's convex hull and the maximum convex hull area common to this type of object. This ratio indicates the morphological changes of objects. The Y-axis represents the horizontal hollow ratio. The horizontal hollow ratio of trees is very close to that of cars, while the horizontal hollow ratio of buildings is significantly smaller than that of the other two kinds.
A detailed procedure for building point extraction based on the horizontal hollow ratio is described as follows.
Step 1: Extracting outline.The proposed method makes every segment's voxels project to the horizontal plane to form two-dimensionality grids and employ the simple and efficient method proposed by Yang [1] to extract the contour grid: if eight neighbor grids of one background grid (contains no points) are not all background grid, it will be labeled as contour grid.The aim of this step is to reduce the amount of calculation in next step.
Step 2: Generating convex hull.When get the contour grids of one segment, the convex hull of this segment is calculated by the Graham's Scan method.Furthermore, the convex hull area of this segment can be calculated.
Step 3: Calculating horizontal hollow ratio.The proposed method defines the horizontal hollow ratio of each voxel cluster to indicate the above feature: is the area of this segment's two-dimensionality grids: where is the number of two-dimensionality grids.
In Figure 8, each point represents one object in Figure 6. The X-axis (NACH ratio) represents the normalized ratio between the area of each object's convex hull and the maximum convex hull area among objects of the same type; it indicates the morphological variation of the objects. The Y-axis represents the horizontal hollow ratio. The horizontal hollow ratios of trees and cars are very close to each other, whereas that of buildings is significantly smaller. The detailed procedure for building point extraction based on the horizontal hollow ratio is as follows.
Step 1: Extracting the outline. Every segment's voxels are projected onto the horizontal plane to form a two-dimensional grid, and the simple and efficient method proposed by Yang [1] is employed to extract the contour grids: if the eight neighbors of a background grid (one that contains no points) are not all background grids, it is labeled as a contour grid. The aim of this step is to reduce the amount of calculation in the next step.
Step 2: Generating the convex hull. Once the contour grids of a segment are obtained, the convex hull of the segment is calculated by Graham's scan method, and the convex hull area $S_C$ of the segment is then computed.
Step 3: Calculating the horizontal hollow ratio. The proposed method defines the horizontal hollow ratio of each voxel cluster to quantify the above feature:
$$\text{Horizontal hollow ratio} = \frac{S_g}{S_C}$$
where $S_g$ is the area of the segment's two-dimensional grids:
$$S_g = N \times S^2$$
where $N$ is the number of two-dimensional grids and $S$ is the voxel size.
Step 4: Calculating the threshold. OTSU is an automatic, unsupervised threshold selection method. It selects the threshold that gives the best separation between the two classes produced by the threshold segmentation, where the interclass separability criterion is the maximum difference between the class characteristics or, equivalently, the minimum difference within them [59]. The hollow ratio of buildings is far smaller than that of other objects, as indicated by Figure 7; therefore, the OTSU method is expected to yield a good threshold $T$ of the hollow ratio for separating buildings from other objects:
$$T = \arg\max_{0 \le t \le L} \left[ P_A(t)\, P_B(t)\, \left( \omega_A(t) - \omega_B(t) \right)^2 \right]$$
where $L$ is the maximum value of the hollow ratio among all the voxel clusters, $P_A$ and $P_B$ are the probabilities of building voxel clusters and non-building voxel clusters, respectively, and $\omega_A$ and $\omega_B$ are the average hollow ratios of building voxel clusters and non-building voxel clusters. Only voxel clusters with an average height of H ≥ 2.5 m and a cross-sectional area of Csa > 3 m² are used by the OTSU algorithm. If a voxel cluster's hollow ratio is less than the threshold $T$, it is deemed a building voxel cluster and all points within it are considered building points.
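Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' C# implementation: all function names are hypothetical, the 2D grid cell size is assumed equal to the voxel size S, the hull is built on contour-cell indices, the ratio is assumed to be the occupied grid area divided by the convex hull area, and Andrew's monotone chain (a variant of Graham's scan) is substituted for the convex hull computation.

```python
from typing import Iterable, List, Set, Tuple

Cell = Tuple[int, int]
Point = Tuple[float, float, float]


def occupied_cells(points: Iterable[Point], size: float) -> Set[Cell]:
    """Project points onto the horizontal plane; return the occupied 2D grid cells."""
    return {(int(x // size), int(y // size)) for x, y, _z in points}


def contour_cells(cells: Set[Cell]) -> Set[Cell]:
    """Step 1: background cells that have at least one occupied 8-neighbour."""
    contour: Set[Cell] = set()
    for cx, cy in cells:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbour = (cx + dx, cy + dy)
                if neighbour not in cells:
                    contour.add(neighbour)
    return contour


def convex_hull(cells: Iterable[Cell]) -> List[Cell]:
    """Step 2: convex hull (counter-clockwise) via Andrew's monotone chain."""
    pts = sorted(set(cells))
    if len(pts) <= 2:
        return pts

    def cross(o: Cell, a: Cell, b: Cell) -> int:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq: Iterable[Cell]) -> List[Cell]:
        chain: List[Cell] = []
        for p in seq:
            # Pop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def polygon_area(poly: List[Cell]) -> float:
    """Shoelace formula, in squared cell units."""
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))
    return abs(twice) / 2.0


def horizontal_hollow_ratio(points: Iterable[Point], size: float = 0.5) -> float:
    """Step 3: occupied grid area S_g = N * S^2 divided by hull area S_C."""
    cells = occupied_cells(points, size)
    if not cells:
        raise ValueError("segment contains no points")
    s_g = len(cells) * size * size
    s_c = polygon_area(convex_hull(contour_cells(cells))) * size * size
    return s_g / s_c


def cell_centres(cells: Iterable[Cell], size: float = 0.5) -> List[Point]:
    """Helper for the demo: one synthetic point at the centre of each cell."""
    return [((i + 0.5) * size, (j + 0.5) * size, 0.0) for i, j in cells]


# Synthetic top views: a filled 4 x 4 block and the outline of a 10 x 10 block.
solid = cell_centres((i, j) for i in range(4) for j in range(4))
ring = cell_centres((i, j) for i in range(10) for j in range(10)
                    if i in (0, 9) or j in (0, 9))
print(horizontal_hollow_ratio(solid))  # 16 / 25 = 0.64
print(horizontal_hollow_ratio(ring))   # 36 / 121, about 0.30
```

In this toy example the facade-like outline yields a much smaller ratio than the filled block, which mirrors the contrast between buildings and trees or cars described above.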

Results and Discussion
The proposed algorithm was implemented in C# on the Microsoft Visual Studio platform. The hardware was a computer with 8 GB of RAM and a quad-core 2.40 GHz processor.

Study Area and Experimental Data
The region of the Olympic Sports Center, Jianye District, Nanjing City, China, was chosen as the experimental area (Figure 9). To capture the vehicle LiDAR data, we employed an SSW mobile mapping system, developed by the Chinese Academy of Surveying and Mapping, with a 360° scanning scope, a surveying range of 3 to 300 m, a reflectance of 80%, and a point frequency of 200,000 points/s. The survey was conducted in 2011, and a 1:500 topographic map was used for data correction. The overall data volume was approximately 4 GB, covering a 1.4 km × 0.8 km area with 147,478,200 points. The point density is 270 points/m². Because the test data set is too large for the proposed method to process at one time, we clipped the raw data into 12 parts according to the road segments. The experimental region contains both downtown and urban residential areas with a number of commercial and residential buildings. A shopping mall, a skyscraper, an apartment building, and a high-rise office building are the main buildings in the study area. Owing to extensive roadside greening, a large number of street trees exist in the study area, which cause strong variation in the point density of building facades. It is also sometimes difficult to separate a building from the trees surrounding it.

The proposed method consists of four parts: voxel group generation, shape recognition, category-oriented merging, and building point identification. As several parameters and thresholds are set in each part, the key parameters and thresholds are summarized in Table 2. The setting basis of these thresholds and parameters is of three types: data source, calculation, and empirical. "Data source" indicates that the values are adjusted according to the experimental data and area. "Calculation" indicates that the values are calculated automatically. "Empirical" means that the values are set empirically; in the proposed method, they are always controlled by the voxel size. A value set according to the data source can be determined from the shape properties and the distribution of the objects in the experimental area, whereas the optimal value of an empirical parameter must be determined by repeated tests. This is the main difference between "data source" and "empirical".

During the generation of voxel groups, we set the size S = 0.5 m to form a regular 3D voxel grid. The voxel size is closely related to the extraction of building points. When the voxel size is too small, some buildings lose part of the facade and other appendages, such as a podium building. If the voxel size is too large, trees and bushes close to a building are incorrectly classified as building points, and some small, low-rise buildings are identified as trees or bushes for the same reason. In addition, the smaller the voxel size, the longer the execution time. The threshold used to divide adjacent voxels in the vertical direction is set according to the data source. The threshold used to terminate the growth of voxel groups is set according to Chen et al. [11]; in this test, 0.85 was the best value.
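The role of the voxel size S can be illustrated with a small Python sketch of regular voxelization. It is only a sketch under assumptions: the function names `voxelize` and `voxel_columns` are hypothetical, the grid origin is assumed to be at (0, 0, 0), and the rules that later split and grow voxel groups are not reproduced.

```python
from collections import defaultdict
from math import floor
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxelize(points: Iterable[Point], size: float = 0.5) -> Dict[VoxelKey, List[Point]]:
    """Assign each point to the cubic voxel of edge `size` (metres) that contains it."""
    voxels: Dict[VoxelKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / size), floor(y / size), floor(z / size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_columns(voxels: Dict[VoxelKey, List[Point]]) -> Dict[Tuple[int, int], List[int]]:
    """Group voxels that share a horizontal (i, j) index; heights sorted bottom-up."""
    columns: Dict[Tuple[int, int], List[int]] = defaultdict(list)
    for i, j, k in voxels:
        columns[(i, j)].append(k)
    return {ij: sorted(ks) for ij, ks in columns.items()}


# Four synthetic points: two near the origin, one 0.6 m east, one 1.2 m up.
pts = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (0.6, 0.1, 0.1), (0.1, 0.1, 1.2)]
fine = voxelize(pts, size=0.5)
coarse = voxelize(pts, size=1.0)
print(len(fine), len(coarse))      # 3 voxels at 0.5 m, 2 voxels at 1.0 m
print(voxel_columns(fine))
```

A larger voxel merges neighbouring points into fewer voxels, which is the mechanism behind the trade-off described above: coarse voxels absorb nearby vegetation into a building, while fine voxels fragment the facade and increase the number of voxels to process.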
During the shape recognition of the points in each voxel group, the minimal number of points used to initialize the minimum search radius and the increment of the search radius are set empirically. The voxel size is one influential factor for these values.
During category-oriented merging, voxel groups are merged by applying several rules to their shape information. The maximal difference in elevation and the maximal minimum Euclidean distance between two voxel groups can be set according to the data source. The maximal distance between the centers of two voxel groups can be set empirically; this value depends on the voxel size: the larger the voxel size, the greater the value.
During building point identification, the horizontal hollow ratio is calculated for each voxel cluster that meets the conditions. The threshold of the horizontal hollow ratio used to identify building points is calculated automatically by the OTSU method. The minimum average height and the minimum cross-sectional area of voxel clusters need to be set according to the data source. These two values correspond to the approximate size of a newsstand, which is often seen along the urban streets of Nanjing and is considered the smallest building.
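The automatic threshold selection can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the paper: the `Cluster` record and function names are hypothetical, the search runs over cut points between sorted hollow ratios rather than over histogram levels, the separability criterion is written as the between-class variance P_A · P_B · (ω_A − ω_B)², and clusters that fail the size conditions are simply left unclassified.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Cluster:
    hollow_ratio: float
    mean_height: float     # H, in metres
    cross_section: float   # Csa, in square metres


def otsu_threshold(values: Sequence[float]) -> float:
    """Return the cut point that maximises the between-class variance."""
    vals = sorted(values)
    n = len(vals)
    if n < 2 or vals[0] == vals[-1]:
        raise ValueError("need at least two distinct values")
    total = sum(vals)
    best_t, best_var = vals[0], -1.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += vals[i - 1]
        if vals[i] == vals[i - 1]:
            continue  # no cut point between equal values
        p_a, p_b = i / n, (n - i) / n              # class probabilities
        w_a = left_sum / i                         # mean of the low class
        w_b = (total - left_sum) / (n - i)         # mean of the high class
        var = p_a * p_b * (w_a - w_b) ** 2
        if var > best_var:
            best_var, best_t = var, (vals[i - 1] + vals[i]) / 2
    return best_t


def identify_buildings(clusters: Sequence[Cluster],
                       min_height: float = 2.5,
                       min_area: float = 3.0) -> Tuple[float, List[Cluster]]:
    """Threshold the eligible clusters; ratios below T are labelled buildings."""
    eligible = [c for c in clusters
                if c.mean_height >= min_height and c.cross_section > min_area]
    t = otsu_threshold([c.hollow_ratio for c in eligible])
    return t, [c for c in eligible if c.hollow_ratio < t]


# Synthetic clusters: three facade-like, three compact, one too small to count.
clusters = [Cluster(r, 10.0, 50.0) for r in (0.10, 0.15, 0.20, 0.60, 0.65, 0.70)]
clusters.append(Cluster(0.05, 1.0, 1.0))
threshold, buildings = identify_buildings(clusters)
print(threshold, len(buildings))
```

With these synthetic values the cut falls between the two groups of ratios, and the undersized cluster is excluded before thresholding, so its very low ratio cannot distort the result.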

Extraction Results of Building Points
The raw vehicle LiDAR point clouds and aerial orthophotos were used to extract the building points manually. The aerial orthophotos were acquired in the same period as the LiDAR data, so the two data sets can be matched. The orthophotos have a spatial resolution of 0.2 m and three bands: red, green, and blue (RGB). The manual extraction results were then used as the validation data. Building point boundaries were connected to determine the building regions used as the ground truth, referred to herein as the true building regions. Figure 10 illustrates the extraction results of building points. LiDAR points are marked in blue for buildings, red for the ground, and green for other features. As shown in Figure 10b-d, our method extracted not only high-rise and low-rise buildings but also buildings with special exteriors or complex structures. Moreover, as shown in Figure 10e, the proposed method recognized buildings with parts of the facade missing because of field measurement conditions or equipment factors. Of the 235 buildings in the experimental area, the SSW vehicle-borne laser scanning system captured 192 relatively complete buildings. A relatively complete building is one whose scanned point cloud covers at least two facades. The types of buildings in the study area include low cottages, middle-rise residential buildings, high-rise commercial buildings, and unique structures, such as bell towers and theaters.

Evaluation of Extraction Accuracy
In this section, the extraction accuracy of the proposed method is validated from two aspects: building-based evaluation for the overall experimental area (a whole building taken as the evaluation unit) and point-based evaluation for individual buildings (a point taken as the evaluation unit). To analyze the proposed algorithm in detail, the buildings are divided into three types: low-rise (one to two stories), middle-rise (three to seven stories), and high-rise (more than seven stories).
The completeness and correctness of the method are used as evaluation indexes, as follows:
$$\mathrm{Completeness} = \frac{TP}{TP + FN} \quad (15)$$
$$\mathrm{Correctness} = \frac{TP}{TP + FP} \quad (16)$$
In the evaluation for the overall experimental area, TP is the number of true buildings, FP is the number of wrong buildings, and FN is the number of mis-detected buildings. The buildings derived by manual operation are taken as reference data. Each extracted building is overlaid on the corresponding reference building, and their overlapped area is calculated. If the ratio of this overlapped area to the area of the extracted building is larger than 70%, the extracted building is taken as a true one; otherwise, it is considered a wrong one. If a building is not detected by the automatic process, it is considered a mis-detected building.
In the evaluation for an individual building, TP is the number of true points belonging to it, FP is the number of wrong points belonging to it, and FN is the number of mis-detected points belonging to it. The extracted building points and the true building points are overlaid. Points in the overlapped region are regarded as true points (blue points in Figure 11c); points existing only in the true region are deemed misdetections (yellow points in Figure 11c); and points existing only in the extracted region are considered wrong points (red points in Figure 11c).
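Both evaluation modes can be sketched in a few lines of Python. The sketch makes simplifying assumptions that are not from the paper: function names are hypothetical, points are compared by identifier rather than by region overlay, building footprints are represented as sets of grid cells so that areas reduce to cell counts, and a reference building counts as detected only when some extracted building passes the 70% rule against it.

```python
from typing import Dict, Hashable, Set, Tuple


def scores(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (completeness, correctness) from the TP, FP, and FN counts."""
    completeness = tp / (tp + fn) if tp + fn else 0.0
    correctness = tp / (tp + fp) if tp + fp else 0.0
    return completeness, correctness


def count_points(extracted: Set[Hashable], truth: Set[Hashable]) -> Tuple[int, int, int]:
    """Point-based counts: true, wrong, and mis-detected points."""
    tp = len(extracted & truth)
    fp = len(extracted - truth)
    fn = len(truth - extracted)
    return tp, fp, fn


def count_buildings(extracted: Dict[str, Set[Hashable]],
                    reference: Dict[str, Set[Hashable]],
                    min_overlap: float = 0.7) -> Tuple[int, int, int]:
    """Building-based counts using overlap area / extracted area > min_overlap."""
    tp = fp = 0
    detected = set()
    for cells in extracted.values():
        best_ref, best_ratio = None, 0.0
        for ref_id, ref_cells in reference.items():
            ratio = len(cells & ref_cells) / len(cells) if cells else 0.0
            if ratio > best_ratio:
                best_ref, best_ratio = ref_id, ratio
        if best_ratio > min_overlap:
            tp += 1
            detected.add(best_ref)
        else:
            fp += 1
    fn = len(reference) - len(detected)
    return tp, fp, fn


# Point-based toy case: 8 of 10 true points found, plus 2 wrong points.
point_counts = count_points(set(range(8)) | {100, 101}, set(range(10)))
print(point_counts, scores(*point_counts))

# Building-based toy case: one match (80% overlap), one false alarm, one miss.
reference = {"A": {(i, 0) for i in range(10)}, "B": {(i, 0) for i in range(20, 30)}}
extracted = {"e1": {(i, 0) for i in range(8)} | {(0, 1), (1, 1)},
             "e2": {(i, 5) for i in range(5)}}
building_counts = count_buildings(extracted, reference)
print(building_counts, scores(*building_counts))
```

Because the overlap is normalized by the extracted area, an extraction that spills far outside its reference footprint is penalized even if it covers the whole reference building.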

Building-Based Evaluation for Overall Experimental Area
A total of 192 buildings were involved in this evaluation, including 42 low-rise buildings, 126 middle-rise buildings, and 24 high-rise buildings.
The completeness of the extraction results increased with building height: the completeness for high-rise buildings reached 100%, whereas that for low-rise buildings reached only 86.3%. This is because the top and internal structures of low-rise buildings are more readily scanned by the mobile laser scanner, so their point clouds fill more of the convex hull and their computed hollow ratios are higher than those of high-rise buildings. Hence, if a very tall building and a low-rise building were in the same region, the low-rise building, with its higher hollow ratio, may have been incorrectly identified as a vehicle, bush, or other object. The overall completeness and correctness were 94.7% and 91%, respectively. For comparison, the automatic building detection method reported by Truong-Hong and Laefer [41], applied to ALS data in a similarly dense urban area, achieved 95.1% completeness and 67.7% correctness.
Two primary types of incorrect detection occurred. First, because some vehicles were moving at high speed, their point clouds were stretched, or the point clouds of adjacent vehicles merged, which lowered their hollow ratios below the threshold and caused them to be incorrectly detected as buildings. Second, because high fences are similar in shape to building facades, they also yielded hollow ratios below the threshold and were consequently detected as buildings.

Point-Based Evaluation for Individual Building
In this evaluation, to further assess the proposed method, complex buildings are placed in a separate category, named complex buildings in Table 3. The individual evaluation results in Table 3 show that the middle-rise buildings have better extraction results than the low-rise and high-rise buildings: their average completeness and correctness are 95% and 95.7%, respectively. The low-rise buildings yield the worst extraction results, with a completeness and correctness of 94.8% and 93.1%, respectively. These results are due to the relative complexity of the vicinities of low-rise buildings: the proposed method uses voxel groups to segment the point clouds, which combined some of the buildings' points with those of adjacent bushes, trees, and other objects into one group and thereby reduced the accuracy. The high-rise buildings show the highest correctness and the lowest completeness, at 99.4% and 91%, respectively. The low completeness is due to the limitation of the scanner angle; i.e., the vehicle-borne laser scanning system was unable to capture the upper stories and the facades that did not face the road, which prevented some building points from being added to the main part of the building in the region growing process. The high correctness arises because the surroundings of high-rise buildings, such as commercial and office buildings, were relatively simple: objects adjacent to these buildings had very distinct elevations and shapes.
Complex buildings have irregular shapes and complicated structures, so it is hard to identify all parts of a complex building. The average completeness for complex buildings is 91.9%, which is on the whole satisfactory. However, among the ten chosen complex buildings, the completeness varies widely, from 73.8% to 97.3%, which indicates that the method may face challenges when dealing with some special buildings.

Experiment Discussion
Overall, the detection results in Figure 10 and the evaluation outcomes show that the proposed method works well in dense urban areas. The voxels generated from the raw point clouds were divided into voxel columns. A full λ-Schedule algorithm, which considers the complexity of the object boundary, was used to merge the voxel columns, and an optimal regularization parameter can be derived. By contrast, the termination rules of a 3D region growing approach are difficult to determine: although 3D region growing can construct voxel groups more efficiently, its results are always over-segmented. Five combinations based on the shapes of two voxel groups were presented. Each combination has its own merging rule, and high-precision segmentation results were achieved by region growing. The method could be further improved by using machine learning (e.g., support vector machine, SVM) and by fusing the intensity or color information of the point clouds.
The proposed method and the method of Yang et al. [37] follow similar strategies for point cloud segmentation, so the two are compared in this section. The method of Yang et al. [37] generates multi-scale supervoxels from VLS data and segments these supervoxels; a set of rules is then defined for merging adjacent segments using semantic knowledge. We divided the segmentation strategies of both methods into three steps: point organization, shape recognition, and merging. Region 5 (red rectangle in Figure 10a), with 458,259 points, is the test area for this comparison. The point clouds are assigned to 1400 voxel groups or 7282 supervoxels. Objects with a simple structure, such as building facades, have far fewer voxel groups than supervoxels. Many more small areas of flat building facade were incorrectly classified as linear or spherical structures by Yang's method. This is because a voxel group is larger than a supervoxel, so a voxel group can contain a more complete structure of a single real-world object. The points of a building are assigned to one segment by the proposed method, whereas they are separated into several segments by Yang's method. The comparison shows that the proposed method handles the shape recognition and segmentation of building facades better than the method of Yang et al. [37]. Table 4 shows that the proposed method needs less execution time than Yang's method in each step.
Figure 12 illustrates the building extraction results of the proposed method and the method of Yang et al. [37]. Region 4 (red rectangle in Figure 9a) is the test area, with 5,298,460 points and five buildings. Both methods extract all buildings with no misclassification. Figure 12a-c shows that the proposed method extracts more details of the buildings (see black rectangles), such as beams and bay windows. Figure 12d,e shows that the proposed method reduces the errors caused by noise around the building (e.g., bushes, trees, and cars) while retaining the accessory structures of the building as far as possible. Moreover, Yang's method wrongly identified the bottom of all the extracted buildings as ground points, whereas the proposed method avoids this mistake.

Conclusions
This study proposed an approach for the automatic extraction of building points from vehicle-borne laser scanning data. From the experimental results, the following conclusions can be drawn.
(1) The proposed approach, comprising voxel group-based shape recognition, category-oriented merging, and horizontal hollow ratio-based building point identification, can be applied to various types of buildings in large-scale, complex urban environments.
(2) Category-oriented merging with voxel group-based shape recognition is effective in improving segmentation accuracy, and the new voxel group structure accelerates processing and simplifies shape recognition.
(3) The concept of the horizontal hollow ratio for building point cloud identification allows various forms of buildings, from low cottages and common residential buildings to towering skyscrapers and stylish theaters, to be extracted accurately without complex semantic rules. This indicates that some characteristics of LiDAR data caused by the constraints of the laser sensor are not obstacles but useful indicators for particular objects. This is a new concept in the field of LiDAR data classification.
Our future work will focus on fusing VLS data with ALS data and optical images, such as street images, to obtain more complete and accurate building point detection results. In addition, the huge volume of mobile LiDAR data greatly reduces the efficiency of information extraction for specific applications, so future work will also develop high-performance computing technology for mobile LiDAR data processing.

Figure 1 .
Figure 1.Flowchart of building point extraction from VLS point data.

Figure 1 .
Figure 1.Flowchart of building point extraction from VLS point data.

Remote Sens. 2016, 8 , 419 Figure 2 .
Figure 2. Construction of voxel group.(a) Point clouds distribution of several objects in a 3D voxel grid system; (b) Street lamp point clouds and the generated voxels; This is a typical case in which the voxels with the same horizontal and vertical coordinates with adjacent elevations belong to the same target; (c) Schematic of the process of dividing the voxel distributions on the same vertical direction; (d) Profile of part canopy of a street tree, a case that adjacent voxel within points belong to one object have little elevation differences.
represents the maximum value of the LiDAR point elevation contained within

Figure 2 .
Figure 2. Construction of voxel group.(a) Point clouds distribution of several objects in a 3D voxel grid system; (b) Street lamp point clouds and the generated voxels; This is a typical case in which the voxels with the same horizontal and vertical coordinates with adjacent elevations belong to the same target; (c) Schematic of the process of dividing the voxel distributions on the same vertical direction; (d) Profile of part canopy of a street tree, a case that adjacent voxel within points belong to one object have little elevation differences.

Figure 3 .
Figure 3. Flowchart of generating of voxel groups.Figure 3. Flowchart of generating of voxel groups.

Figure 3 .
Figure 3. Flowchart of generating of voxel groups.Figure 3. Flowchart of generating of voxel groups.

Figure 4 .
Figure 4. Flowchart of shape recognition for each voxel group.

Figure 4 .
Figure 4. Flowchart of shape recognition for each voxel group.

Figure 5 .
Figure 5. Voxel group-based shape recognition.(a) Raw LiDAR point clouds include building facades, street trees, street lamps, cars, and the ground; (b) Generated voxel group, voxels of the same color belong to the same voxel group; (c) LiDAR points within each voxel group, points of the same color belong to the same voxel group; (d) Shape recognition results.

Figure 5 .
Figure 5. Voxel group-based shape recognition.(a) Raw LiDAR point clouds include building facades, street trees, street lamps, cars, and the ground; (b) Generated voxel group, voxels of the same color belong to the same voxel group; (c) LiDAR points within each voxel group, points of the same color belong to the same voxel group; (d) Shape recognition results.

Figure 6 .
Figure 6.Category-oriented merging.(a) Merging results, points of the same color belong to the same segment; (b) (c) (d) (e) I: Single real-world object with several voxel groups, points of the same color belong to the same voxel group; II: Shapes of one object, red denotes linear points, green denotes surface points, and blue denotes spherical points.

Figure 6 .
Figure 6.Category-oriented merging.(a) Merging results, points of the same color belong to the same segment; (b-e) I: Single real-world object with several voxel groups, points of the same color belong to the same voxel group; II: Shapes of one object, red denotes linear points, green denotes surface points, and blue denotes spherical points.

Figure 7 .
Figure 7. Horizontal hollow ratio-based building point identification (a-c).Left: top view of segments of point clouds of several buildings, trees and cars.Right: overlay of a convex hull and point clouds of each segment.

Figure 7 .
Figure 7. Horizontal hollow ratio-based building point identification (a-c).Left: top view of segments of point clouds of several buildings, trees and cars.Right: overlay of a convex hull and point clouds of each segment.

Figure 8 .
Figure 8. Horizontal hollow ratios of buildings, cars, and trees in Figure6(one point represents one object in Figure6).

Figure 9 .
Figure 9. Experimental area.(a) Aerial orthophotos of the experimental area, red line denotes the SSW mobile mapping system's driving route; (b) Raw VLS data of the experimental area.

Figure 10 .
Figure 10.Building point extraction results.(a) Extraction results of buildings in the experiment region; (b,c) Proposed method successfully detected various building shapes, including skyscrapers and low cottages; (d) Proposed method effectively separated a building and the trees attached to it; (e) Results show that the method could also recognize buildings with sparse LiDAR points or lack of partial structures.

Figure 11. Point-based evaluation for an individual building. (a) Automatic extraction results of a building; (b) Manual extraction results of the same building; (c) Overlay result with the correct, error, and missing points denoted in blue, red, and yellow, respectively.

Figure 12. Comparison of building extraction results between the proposed method and the method of Yang et al. [37]. (a-e) Left to right: street image, raw VLS data, result of the proposed method, and result of Yang's method for the specific building.

Table 1. Rule of merging adjacent voxel groups. The seed voxel group is described by its principal direction, surface normal, top elevation $e^{t}_{s}$, center coordinates $o_{s}(x, y, z)$, and radius. The bottom-left voxel group is usually selected as the seed voxel group. The candidate voxel group in the seed voxel group's neighborhood is described by its principal direction, surface normal, top elevation $e^{t}_{c}$, and center coordinates $o_{c}(x, y, z)$. $S_{Min}$ represents the minimum Euclidean distance between the seed voxel group and the candidate voxel group, which indicates whether the two voxel groups are connected. $T_{e}$, $T_{o}$, and $T_{md}$ are the corresponding threshold values of each condition.

Table 2. The key thresholds and parameters of the proposed approach.

Table 3. Completeness and correctness of the extracted building points.
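
As a minimal sketch of how the two measures in this table could be computed from the correct, error, and missing points of Figure 11, the example below assumes the standard definitions: completeness is the number of correct points divided by the number of correct plus missing points, and correctness is the number of correct points divided by the number of correct plus error points. Representing points by integer IDs and the function name `completeness_correctness` are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def completeness_correctness(
    extracted_ids: Iterable[int], reference_ids: Iterable[int]
) -> Tuple[float, float]:
    """Compare automatically extracted point IDs with manually labeled ones."""
    extracted = set(extracted_ids)
    reference = set(reference_ids)

    correct = len(extracted & reference)   # extracted and truly building points
    error = len(extracted - reference)     # extracted but not building points
    missing = len(reference - extracted)   # building points that were not extracted

    completeness = correct / (correct + missing) if (correct + missing) else 0.0
    correctness = correct / (correct + error) if (correct + error) else 0.0
    return completeness, correctness


if __name__ == "__main__":
    # Hypothetical point IDs, for illustration only.
    comp, corr = completeness_correctness([1, 2, 3, 4], [2, 3, 4, 5, 6])
    print(f"completeness={comp:.2f}, correctness={corr:.2f}")
```

Using point IDs rather than coordinates avoids floating-point matching between the automatic and manual results, assuming both are labeled on the same raw point cloud.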

Table 4. Performance of the proposed method and the method of Yang et al. [37].