Reconstruction of Complex Roof Semantic Structures from 3D Point Clouds Using Local Convexity and Consistency

Abstract: Three-dimensional (3D) building models are closely related to human activities in urban environments. Due to the variations in building styles and the complexity of roof structures, automatically reconstructing 3D buildings with semantic and topological information remains a major challenge. In this paper, we present an automated modeling approach that semantically decomposes and reconstructs complex building light detection and ranging (LiDAR) point clouds into simple parametric structures, where each generated structure is an unambiguous roof semantic unit without overlapping planar primitives. The proposed method starts by extracting roof planes using a multi-label energy minimization solution, followed by constructing a roof connection graph associated with proximity, similarity, and consistency attributes. A progressive decomposition and reconstruction algorithm is then introduced to generate explicit semantic subparts and a hierarchical representation of an isolated building. The proposed approach is evaluated on two different datasets and compared with state-of-the-art reconstruction techniques. The experimental modeling results, including the assessment using the International Society for Photogrammetry and Remote Sensing (ISPRS) benchmark LiDAR datasets, demonstrate that the proposed modeling method can efficiently decompose complex building models into interpretable semantic structures.


Introduction
Buildings are the most prominent features in an urban environment. Owing to vast application demands, such as solar radiation estimation [1], visibility analysis [2], and disaster management [3], three-dimensional (3D) reconstruction and modeling have received intensive attention in city planning, geomatics, architectonics, computer vision, photogrammetry, and remote sensing. Rapid data acquisition technologies based on optical imagery and light detection and ranging (LiDAR) can produce increasingly dense and reliable point clouds, making it possible to automatically reconstruct 3D building models over large areas. During the past decades, various interactive [4] or automatic [5,6] 3D modeling approaches have been proposed to reconstruct 3D building models using satellite and aerial optical images [7-9], LiDAR [5,10], and combined images and LiDAR [11,12], resulting in full 3D or 2.5D building models at the scale of a city [13,14] or an individual building [4,15]. Even though much progress has been made in producing building models better and faster, the reliable and automatic reconstruction of detailed building models remains a challenging issue [16,17]. In particular, the reconstructed purely geometric model is usually a combination of planar patches or a set of polygons, which makes it difficult for disciplines such as urban planning and land management to further semantically interpret and edit the types and structures of buildings.
In this work, we present a novel unsupervised approach to reconstruct the point clouds into semantic structures (e.g., dormer, hipped roof), purely using the geometric constraints. Departing from previous studies, our approach can directly recognize and interpret meaningful roof semantic subparts and their hierarchical topology for a compound 3D building, which can be further used to enrich the building model library or construct public training data for supervised learning. The main contributions of this work are twofold: (1) a progressive grouping algorithm is applied to automatically decompose compound buildings into subparts, thereby generating a structured unit block without any independent overlapping elements, (2) a hierarchical topology tree model is introduced, as well as a roof connection graph and its decomposed subgraphs, to simplify the complexity of the building reconstruction.
The remainder of this paper is organized as follows. An overview of related unsupervised and supervised approaches for building reconstruction is given in Section 2. The detailed reconstruction steps are described in Section 3. Experimental results and discussions are presented in Sections 4 and 5, respectively. Conclusions and future work are summarized in Section 6.

State-of-the-Art Methods
Over the past few decades, 3D building reconstruction has received considerable attention, probably due to the advancement of photogrammetry and active sensors, producing a wealth of research on this broad topic. Among the wide variety of reconstruction methods, we distinguish two categories of building modeling, unsupervised and supervised methods; to the best of our knowledge, this is the first time the literature has been summarized in this way. In this section, the most relevant literature on 3D building reconstruction using airborne laser scanning (ALS) point clouds is discussed.

Unsupervised Methods
Unsupervised building reconstruction has been studied extensively in the fields of city planning, geomatics, architectonics, computer vision, photogrammetry, and remote sensing. These 3D building modeling approaches can be divided into data-driven and model-driven methods. A comparison of the data-driven and model-driven approaches can be found in Haala and Kada [18], and interested readers are referred to the review literature [17,19]. Due to insufficient input data and complex building types, 3D building reconstruction remains an open problem even if only simple flat roof surfaces are considered [5]. Thus, hybrid-driven methods, which integrate additional information from both data- and model-driven methods, are gradually gaining attention.
Data-driven approaches, also called non-parametric or bottom-up approaches, assume that a building is a polyhedral model that can be directly modeled from geometric information such as intersection and regularization. They usually start with the extraction of roof planar patches by region growing [20,21], feature clustering [22,23], model fitting [24,25], or global energy optimization [26-29], and then assemble the extracted roof planes to form a polygonal building model. To improve the shape of the reconstructed 3D models, regularization rules such as parallelism and perpendicularity are often applied, resulting in a compact 3D polygonal building model with roof ridges and boundaries. These data-driven methods [13,30-32] have succeeded in the reconstruction of simple Manhattan-like objects but are unstable in the presence of noisy or incomplete point clouds. To make better use of prior knowledge for building reconstruction, Zhou and Neumann [32] improved the quality of roof models by discovering the global regularities of the similarities between roof planar patches and roof boundaries, which can significantly reduce the complexity of 3D reconstructions. Poullis [33] developed a complete framework to automatically reconstruct urban building models from point clouds by combining a hierarchical statistical analysis of the geometric properties of the data with a fast energy minimization process for boundary extraction and refinement. To generate more detailed roof models, Dehbi et al. [34] proposed a novel method for roof reconstruction using active sampling, although it is limited to dormer types. The main advantage is that it

Supervised Methods
Supervised 3D building reconstruction approaches have gradually attracted widespread attention, especially with the emergence of convolutional neural networks (CNNs). Similar to two-dimensional image semantic labeling methods, they assign the most probable label to each 3D element (e.g., points, planes, roof subparts) using a labeling model learned from a large amount of training data. These labeled semantic features can be classified by machine-learning-based methods [55,56]. Random forests [13] and support vector machines [55] are often used to identify the main building components, such as floors, ceilings, and roofs, which can be further assembled into a semantically enriched 3D model. However, the inputs for these methods are encoded training features derived from local (e.g., surface area, orientation) and contextual (e.g., coplanarity, parallelism) information, which need to be designed by hand. Recently, emerging deep learning techniques have reached human-level performance in computer vision and natural language processing and have gradually been introduced to building reconstruction. Wichmann et al. [57] developed and released a training dataset named RoofN3D, which can be used to train CNNs for different 3D building reconstruction tasks. Axelsson et al. [58] presented a deep convolutional neural network to automatically classify roof types into ridge and flat roofs, which can further support the generation of a nationwide 3D landscape model; however, it cannot automatically interpret large buildings with small meaningful subparts. Zhang and Zhang [59] introduced a deep-learning-based approach that successfully reconstructs urban building mesh models at level of detail 2 (LOD2) according to the CityGML specification but cannot interpret building structures and roof topology. In addition, Yu et al. [60] developed a fully automatic 3D building reconstruction approach that can generate LOD1 building models over a large area but cannot generate complicated building structures. Although various supervised solutions have been proposed in recent years to reconstruct buildings with simple roof types, they are still hindered by the lack of public training data, especially for the semantic subparts of complex buildings.

Methodology
The framework of the proposed approach for reconstructing complex buildings from 3D point clouds, shown in Figure 1, encompasses four key components: data preprocessing (Section 3.1), roof plane extraction and semantic labeling (Section 3.2), building semantic decomposition (Section 3.3), and generation of building models (Section 3.4).

Data Preprocessing
During data preprocessing, regions of building blocks are first detected from the 3D point clouds. Taking the original point cloud as input, terrain points are first separated from non-terrain points using filtering approaches such as the adaptive TIN [61], the cloth simulation filter (CSF) [62], and two-step adaptive extraction [63]. To obtain building and vegetation point clouds, a height threshold (1.5-2.5 m) is used to identify the high-rise points among the non-terrain points; the obtained high-rise points are then used to extract the building point cloud using a top-down extraction approach [64]. Moreover, the extraction of the building point cloud benefits from corresponding image data, as points can be projected back onto the imagery and cleaned using a normalized difference vegetation index (NDVI) threshold (0.1-0.15). With the extracted building point cloud, a Euclidean clustering method is applied to group the points into individual clusters, and clusters with a small area (3-5 m²) are removed as tree clusters. The small-area threshold is determined by the point density and the minimum number of points per cluster; e.g., if the point density of the point cloud is 4 points/m², a cluster with 12 points indicates an area of approximately 3 m². After this process, the segmented individual buildings can be reconstructed. It should be emphasized that these parameters (height, NDVI, and area thresholds) are selected empirically, and the details of these operations and parameters are beyond the scope of this paper.
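The threshold arithmetic above can be illustrated with a minimal sketch. The function names, the default values (chosen from within the quoted ranges), and the rule that a cluster is kept when its estimated area reaches the threshold are assumptions made for illustration, not part of the described pipeline.

```python
def cluster_area_m2(n_points: int, density_pts_per_m2: float) -> float:
    """Approximate footprint area of a cluster from its point count."""
    return n_points / density_pts_per_m2


def is_candidate_building_point(height_m: float, ndvi: float,
                                min_height_m: float = 2.0,
                                max_ndvi: float = 0.1) -> bool:
    """Keep high-rise points whose NDVI suggests non-vegetation.

    Assumed defaults: 2.0 m lies in the 1.5-2.5 m range and 0.1 in the
    0.1-0.15 NDVI range quoted in the text.
    """
    return height_m > min_height_m and ndvi < max_ndvi


def drop_small_clusters(cluster_sizes, density_pts_per_m2: float,
                        min_area_m2: float = 3.0):
    """Return the sizes of clusters whose estimated area reaches the threshold."""
    return [n for n in cluster_sizes
            if cluster_area_m2(n, density_pts_per_m2) >= min_area_m2]
```

With a density of 4 points/m², `cluster_area_m2(12, 4.0)` gives 3.0 m², matching the example in the text.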

Roof Plane Extraction and Semantic Labeling
This is a preliminary step for building reconstruction: we first segment the roof point clouds into individual planes by minimizing a global energy function and then tag each plane semantically as roof or attachment (e.g., dormer, gable, hipped, chimney). Roof plane segmentation from point clouds is crucial to 3D building reconstruction and remains challenging due to noisy, incomplete, and outlier-ridden data. To obtain accurate and reliable roof planes, a multi-label optimization model [65] is applied, as presented in Equation (1).

The introduced model transforms the plane extraction problem into a best-matching problem by balancing different energy costs on geometric data errors (data cost), spatial smoothness coherence (smooth cost), and the number of planes (label cost). The objective of this optimization framework is to assign every point (data) to the most suitable plane (label). To obtain the initial candidate labels for the energy model, the input point cloud is first over-segmented into a set of patches by the Voxel Cloud Connectivity Segmentation (VCCS) algorithm [66] or our previous work [27], where each patch represents a local surface with centroid c, curvature f, and normal vector n. The initial candidate labels can be generated using the centroids and normal vectors of surface patches, which are selected from all segmented patches or only from patches with small curvature f. In addition, a subset of patch centroids c is randomly sampled to enrich the potential candidate labels.
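As a rough illustration of how candidate labels might be derived from over-segmented patches, the sketch below turns each low-curvature patch into a plane hypothesis through its centroid with its normal. The `Patch` and `PlaneLabel` types, the `max_curvature` value, and the function names are hypothetical; the random-sampling enrichment mentioned above is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Patch:
    centroid: Vec3
    normal: Vec3       # assumed to be unit length
    curvature: float


@dataclass
class PlaneLabel:
    normal: Vec3
    offset: float      # plane equation: normal . x + offset = 0


def plane_from_patch(patch: Patch) -> PlaneLabel:
    """Plane passing through the patch centroid with the patch normal."""
    offset = -sum(n * c for n, c in zip(patch.normal, patch.centroid))
    return PlaneLabel(patch.normal, offset)


def candidate_labels(patches: List[Patch],
                     max_curvature: float = 0.05) -> List[PlaneLabel]:
    """Use only nearly flat patches as initial plane hypotheses.

    max_curvature is an illustrative value, not one given in the text.
    """
    return [plane_from_patch(p) for p in patches if p.curvature <= max_curvature]
```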
The first term in Equation (1), the data cost, is a geometric error measurement calculated as the squared perpendicular distance from a point p to a potential label $L_p$. The second term in Equation (1) is the smoothness between neighboring point pairs, where the neighborhood can be obtained from a triangulated irregular network (TIN) or k-nearest neighbors (KNN). The indicator function δ(·) for adjacent points follows the Potts model [67]: it is set to 0 if a pair of points (p, q) belongs to the same label and to 1 otherwise, so that only label changes between neighbors are penalized. Intuitively, a pair of points that are closer together are more likely to be on the same plane; therefore, the weight function $\omega_{pq}$ can be set as an inverse function of the distance between adjacent points (p, q) located on a TIN edge.
Remote Sens. 2021, 13, 1946
The label cost term is a penalty on the number of input potential labels. To represent the input scene compactly, the use of fewer labels is encouraged. The proposed label cost is expressed in terms of $|L_i|$, the number of inlier points on the plane with index i. The proposed global energy optimization is an iterative process that follows the framework of Propose, Expand and Re-estimate Labels (PEaRL) [68] and terminates only when the energy no longer decreases, resulting in a set of labels for the input point cloud. The final planes are obtained by fitting the points with the same label index. Similar to the work of Pu and Vosselman [69], the plane semantic features can be further inferred from knowledge rules (area, orientation, position) as roof or attachment (e.g., dormer, gable, hipped, chimney). It should be mentioned that the plane semantics are optional for the subsequent decomposition process, and the semantics of roof planes can also be classified by a supervised method [70].
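To make the interplay of the three cost terms concrete, the following sketch evaluates a generic energy of this kind for a fixed labeling. It is not the paper's Equation (1): the smoothness term uses the standard Potts convention (a neighboring pair is penalized only when its labels differ) with an inverse-distance weight, and the label cost is simplified to an assumed constant penalty per used label. `smooth_weight` and `label_penalty` are hypothetical parameters.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[Vec3, float]   # (unit normal n, offset d) with n . x + d = 0


def point_plane_distance(x: Vec3, plane: Plane) -> float:
    """Perpendicular distance from point x to the plane."""
    n, d = plane
    return abs(n[0] * x[0] + n[1] * x[1] + n[2] * x[2] + d)


def total_energy(points: Sequence[Vec3], labels: Sequence[int],
                 planes: Sequence[Plane], edges: Sequence[Tuple[int, int]],
                 smooth_weight: float = 1.0, label_penalty: float = 1.0) -> float:
    # Data cost: squared point-to-plane distance for every point.
    data = sum(point_plane_distance(p, planes[l]) ** 2
               for p, l in zip(points, labels))
    # Smooth cost: Potts penalty on neighbor pairs with different labels,
    # weighted by the inverse of their Euclidean distance.
    smooth = 0.0
    for i, j in edges:
        if labels[i] != labels[j]:
            smooth += 1.0 / max(math.dist(points[i], points[j]), 1e-9)
    # Label cost (assumed form): constant penalty per label actually used.
    label = label_penalty * len(set(labels))
    return data + smooth_weight * smooth + label
```

In a PEaRL-style loop, an energy of this kind would be re-evaluated after each expansion and re-estimation step, and the iteration would stop once the value no longer decreases.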

Semantic Decomposition of Compound Building
The semantic decomposition generates building subparts using a progressive decomposition and grouping algorithm. For each complex building, a roof connection graph is first constructed from the extracted planar primitives, and a decomposing and grouping operation on the roof connection graph is then performed to generate subgraphs, which are potential roof semantic structures (e.g., dormer, gable, hipped, chimney).

Hierarchy Tree Representation of Complex Buildings
A general assumption of the proposed modeling algorithm is that a compound building roof can be represented by various simple and meaningful subparts. Although the styles of buildings are diverse, the basic units (subparts) are similar: a subpart (structure) is a visually pleasing box composed of two or more parametric plane primitives with semantic features. It should be noted that the plane semantics are allowed to be empty (∅). Moreover, a general hierarchical-tree-based representation for a complex building is introduced, as shown in Figure 2.
The root of the hierarchical tree, illustrated in Figure 2, is a 3D building model organized by roof semantic subparts, and the planar primitives and roof subparts are treated as leaf nodes and child nodes, respectively. For example, the roof attachment named vertical chimney in the second row is two pairs of parallel planes, while a dormer is a combination of two adjacent planes.
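One possible in-memory layout for such a hierarchical tree is sketched below, with the building as root, roof subparts as child nodes, and planar primitives as leaves. The class and field names are illustrative assumptions rather than the data structure used in this work.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanePrimitive:
    """Leaf node: one planar primitive; its semantic tag may be empty."""
    plane_id: int
    semantic: Optional[str] = None


@dataclass
class RoofSubpart:
    """Child node: a semantic subpart made of two or more primitives."""
    name: str
    planes: List[PlanePrimitive] = field(default_factory=list)


@dataclass
class BuildingModel:
    """Root node: a building organized by its roof subparts."""
    subparts: List[RoofSubpart] = field(default_factory=list)

    def leaf_count(self) -> int:
        return sum(len(s.planes) for s in self.subparts)


# A chimney (two pairs of parallel planes) and a dormer (two adjacent planes).
chimney = RoofSubpart("chimney", [PlanePrimitive(i, "attachment") for i in range(4)])
dormer = RoofSubpart("dormer", [PlanePrimitive(4), PlanePrimitive(5)])
building = BuildingModel([chimney, dormer])
```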

Construction of the Roof Connection Graph
To generate a reasonable decomposition and grouping for roof subpart extraction, the roof connection graph C, a weighted undirected connected graph, is built from the roof planes extracted in Section 3.2. The vertices (V) in C are roof planar primitives, and an edge E between two vertices represents spatial connectivity. In addition, local geometric convexity and consistency are calculated for each edge.
The Euclidean distance $F_{Dist}$ between adjacent planar primitives is first set as an attribute of the edge E. Psychophysical studies [71-73] suggest that the transition between convex and concave parts might be indicative of the separation between objects and/or their parts. In other words, concave-convex features are cues for decomposing objects into semantic subparts. Thus, edges in C are equipped with 3D concave or convex attributes to ensure the reliability and efficiency of building roof decomposition, and the local convexity $F_{Con}$ for adjacent planes is calculated by:

$$F_{Con} = \begin{cases} \text{convex}, & \theta_1 < \theta_2 \\ \text{concave}, & \text{otherwise} \end{cases} \qquad \cos\theta_i = \frac{\vec{n}_i \cdot \vec{d}}{\lVert \vec{n}_i \rVert \, \lVert \vec{d} \rVert}, \quad i = 1, 2$$

where $\vec{n}_1$, $\vec{n}_2$ and $\vec{x}_1$, $\vec{x}_2$ are the normals and centroids of the adjacent planes, respectively. The angle $\theta_i$ of each normal to the vector $\vec{d} = \vec{x}_1 - \vec{x}_2$ joining the centroids can be calculated using the dot product. For a convex connection, the angle $\theta_1$ is smaller than $\theta_2$, while for a concave connection, the opposite is true. The convexity/concavity are shown in Figure 3.
$F_{Dist}$ and $F_{Con}$ can be directly marked as attributes for each edge E and stored in the roof connection graph C, that is, C = {V, E}, E = [$F_{Dist}$, $F_{Con}$]. Finally, the generated graph C will be the foundation for the next progressive decomposition and grouping operation.
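The convexity test can be sketched directly from the angle rule: a connection is convex when the angle between the first normal and the centroid-joining vector is smaller than that of the second normal, which is equivalent to comparing the two cosines. The helper names and the dictionary layout of the edge attributes are assumptions, and the distance value is taken as an input because its exact computation is not specified here.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cos_angle(n: Vec3, d: Vec3) -> float:
    """Cosine of the angle between a normal n and the joining vector d."""
    return _dot(n, d) / (math.sqrt(_dot(n, n)) * math.sqrt(_dot(d, d)))


def is_convex(n1: Vec3, x1: Vec3, n2: Vec3, x2: Vec3) -> bool:
    """True when theta_1 < theta_2, i.e. cos(theta_1) > cos(theta_2).

    Assumes the two centroids are distinct so that d is non-zero.
    """
    d = (x1[0] - x2[0], x1[1] - x2[1], x1[2] - x2[2])
    return _cos_angle(n1, d) > _cos_angle(n2, d)


def edge_attributes(n1: Vec3, x1: Vec3, n2: Vec3, x2: Vec3,
                    f_dist: float) -> Dict[str, object]:
    """Attributes stored on one edge of the roof connection graph."""
    return {"F_Dist": f_dist, "F_Con": is_convex(n1, x1, n2, x2)}
```

For two gable planes meeting at a ridge, the outward normals tilt away from each other and the test reports a convex connection; for two planes meeting in a valley, it reports a concave one.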

Progressive Decomposition for Subparts Extraction
Commonly used methods for roof subpart extraction search for and match sub-graph elements from a predefined library, and they are usually hampered by significant problems such as the completeness of the library, ambiguous definitions, and erroneous sub-graph recognition. When observing objects, we attempt to group similar elements, recognize patterns, and simplify complex images. To achieve sub-convex building subparts, visual perception constraints (e.g., proximity, similarity, consistency) derived from the Gestalt principles are applied [71-73]. The predefined Euclidean distance $F_{Dist}$ and local convexity $F_{Con}$ are the proximity and similarity constraints, respectively. The proximity constraint is straightforward: planes that are close to each other are more likely to be treated as one group. The local convexity expresses that adjacent primitives sharing visual features (such as shape, convexity, and concavity) can be grouped into a perceptual group. Therefore, the consistency constraint $F_{CC}$ (shown in Figure 4), which favors establishing a sub-convex box, is introduced to represent continuous concavity/convexity during the progressive decomposition.
As presented in Figure 4, adjacent planes A and B can be grouped only if: (1) the edge between planes A and B is labeled as convex, and (2) the similarity $F_{Con}$ between the plane pairs (A-S, B-S) or (C-S, B-S) is exactly the same, where S is a shared plane and C is the neighbor of the planes A and B that need to be grouped. These two conditions define the consistency constraint criterion $F_{CC}$. To obtain the roof subparts, a progressive iterative decomposition of the roof connection graph is introduced, as illustrated in Algorithm 1.

Algorithm 1. Progressive Decomposition
Input: a roof connection graph G and roof planar primitives PS = [$L_p$]
Output: roof subparts [$G_{Sub}$] and an initial building hierarchical tree T
While PS is not empty do
    Find a planar primitive $L_{p0}$ with the largest area from PS
    Create an empty roof plane set $G_{Sub}$ and initialize it with plane $L_{p0}$
    Generate $G_{Sub}$ by iteratively decomposing G using ($F_{Dist}$, $F_{Con}$, $F_{CC}$)
    Update the nodes of the building hierarchical tree T from $G_{Sub}$
    For $L_{pi}$ ∈ $G_{Sub}$ do
        remove plane $L_{pi}$ from PS
        update the nodes and edges of G
    End For
End While

The extraction of building subparts as well as the hierarchical tree is carried out in a progressive manner, which aims to find the best set of roof planar primitives that potentially belong to the same sub-convex box. Each iteration of the decomposition and grouping that generates a building roof structure proceeds as follows:
(1) Start from the planar primitive $L_{p0}$ that has the largest geometric area, and initialize the current group $G_{Sub}$ = [$L_{p0}$];
(2) Create a candidate plane set $G_{candidate}$ of all primitives connected to the last added element of $G_{Sub}$, i.e., primitives for which an edge exists in G. If $G_{candidate}$ is empty or all candidate elements are already grouped, the current grouping loop ends;
(3) Calculate the convexity $F_{Con}$ and consistency $F_{CC}$ for each element in $G_{candidate}$, and remove those that do not meet these constraints;
(4) Sort the remaining candidate primitives in $G_{candidate}$ according to the closest connected distance $F_{Dist}$ and the same semantics, and group the candidate element with the minimum $F_{Dist}$ into $G_{Sub}$. If $G_{candidate}$ is empty, the decomposition terminates;
(5) Go to step (2).
When a building roof subpart has been grouped according to the aforementioned decomposition steps, the nodes of the building hierarchical tree are generated, and the information of the grouped primitives is simultaneously updated in G. The iterative decomposition terminates when all input roof planes are grouped. Once the decomposition of G is finished, sub-graphs of different building parts are obtained, as shown in Figure 5.
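The grouping loop described above can be approximated by the following simplified sketch. It is a greedy reading of the steps under stated assumptions: the graph is a dictionary keyed by plane pairs with `(F_Dist, is_convex)` values, the consistency check $F_{CC}$ is passed in as a hypothetical callable that defaults to always accepting, the semantic tie-breaking in step (4) is omitted, and already grouped planes are skipped instead of being deleted from the graph.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

EdgeAttr = Tuple[float, bool]                 # (F_Dist, edge is convex)
Graph = Dict[FrozenSet[str], EdgeAttr]


def always_consistent(group: List[str], candidate: str) -> bool:
    """Placeholder for the F_CC test; a real check would inspect shared neighbors."""
    return True


def progressive_decomposition(
        areas: Dict[str, float], graph: Graph,
        consistent: Callable[[List[str], str], bool] = always_consistent,
) -> List[List[str]]:
    remaining = set(areas)
    subparts: List[List[str]] = []
    while remaining:
        # Step (1): seed with the largest remaining plane (sorted for determinism).
        seed = max(sorted(remaining), key=lambda p: areas[p])
        group = [seed]
        while True:
            last = group[-1]
            candidates = []
            # Steps (2)-(3): ungrouped neighbors of the last added plane that
            # are convexly connected and pass the consistency check.
            for pair, (dist, convex) in graph.items():
                if last not in pair:
                    continue
                (other,) = pair - {last}
                if (other in remaining and other not in group
                        and convex and consistent(group, other)):
                    candidates.append((dist, other))
            if not candidates:
                break
            # Step (4): add the candidate with the smallest F_Dist.
            group.append(min(candidates)[1])
        subparts.append(group)
        remaining.difference_update(group)
    return subparts


areas = {"A": 50.0, "B": 48.0, "C": 5.0, "D": 5.0}
graph = {
    frozenset({"A", "B"}): (0.0, True),    # main roof ridge
    frozenset({"A", "C"}): (0.1, False),   # dormer meets roof concavely
    frozenset({"A", "D"}): (0.1, False),
    frozenset({"C", "D"}): (0.0, True),    # dormer ridge
}
print(progressive_decomposition(areas, graph))  # [['A', 'B'], ['C', 'D']]
```

In this toy graph the two main roof planes form one subpart and the two dormer planes another, because the concave edges between roof and dormer block the grouping.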

Generation of 3D Building Models
Due to the limits of acquisition devices and scene occlusions, these subparts cannot always be correctly interpreted or identified as unambiguous building structures. Thus, a refinement step is introduced for each grouped subpart, using the constraints of symmetry and closure to produce a visually pleasing final 3D model. As each primitive connected to its adjacent planes ought to be convex, closure means that the projected primitives should be connected in sequence to form a closed loop. Local symmetry is used to fill in missing primitives or to extend ghost primitives based on architectural aesthetics. As shown in Figure 6, it is determined by whether the normal vector projections of adjacent planes are parallel to each other.
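A minimal sketch of the local symmetry test follows. It assumes the ground plane is the xy-plane, treats antiparallel projections (as on a gable roof) as parallel, and uses an angular tolerance `tol_deg` that is a hypothetical parameter not given in the text.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def is_symmetric(n1: Vec3, n2: Vec3, tol_deg: float = 5.0) -> bool:
    """Check whether the ground projections of two normals are parallel."""
    p1 = (n1[0], n1[1])            # projection onto the xy-plane
    p2 = (n2[0], n2[1])
    len1 = math.hypot(*p1)
    len2 = math.hypot(*p2)
    if len1 < 1e-9 or len2 < 1e-9:
        # A horizontal plane has no meaningful projected direction.
        return False
    # |sin| of the angle between the projections via the 2D cross product.
    cross = p1[0] * p2[1] - p1[1] * p2[0]
    sin_angle = abs(cross) / (len1 * len2)
    return sin_angle <= math.sin(math.radians(tol_deg))
```

Two gable planes with normals (-1, 0, 1) and (1, 0, 1) project to opposite directions along the x-axis and are reported as symmetric, whereas two adjacent hipped-roof planes with normals (1, 0, 1) and (0, 1, 1) are not.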

Due to the limits of acquisition devices and scene occlusions, these subparts cannot always be correctly interpreted or identified as unambiguous building structures. Thus, a refinement step is introduced for each grouped subpart, using the constraints of symmetry and closure to produce a visually pleasing final 3D model. Since each primitive connected to its adjacent planes ought to be convex, closure means that the projected primitives should be connected in sequence and form a closed loop. Local symmetry is used to fill in missing primitives or to extend "ghost" primitives based on architectural aesthetics. For any pair of adjacent primitives in Figure 6, we first project their normal vectors $(\vec{n}_1, \vec{n}_2)$ onto the ground plane and then check whether the projected normal vectors $(\vec{n}_{p1}, \vec{n}_{p2})$ are mutually parallel with respect to their intersection: if they are, the adjacent planes are symmetric. The details of enhancing the decomposed building subparts are as follows:
(1) For any extracted subpart, we first extract its corresponding sub-node and inlier leaf nodes in Figure 5d.
(2) Calculate the local symmetry indicators of adjacent primitives and apply the symmetry constraint.
(3) Perform a closed hull loop detection, stitching together the projected primitives in sequence based on closure perception laws, followed by an add-and-union primitive operation.
Subpart labeled as roof: Check the outer border ring of the primitives projected onto the ground plane. If the loop is a concave hull, which means that the closed loop is incomplete, an "extend ghost" primitive is created by searching for a connected plane in the roof connection graph or by stitching the vertices of the nearest primitives along the loop. In particular, if the newly added "extend ghost" primitive is parallel to its adjacent primitive, a plane union operation is performed.
Subpart labeled as dormer: The missing vertical primitives are usually handled along the boundary loop, where the projection plane is perpendicular to the normal vector.
Subpart labeled as chimney: There are two types of chimney: column and cone. The missing primitives are filled in along the boundary, and the projection plane is the ground plane. The difference between the two is that, for a cone, the filled primitives share the same fixed source vertex.
(4) A similar regularization process [74] is applied to the refined building subparts to produce a 3D geometric vector boundary, and the changed information in the hierarchical tree is updated synchronously.
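The local symmetry check on projected normal vectors can be sketched as follows. This is an illustrative reading of the description above, not the paper's code: the function names and the angular tolerance `angle_tol_deg` are assumptions, and anti-parallel projections (as for the two slopes of a gable roof) are treated as parallel.

```python
import math
from typing import Sequence, Tuple


def project_to_ground(normal: Sequence[float]) -> Tuple[float, float]:
    """Project a 3D normal onto the ground (x-y) plane by dropping z."""
    return (normal[0], normal[1])


def are_symmetric(
    n1: Sequence[float], n2: Sequence[float], angle_tol_deg: float = 5.0
) -> bool:
    """Return True if the projected normals are parallel or anti-parallel.

    angle_tol_deg is a hypothetical tolerance; the paper does not state one.
    """
    p1 = project_to_ground(n1)
    p2 = project_to_ground(n2)
    len1 = math.hypot(p1[0], p1[1])
    len2 = math.hypot(p2[0], p2[1])
    if len1 < 1e-9 or len2 < 1e-9:
        # A horizontal plane has no meaningful projected normal direction.
        return False
    # |sin(angle)| from the 2D cross product is zero for parallel vectors.
    cross = p1[0] * p2[1] - p1[1] * p2[0]
    sin_angle = abs(cross) / (len1 * len2)
    return sin_angle <= math.sin(math.radians(angle_tol_deg))


# Two opposite slopes of a gable roof: projections point in opposite directions.
gable_pair = are_symmetric((0.0, 0.5, 0.866), (0.0, -0.5, 0.866))
# Two neighbouring faces of a hipped roof: projections are perpendicular.
hip_pair = are_symmetric((0.5, 0.0, 0.866), (0.0, 0.5, 0.866))
```

Using the cross product rather than the dot product keeps the test independent of whether the two normals point toward or away from each other.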

Description of the Datasets
The proposed approach has been implemented with the Computational Geometry Algorithms Library (CGAL) [75] and the Point Cloud Library (PCL) [66], and tested on datasets with different point densities and urban characteristics. An overview of the tested datasets is shown in Figure 7. The first dataset is the Guangdong data from China, which has a high point density and buildings of various shapes and sizes. The second is the NYU ALS dataset released by the Center for Urban Science and Progress of New York University [76]. The last is the widely adopted benchmark dataset from the ISPRS Test Project on Urban Classification and 3D Building Reconstruction [16], located in the city of Vaihingen, Germany.
The Guangdong dataset was obtained in 2016 using the Trimble Harrier 68i at an average height of 800 m. It is located in a rural region covering an area of approximately 340 × 360 m² and includes 83 buildings with 257 planes of various shapes and sizes. The point density is approximately 13 points/m², but some areas are missing because of occlusion, which makes current 3D reconstruction methods prone to failure.
The NYU dataset is high-density ALS data for urban areas and contains a complex set of roof types, such as multi-layered and flat roofs, with a point density of approximately 123 points/m². The ISPRS benchmark datasets in Areas 1-3 were obtained by a Leica ALS50 system in 2008 with a point density of 4-7 points/m². There are 37 historic buildings with complex structures and irregular boundaries in Area 1, while Area 2 is characterized by high-rise residential buildings. The roof boundaries are very complex, and the gaps between adjacent roofs have large height differences. Area 3 is a purely residential area, including 56 buildings with many small roof structures. Moreover, the modeling results derived from the Vaihingen benchmark dataset can be evaluated by ISPRS and compared with other methods using a unified standard [16]. As the assessment by ISPRS has been terminated, we report the ISPRS evaluation for Areas 1 and 3, and the other three datasets are assessed using the same geometric errors.

Results of Model Reconstruction
From the aforementioned datasets, a series of representative buildings is selected to validate the proposed approach. The compound buildings illustrated in Figure 8 are reconstructed from a set of basic building subparts, each of which is a combination of planar primitives.
It can be seen from Figure 8 that the generated compound buildings in part (a) are assembled from the semantic building units in part (c), including hipped roofs, dormers, etc. In part (b), the 3D wireframe of each reconstructed building is organized as a hierarchical topology tree of the reconstructed building subparts in part (c). These generated building subparts with an explicit topology can be further used to enrich the building model library or to construct public training data for supervised learning. The proposed approach aims to correctly and automatically reconstruct building subparts, and the additional semantics are inferred from the dominant semantics of the grouped planes. Accurately interpreting the semantics of various building styles is beyond the scope of this paper. In addition, the final 3D models are illustrated in Figure 9.
Unlike the ISPRS benchmark data (Area 1 and Area 3), the Guangdong data is a private test dataset and the public NYU data is a non-standard dataset; thus, the standard consistency metrics cannot be assessed by ISPRS. In addition, the assessment of ISPRS Area 2 is missing because the ISPRS evaluation has stopped. Therefore, these three datasets are evaluated using a simple internal quality measure and visual judgment.
The results of the visual judgment are shown in Figure 9a-c, while the internal quality measures are the geometric reconstruction error and the rate of fully reconstructed buildings. The geometric reconstruction error, defined as the average distance from a point to its reconstructed plane, is approximately 0.033 m (Guangdong), 0.021 m (NYU), and 0.2 m (ISPRS Area 2). In addition, a total of 77 buildings (252 roof planes) were successfully reconstructed for the Guangdong data, i.e., 92.7% of the original 83 buildings were fully reconstructed. The buildings in the public NYU and ISPRS Area 2 datasets are fully reconstructed, and the complex roof types in urban areas, such as overhanging roofs, multi-layer roofs, and flat roofs, are successfully modelled. An overhanging roof is usually a single plane in the constructed roof connection graph and can be easily grouped, whereas complex multi-layer roofs can be reconstructed as a variety of different roof structures, and flat roofs with different heights are fully modelled as different parts in the testing areas. The facades in the NYU dataset are ignored in the current scheme.
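The geometric reconstruction error described above, the average point-to-plane distance, can be computed as in the following minimal sketch. The plane representation (a, b, c, d) with ax + by + cz + d = 0 and the function names are assumptions made for illustration; the sample points are made up and are not taken from any of the tested datasets.

```python
import math
from typing import Iterable, Sequence


def point_plane_distance(point: Sequence[float], plane: Sequence[float]) -> float:
    """Unsigned distance from a 3D point to the plane ax + by + cz + d = 0."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0.0:
        raise ValueError("plane normal must be non-zero")
    x, y, z = point
    return abs(a * x + b * y + c * z + d) / norm


def mean_plane_error(points: Iterable[Sequence[float]], plane: Sequence[float]) -> float:
    """Average point-to-plane distance over the points assigned to one plane."""
    distances = [point_plane_distance(p, plane) for p in points]
    if not distances:
        raise ValueError("at least one point is required")
    return sum(distances) / len(distances)


# Made-up example: three points near the horizontal plane z = 1.
sample_points = [(0.0, 0.0, 1.1), (1.0, 1.0, 0.9), (2.0, 0.0, 1.0)]
sample_error = mean_plane_error(sample_points, (0.0, 0.0, 2.0, -2.0))
```

Dividing by the norm of (a, b, c) means the plane coefficients do not need to be normalized beforehand.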
Furthermore, the ISPRS benchmark datasets of Vaihingen (Area 1 and Area 3) allow us to use external reference data and to assess the results against other modeling methods according to unified criteria [16]. The results are listed in Table 1. It can be seen from Table 1 that the proposed method correctly reconstructed 201 of 202 roof planes in Area 1 (99.5%). For Area 3, the number of correctly reconstructed planes is 130, reaching 97.7%. The most common reason for the false reconstructions (FP) is the lack of sufficient points.

Visual Analysis of the Decomposition Results
In this section, the building subparts decomposed by the proposed approach, as shown in Figure 10, are compared with those of commonly used building reconstruction methods. The different 3D models generated from the same compound building highlight the novelty of our approach. It can be seen from part (b) of Figure 10 that the proposed approach achieves more unambiguous and meaningful building subparts. Verma et al. [51] used an exhaustive search to fit the point clouds to the best-matched predefined simple GU, GL, and GI models, as shown in part (c). The final matched model is limited to a simple polygon model and cannot be used flexibly because it requires a more complex building library to be defined in advance. Xiong et al. [5] defined an improved roof topology graph to reconstruct 3D building models and obtained the inner and outer corners from the concurrent planes or boundary points, thereby forming a combined building model linked to all inner and outer corners, as shown in part (d). It turns out to be more adaptive than similar work [77]. However, it is difficult to obtain the topological relationships of the different corners and connected lines. The geometric models generated by minimum cycle analysis need to be checked because a corner may not be expressed or matched by the predefined minimum cycles. In addition, the semantics of the matched roof components are always omitted. In contrast to the aforementioned approaches, the proposed automatic 3D modeling approach reconstructs semantic building subparts, which can be easily interpreted as building structures. The decomposition results in Figure 10b are different hipped roofs, which can be easily recognized by a human observer. Each roof subpart is a combination of parametric planar primitives and can be further used to assemble the hierarchy tree of a building.
Compared with the previous approach [48], the current method has been improved to generate structural building models with the introduced semantics. Plane semantics are added to the roof connection graph, the semantic decomposition, and the roof subpart refinement used to generate 3D building models, which is more helpful for reconstructing structural building subparts. The decomposed results can be interpreted as different semantic structures, where the semantics are inferred from the largest number of semantic planes. By introducing semantics into the iterative decomposition and grouping algorithm, the method can be easily extended to house modeling from multi-source point clouds. For example, a ground-based point cloud can provide more details of building façades, so that more refined house models in LOD3 and LOD4 can be reconstructed by combining these different point clouds. A visual comparison of the proposed and the previous approach is shown in Figure 11; the key difference and improvement is whether the grouped roof subparts can be interpreted as different semantic structures.
Furthermore, the roof subparts decomposed by the proposed approach are basic building units without overlapping elements in the reconstruction process and can produce more unambiguous 3D models, as presented in Figure 12.
It can be seen from Figure 12b that the 3D building subparts generated by the proposed approach avoid multiple matching of the same building element. In standard matching methods [5,77], the roof topology graph is matched against different forms in a predefined library and decomposed into elementary graphs. These automatically recognized subgraphs make it possible to assign semantics to all extracted building planar primitives and to assemble them into an appropriate 3D model. However, such standard-matching reconstruction methods inevitably rebuild the same roof elements repeatedly, resulting in overlapping roof primitives, as shown in the middle of Figure 12c.

Performance Analysis of Multi-Label Energy Optimization
To further investigate the effects of the global energy-based optimization procedure, we recorded the iterations and runtimes, as shown in Table 2. It can be seen from Table 2 that the number of iterations stays within a low range, which suggests that the designed cost functions are stable and balanced. As the iterations proceed, the energy drops sharply, leading to quick convergence. Compared with RANSAC plane fitting, the running time is relatively long because the optimization is performed on each point. One possible improvement is to change the assignment problem from "point-to-plane" to "supervoxel-to-plane". Moreover, the results of roof plane extraction in Area 1 and Area 3 are compared with those of a traditional multi-model fitting method, RANSAC, as shown in Figure 13. It can be seen from Figure 13 that the global energy-optimized approach extracts roof planes effectively. Compared with a traditional multi-model fitting method such as RANSAC, the proposed approach can overcome inconsistencies such as noise and missing data in plane transitions and is more beneficial for constructing the adjacency relationships between roof planes.
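For readers unfamiliar with the baseline, a generic single-plane RANSAC fit is sketched below. It is a textbook version written for illustration, not the RANSAC configuration used in the comparison and not the proposed energy minimization; the threshold, iteration count, seed, and synthetic points are all assumptions.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]


def plane_from_points(p: Point, q: Point, r: Point) -> Optional[Plane]:
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:
        return None
    a, b, c = nx / norm, ny / norm, nz / norm
    return (a, b, c, -(a * p[0] + b * p[1] + c * p[2]))


def ransac_plane(
    points: Sequence[Point],
    threshold: float = 0.05,
    iterations: int = 200,
    seed: int = 0,
) -> Tuple[Optional[Plane], List[int]]:
    """Return the plane with the most inliers and the indices of those inliers."""
    rng = random.Random(seed)
    best_plane: Optional[Plane] = None
    best_inliers: List[int] = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample
        a, b, c, d = plane
        inliers = [
            i
            for i, (x, y, z) in enumerate(points)
            if abs(a * x + b * y + c * z + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic data: a 5 x 5 grid on the plane z = 2 plus three outliers.
cloud: List[Point] = [(float(x), float(y), 2.0) for x in range(5) for y in range(5)]
cloud += [(0.0, 0.0, 5.0), (1.0, 1.0, 7.0), (2.0, 3.0, 9.0)]
fitted_plane, fitted_inliers = ransac_plane(cloud)
```

Because each hypothesis is scored independently of its neighbours, a fit of this kind has no notion of spatial smoothness, which is one reason points near plane transitions can be assigned inconsistently.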

Accuracy Assessments on ISPRS Benchmark Dataset
The geometric accuracy of the reconstructed 3D models derived from the Vaihingen benchmark data in Areas 1 and 3 is evaluated by ISPRS using standardized validation methods. The metrics of completeness, correctness, and quality, defined by Rutzinger et al. [78], are evaluated based on the mutual overlap with the reference data. The quality results are shown in Figure 14.
Figure 14. The quality metrics of the ISPRS benchmark dataset.
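For reference, the three metrics are commonly computed from the counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses these commonly cited definitions; the function name and the counts in the example are hypothetical and are not values from Table 1 or Figure 14.

```python
from typing import Dict


def evaluation_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Completeness, correctness, and quality from TP, FP, and FN counts."""
    if tp < 0 or fp < 0 or fn < 0:
        raise ValueError("counts must be non-negative")
    if tp == 0:
        # Avoid division by zero when nothing was matched.
        return {"completeness": 0.0, "correctness": 0.0, "quality": 0.0}
    return {
        "completeness": tp / (tp + fn),  # share of reference entities found
        "correctness": tp / (tp + fp),   # share of reconstructed entities that are right
        "quality": tp / (tp + fp + fn),  # combines both error types
    }


# Hypothetical counts, chosen only to show the arithmetic.
example_metrics = evaluation_metrics(tp=90, fp=5, fn=10)
```

Quality is never larger than either completeness or correctness, since its denominator includes both kinds of error.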
The quality of per-area level for Area 3 reaches 96.0%, while Area 1 large than 93.4%. These high precision results can be benefited from the proposed global optimization, which can preserve the correct segmentation at plane transition regions with sparse points. In addition, the comparison of the reconstructed planes with the reference information is illustrated in Figure 15, where the 3D information has been converted to a label image by ISPRS. The comparative verification ( Figure 15) indicates that 3D building components are successfully achieved during the reconstruction process. Buildings that are not correctly reconstructed are one and three for Area 1 and 3, respectively. These failures, assessed as False Positive (FP), are filled within the buildings that undetected in the preprocessing process of point cloud classification, because houses are surrounded by trees, which makes it difficult to extract building point cloud. Moreover, the geometrical accuracy of state-of-the-art methods described on the ISPRS website is selected for comparison, as presented in Figure 16. The comparative verification ( Figure 15) indicates that 3D building components are successfully achieved during the reconstruction process. Buildings that are not correctly reconstructed are one and three for Area 1 and 3, respectively. These failures, assessed as False Positive (FP), are filled within the buildings that undetected in the preprocessing process of point cloud classification, because houses are surrounded by trees, which makes it difficult to extract building point cloud. Moreover, the geometrical accuracy of state-ofthe-art methods described on the ISPRS website is selected for comparison, as presented in Figure 16. The comparative verification ( Figure 15) indicates that 3D building components are successfully achieved during the reconstruction process. Buildings that are not correctly reconstructed are one and three for Area 1 and 3, respectively. 
These failures, assessed as False Positive (FP), are filled within the buildings that undetected in the preprocessing process of point cloud classification, because houses are surrounded by trees, which makes it difficult to extract building point cloud. Moreover, the geometrical accuracy of state-of-the-art methods described on the ISPRS website is selected for comparison, as presented in Figure 16.  [79], ITCX3 [5], YOR [80], TUD2 [50], and HRTT [54] are shorts for participant.
The accuracy of the final generated models may be affected by a variety of factors, such as building detection and segmentation, the strategy of reconstruction. Additionally, an excellent reconstruction algorithm is to find a balance to generate building models. From most of the evaluation methods presented in Figure 16, one of the two indicators (RMS and RMSZ) exceeds the median value, while the other is obviously reduced. In terms of quantitative results, none of the methods is significantly better than others. For the proposed reconstructed approach, the average horizontal error is 0.8 m (Area 1) and 0.6 m (Area 3), while the vertical error is 0.3 m (Area 1) and 0.29 m (Area 3). Even though the two metrics of the proposed method are not optimal, they are similar to the median value, which means that we have achieved a balance between the reconstructed RMS and RMSZ. The main reason for achieving the balance depends on the global optimization of roof plane extraction, and more importantly, it is easy to obtain an unambiguous principal direction of regularization from each decomposed building subpart.

Conclusions
In this paper, we present a novel method for complex building reconstruction from 3D point clouds using the local geometric constraints. The output of the reconstruction is a combination of unambiguous unit blocks with no overlapping elements, which are assembled in a hierarchical topology tree. By first constructing a roof connection graph using the extracted roof planar primitives, we developed semantic-specific reconstruction strategies with local geometric constraints to obtain visually attractive building models. The key aim is to decompose a compound building model into semantic subparts with fixed planar parameters and topological relationships, through a progressive hierarchical grouping operation.
The performed reconstruction experiments indicate that the proposed approach can simplify the reconstruction process and generate a combination of gabled or hipped roofs with precisely reconstructed geometric features. Moreover, these generated building subparts can be further used to enrich the building of a model library or construct public training data for supervised reconstruction. However, the proposed modeling scheme for building reconstruction has some limitations, leading to the failure of the generated 3D models. These limitations include the lack of adjacent roof segments, Figure 16. Geometrical accuracy comparison of the reconstructed models. The CKU [79], ITCX3 [5], YOR [80], TUD2 [50], and HRTT [54] are shorts for participant.
The accuracy of the final models may be affected by a variety of factors, such as building detection, segmentation, and the reconstruction strategy. A good reconstruction algorithm should therefore strike a balance between horizontal and vertical accuracy. For most of the evaluated methods presented in Figure 16, one of the two indicators (RMS and RMSZ) exceeds the median value, while the other is clearly lower. In terms of quantitative results, none of the methods is significantly better than the others. For the proposed reconstruction approach, the average horizontal error is 0.8 m (Area 1) and 0.6 m (Area 3), while the vertical error is 0.3 m (Area 1) and 0.29 m (Area 3). Even though neither metric of the proposed method is the best, both are close to the median value, which means that we have achieved a balance between RMS and RMSZ. This balance results mainly from the global optimization of roof plane extraction and, more importantly, from the fact that an unambiguous principal direction for regularization is easy to obtain from each decomposed building subpart.
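As a reading aid for the two indicators discussed above, the following minimal sketch shows how a root-mean-square value can be computed once horizontal (planimetric) distances and height differences between a reference and a reconstructed model are available. The function name `rms`, the variable names, and all numbers are hypothetical and chosen only for illustration; they are not benchmark data, and the way the residuals themselves are derived in the ISPRS evaluation is not reproduced here.

```python
import math


def rms(residuals):
    """Return the root mean square of a sequence of residuals.

    The result has the same unit as the input values (here: metres).
    """
    values = list(residuals)
    if not values:
        raise ValueError("at least one residual is required")
    return math.sqrt(sum(r * r for r in values) / len(values))


# Hypothetical residuals in metres, for illustration only.
# Horizontal distances between reference and reconstructed roof boundaries:
planimetric_distances = [0.5, 0.9, 0.7, 1.0]
# Signed height differences between reference and reconstructed roof surfaces:
height_differences = [0.2, -0.4, 0.3, -0.1]

rms_xy = rms(planimetric_distances)  # horizontal indicator (RMS)
rms_z = rms(height_differences)      # vertical indicator (RMSZ)

print(f"RMS  = {rms_xy:.2f} m")
print(f"RMSZ = {rms_z:.2f} m")
```

Because both values are in metres, a method can be placed relative to the median of each indicator separately, which is the sense of "balance" used in the paragraph above.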

Conclusions
In this paper, we present a novel method for reconstructing complex buildings from 3D point clouds using local geometric constraints. The output of the reconstruction is a combination of unambiguous unit blocks with no overlapping elements, which are assembled in a hierarchical topology tree. By first constructing a roof connection graph from the extracted roof planar primitives, we developed semantic-specific reconstruction strategies with local geometric constraints to obtain visually attractive building models. The key aim is to decompose a compound building model into semantic subparts with fixed planar parameters and topological relationships through a progressive hierarchical grouping operation.
The reconstruction experiments indicate that the proposed approach can simplify the reconstruction process and generate a combination of gabled or hipped roofs with precisely reconstructed geometric features. Moreover, the generated building subparts can be further used to enrich a building model library or to construct public training data for supervised reconstruction. However, the proposed modeling scheme has some limitations that can cause the generated 3D models to fail. These limitations include missing adjacent roof segments, points that are too sparse for the local symmetry processing, and the reconstruction of free-form objects. For future work, there are some possible improvements. For example, higher density and quality points