1. Introduction
In recent years, the rapid development of the ‘smart city’ [1] and digital twin [2] has placed increasingly high demands on the accuracy and update speed of urban building models. As a product of the deep integration of UAV and photogrammetry technology, oblique photography can collect high-precision multi-view stereo images conveniently and cheaply, and it is becoming the main means for the rapid updating of urban spatial information [3,4].
Three-dimensional building information recovery based on multi-view stereo (MVS) images is relatively mature. Many open source MVS frameworks such as COLMAP, OpenMVG+OpenMVS, and AliceVision can generate dense point clouds automatically [5,6,7,8]. Meanwhile, algorithms for reconstructing building mesh models from point cloud data have been studied extensively, ranging from basic Delaunay triangulation [9] and marching cubes [10] to Poisson reconstruction [11]. MeshLab [12], Geomagic Wrap [13], Agisoft Metashape [14], VisualSFM [15], and other software also provide meshing techniques. These algorithms and software use the triangular facet as the representation primitive and maximize authenticity by approximately fitting building surfaces.
As a common geometric representation, the mesh model fits the surface details of buildings well, has a realistic visual effect, and is not limited by the type of building. However, many issues remain. A large number of triangular facets that carry no high-level structural information about the building makes transmission, processing, and display slow, so it is difficult to meet the needs of online real-time updates. In addition, limited accuracy in multi-view stereo image acquisition and in the generation of the triangulated network model introduces considerable noise and many outliers, which makes subsequent processing and application difficult. To reduce the number of facets, different mesh simplification methods have been derived based on the strategies of mesh element deletion or polygonization.
The methods based on mesh element deletion keep the mesh representation unchanged and use fewer primitives to balance expression quality against data size. They can be roughly divided into three types: vertex clustering, resampling, and incremental decimation. The vertex clustering method encloses the original mesh model in a bounding box and divides the bounding box into small cuboids by equally dividing its edges [16]. After that, all the vertices in each cuboid are deleted and replaced by a single new vertex. Resampling reduces the number of facets through a local smoothing operator, as in the Instant Field-Aligned (IFA) method [17]. The incremental decimation method iteratively compares the error metrics of two adjacent points or edges to determine whether to delete or merge them, as in the Quadric Error Metrics (QEM) method [18] and the Structure-Aware Mesh Decimation (SAMD) method [19]. Li and Nan [20] implemented a feature-preserving edge collapse algorithm after bilateral filtering of the mesh, which improved the algorithm’s robustness to noise and its structure-preserving ability. Methods of this kind can usually obtain a simplified model with little loss of precision when the compression ratio is small, but they cannot achieve the optimal compression effect. If the complexity is reduced too much, the quality of the whole model degrades.
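The vertex clustering idea described above can be illustrated with a minimal sketch. It is not taken from any of the cited implementations: the function name `vertex_clustering`, the parameter `cells_per_axis`, and the choice of the mean position as the new vertex of each cuboid are assumptions made for illustration.

```python
def vertex_clustering(vertices, faces, cells_per_axis=8):
    """Merge all vertices that fall into the same cuboid of the bounding box.

    vertices: list of (x, y, z) tuples; faces: list of (i, j, k) index triples.
    """
    lower = [min(v[a] for v in vertices) for a in range(3)]
    upper = [max(v[a] for v in vertices) for a in range(3)]
    # Cuboid size per axis; fall back to 1.0 when the bounding box is flat.
    size = [(upper[a] - lower[a]) / cells_per_axis or 1.0 for a in range(3)]

    def cell_of(v):
        return tuple(
            min(int((v[a] - lower[a]) / size[a]), cells_per_axis - 1)
            for a in range(3)
        )

    cluster_of_cell = {}  # cuboid coordinates -> index of its new vertex
    sums, counts, cluster = [], [], []
    for v in vertices:
        cid = cluster_of_cell.setdefault(cell_of(v), len(sums))
        if cid == len(sums):  # first vertex seen in this cuboid
            sums.append([0.0, 0.0, 0.0])
            counts.append(0)
        for a in range(3):
            sums[cid][a] += v[a]
        counts[cid] += 1
        cluster.append(cid)

    # Assumption: the new vertex is the mean of the vertices it replaces.
    new_vertices = [tuple(s / n for s in acc) for acc, n in zip(sums, counts)]

    # Re-index faces and drop those that collapsed to an edge or a point.
    new_faces = []
    for i, j, k in faces:
        a, b, c = cluster[i], cluster[j], cluster[k]
        if a != b and b != c and a != c:
            new_faces.append((a, b, c))
    return new_vertices, new_faces
```

A coarser grid (smaller `cells_per_axis`) removes more facets at the cost of detail, which is the trade-off between expression quality and data size mentioned above.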
The methods based on polygonization use a small number of polygons to represent the main planar regions of buildings through the extraction of contour lines or planar regions and the restoration of topological relations. The Variational Shape Approximation (VSA) method [21] constructs the polygonal model by extracting and assembling proxy planes from the mesh. Kelly et al. [22] proposed the BigSUR method, which used street view images and GIS vector images to optimize the mesh model and obtained a structured model with facade details. These methods mainly maintain the edge features and surface connection relations but do not make good use of the abundant prior topological information of man-made objects. Bouzas et al. [23] proposed a Structure-Aware Building Mesh Polygonization (SABMP) strategy to generate a manifold watertight polyhedral model through segmentation, structure graph construction, and candidate face generation and optimization. This approach is not ideal in terms of ease of use and model quality. Our method addresses some of the shortcomings of SABMP and achieves better usability.
Buildings in urban regions are usually polyhedral structures, so polyhedral models are a more suitable choice for representing buildings [24,25]. A polygonal model uses fewer facets to represent the main planes of the building, removing trivial details while retaining the structural information of the building to the maximum extent. With the growing demand for smart city online updating, planning management, virtual reality, and augmented reality, the polyhedral model is bound to become a more mainstream digital city solution [26].
However, due to the influence of the data quality of the mesh model, the errors in the segmentation results are unavoidable, which affects the quality of the polygonal model and even leads to the failure of reconstruction. The polygonization process often depends on the correct setting of parameters, and different mesh models often need different parameters, so the parameters need to be fine-tuned to obtain the best modeling results. Usually, it takes a considerable amount of time to perform parameter tuning. This tedious procedure seriously affects the practicability of the polygonization algorithm.
Modern buildings are subject to many predefined rules, such as parallelism, repetition, and symmetry. It is necessary and effective to use these constraints for the topology optimization of urban buildings. Earlier topology optimization efforts assumed a Manhattan world [27]. In the past ten years, topology optimization research has focused on two aspects: topology adaptation based on predefined primitive bases and topology regularization based on point/line/plane constraints.
For aesthetic and economic reasons, urban and rural buildings often have repetitive patterns, and many scholars have tried to fit different types of buildings with a set of templates. Nardinocchi et al. [28] were the first to formally propose the model-driven reconstruction method: a base of commonly used structural primitives of urban buildings is created and then fitted to airborne LiDAR data to complete the three-dimensional reconstruction of building roof surfaces. Xiong et al. [29] used the model-driven method to decompose the roof surface of a complex building into primitives from the predefined primitive base according to the topology, then introduced adaptive constraints to reconstruct the primitives, and finally combined them into the overall model. The downside of this approach is that buildings vary so much from place to place that it is difficult to find a universal base that describes all buildings.
Because buildings are the main place of human activity, their linear and planar features are generally regular, which is a very important constraint. However, the topological completeness of features extracted directly from the mesh often cannot be guaranteed due to the presence of noise. In the PolyFit method [30], binary linear programming is used to select among the candidate faces to recover the optimal intersection relationships of the planar elements and obtain a watertight polygonal surface model. Chen et al. [31] proposed a topology-aware roof surface reconstruction method for 2.5D buildings. The boundary extracted from the Voronoi diagram was processed based on topological principles such as primitive occlusion processing, inner hole representation, and abstract processing. This topology optimization strategy optimizes the model according to specific line–plane topology rules and has better robustness. The work of this paper likewise focuses on optimizing plane extraction results that do not conform to the building topology rules.
In addition, plane extraction plays an important role in the reconstruction process [32]. For data with little noise, region growing is very efficient. In contrast, the RANSAC algorithm is slower but more robust to noise. Clustering algorithms and energy optimization algorithms can also be used for plane extraction [33,34]. Each of these algorithms has its advantages and disadvantages, and the segmentation quality depends on fine-tuning parameters such as the distance to the plane and the area of the plane region. The work of Bouzas et al. [23] uses the distance between a triangular facet and the plane to grow the region, and it deletes any plane region whose area proportion is less than a threshold value. Setting these parameters requires a lot of experience, and the appropriate thresholds vary from building to building. Fang et al. [35] proposed a plane extraction method that does not require manually setting the distance and area thresholds; instead, the thresholds for buildings with different levels of detail are learned through supervised energy optimization. The plane extraction quality of this method depends on learning from a specific training set, and it is not ideal for objects with complex structures.
To solve these issues, a convenient, efficient, and robust watertight polygonal model generation strategy is proposed in this paper, which aims to obtain the optimal model efficiently. The presented method places few requirements on the input data: a single-building mesh model obtained through either manual segmentation or semantic segmentation can serve as the data source. Even if the input contains ground or vegetation regions, or even severe noise, the processing strategy can still generate a building model with a high degree of realism. In the presented pipeline, first, the triangulated network is reconstructed with the 1-ring patch, followed by plane extraction and topology construction. Second, to remove the reliance on an area threshold in the aforementioned methods, the topological importance is used to determine the existence of planes and to correct unreasonable topological relations. Finally, binary linear programming is used to delete undesired candidate faces to obtain the polygonal model, together with an efficiency-oriented optimization of the candidate face generation.
Our work has three main contributions:
A novel type of mesh model primitive based on the 1-ring patch is introduced.
A method of repairing topology relationships using the structural information of the building is proposed.
An optimized pipeline of building mesh polygonization is designed and performs with high efficiency.
2. Methodology
As shown in Figure 1, the method in this paper mainly includes five steps: (a) converting the triangular mesh model into the 1-ring patch model, (b) extracting plane regions based on region growing of the 1-ring patch model, (c) building and optimizing the topology of the plane regions, (d) generating candidate faces, and (e) selecting candidate faces to obtain the polygonal model. The framework of generating a polygonal model from the mesh model is derived from significant improvements to SABMP. In step (a), the proposed 1-ring patch primitive replaces the original processing unit for subsequent steps, typically step (b). In step (c), several well-designed rules are introduced into the topology optimization process, such as ground extraction, the merging of coplanar plane regions, and the modification of adjacent parallel planes. Other minor improvements beyond SABMP include the use of a hierarchical index in calculating the 3-ring planarity in step (a) and the use of a divide-and-conquer split strategy in constructing the building scaffold in step (d).
2.1. The Generation of 1-Ring Patch Model
The choice of model primitive type is a critical step for mesh polygonization. As the smallest planar primitive of a mesh, the triangular facet represents the local geometric feature; thus, it is the most widely used primitive for mesh processing. The vertex position, vertex normal, and plane normal of triangular facets are important geometric quantities for filtering, segmentation, classification, and other operations on the mesh model. The fidelity of the triangular mesh is high when the mesh model is not noisy. However, the building mesh model generated from MVS contains a lot of noise in many cases, and the mesh of a building surface that should be highly planar is rugged. In this case, a single triangular facet performs poorly in characterizing the features of the local region. To strike a better balance between feature expression ability and calculation efficiency, a type of super face called the 1-ring patch is used instead of a single triangular facet as the basic primitive for geometric feature expression. Our inspiration comes from the region-growing strategy based on the planarity of the k-ring neighborhood [36]. A 1-ring neighborhood refers to the set of all vertices connected with a vertex by an edge in a triangular mesh, while a 1-ring patch refers to the super face formed by all the triangular facets containing this vertex. As shown in Figure 2a, the 1-ring patch (green region) consists of a center point (labeled in red) and its surrounding boundary points (labeled in orange), and its geometric features, which are more representative of local information, include the position, planarity, and normal of the patch. The patch model generation process is as follows:
(1) Calculate the k-ring planarity of all the vertices of the mesh model and add the vertex with the greatest planarity to the list of new center points (labeled in red).
(2) Construct the 1-ring patch of the center point. The 1-ring neighborhood points of the center point are taken as the boundary points of the patch, the centroid of the patch is calculated as its position, the normal of the center point is taken as the normal of the plane, and the planarity of the center point is taken as its planarity. Remove the center point from the list.
(3) Add the 2-ring neighborhood points (labeled in yellow and pink) of the center point to the list of new center points (labeled in yellow). Any point labeled as a boundary point (labeled in pink) during patch generation should be removed from the list.
(4) Repeat steps 2 and 3 until the entire mesh model is traversed. The result is shown in Figure 2b, where patches are labeled in different colors.
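The traversal described by the steps above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function name `build_one_ring_patches`, the dictionary-based mesh representation, and the rule of restarting from the most planar untouched vertex when the list runs empty are assumptions. Only the center and boundary bookkeeping is shown; the position, normal, and planarity of each patch are omitted.

```python
from collections import deque


def build_one_ring_patches(adjacency, planarity):
    """Group vertices into 1-ring patches.

    adjacency: dict mapping a vertex id to the set of its 1-ring neighbors.
    planarity: dict mapping a vertex id to its (precomputed) k-ring planarity.
    Returns a dict mapping each patch center to its sorted boundary points.
    """
    patches = {}      # center point -> boundary points
    boundary = set()  # every vertex already used as a boundary point

    # Assumption: seeds (and restarts) are taken in order of decreasing planarity.
    for seed in sorted(adjacency, key=lambda v: planarity[v], reverse=True):
        if seed in patches or seed in boundary:
            continue
        candidates = deque([seed])  # the "list of new center points"
        while candidates:
            center = candidates.popleft()
            # A boundary point of an existing patch cannot become a center.
            if center in patches or center in boundary:
                continue
            ring1 = set(adjacency[center])
            patches[center] = sorted(ring1)
            boundary.update(ring1)

            # 2-ring points: neighbors of the 1-ring, minus the 1-ring and the center.
            ring2 = set()
            for n in ring1:
                ring2.update(adjacency[n])
            ring2 -= ring1
            ring2.discard(center)
            for v in sorted(ring2):
                if v not in patches and v not in boundary:
                    candidates.append(v)
    return patches
```

Because a boundary point is never promoted to a center, two centers are never joined by an edge, so no triangular facet can belong to two patches, while neighboring patches may still share boundary points.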
In step 1, we adopt SABMP’s conclusion that ring neighborhoods of order 3 outperform other orders. To speed up the calculation of 3-ring planarity, we propose a strategy based on a hierarchical index. Unlike SABMP and other traditional methods, which traverse the k-ring neighborhood of each vertex directly, our method constructs an index relationship between high-order and low-order ring neighborhoods and avoids the repeated search of neighborhood points. As shown in Figure 3, the k-ring neighborhood of a vertex can be represented by the set of (k-1)-ring neighborhoods centered on the 1-ring neighborhood points of that vertex, and we can use the established hierarchical index to obtain all the k-ring neighborhood points.
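A minimal sketch of such a hierarchical index is given here, assuming the k-ring neighborhood of a vertex means all vertices reachable within k edges. The function name `k_ring_index` and the adjacency-dictionary input are hypothetical. Each order is built from the already stored lower order instead of searching outward from every vertex again.

```python
def k_ring_index(adjacency, k):
    """Return a dict mapping each vertex to its k-ring neighborhood.

    adjacency: dict mapping a vertex id to the set of its 1-ring neighbors.
    """
    # Order 0: every vertex reaches only itself.
    reach = {v: {v} for v in adjacency}
    for _ in range(k):
        # Order j of a vertex is the union of order j-1 of its 1-ring neighbors,
        # so the lower-order index is reused rather than recomputed.
        reach = {
            v: {v}.union(*(reach[u] for u in adjacency[v]))
            for v in adjacency
        }
    # The neighborhood excludes the center vertex itself.
    return {v: ring - {v} for v, ring in reach.items()}
```

With `k=3` this yields the 3-ring neighborhoods used for the planarity computation; each level costs one union per edge, independent of how large the neighborhoods have grown.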
As shown in Figure 4, the final reconstructed surface model consists of interconnected 1-ring patches and independent triangular facets that are not contained in the patches. Although it is a hybrid model represented by two types of primitives, we still call it the 1-ring patch model, both to emphasize the characteristics of this model and because the 1-ring patch primitive does occupy a dominant position compared to the independent triangular facet. Patches and triangular facets can share boundary points, and the adjacency between patches is determined by the 2-ring adjacency between their center points. By using the 1-ring patch to reconstruct the triangulated network, the building surface can be represented with fewer elements while keeping the original geometric information.
2.2. Region Growing
Restricted by the data quality of the MVS mesh, the planar structure of the building may be rugged, and the boundaries between different plane regions are difficult to distinguish. Therefore, based on the mesh model composed of 1-ring patches and triangular facets, a two-step strategy of rapid region growth is designed: region growth of patches and region growth of triangles (shown in Figure 5).
The first step of region growth of patches is selecting the patch with the greatest planarity as the seed. The growth plane is then initialized by principal component analysis of the vertices contained in the patch. To decide whether a neighboring patch joins the region, both a distance threshold and a normal direction threshold are used to improve the segmentation accuracy. If the growth into a neighboring patch fails, our method degenerates that patch into several triangular facets and recalculates, for each facet, the distance between its vertices and the growth plane and the angle between the normal of the plane and the normal of the facet. It should be noted that, because patches and triangular facets have different growth spans, different normal direction thresholds (anglepatch and angletri) are adopted, and a triangular facet cannot be used as the seed of patch region growth. The distance threshold $dis$ is a statistic of the mesh that sometimes needs to be adjusted:

$$dis = \alpha \cdot \frac{1}{N} \sum_{i=1}^{N} l_{e_i}$$

where $l_{e_i}$ is the length of the edge $e_i$ of the triangulated network, $N$ is the number of edges in the triangulated network, and $\alpha$ is the adjustment coefficient, which is 1 by default.
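One plausible reading of the definitions above is that the distance threshold is the mean edge length of the triangulated network scaled by the adjustment coefficient. The sketch below assumes that reading; the function name `distance_threshold` and the argument names are hypothetical.

```python
import math


def distance_threshold(vertices, faces, alpha=1.0):
    """Mean edge length of a triangle mesh, scaled by the coefficient alpha.

    vertices: list of (x, y, z) tuples; faces: list of (i, j, k) index triples.
    """
    # Collect each undirected edge once, even when two facets share it.
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))

    total_length = sum(math.dist(vertices[u], vertices[v]) for u, v in edges)
    return alpha * total_length / len(edges)
```

Tying the threshold to the edge length makes it scale with the mesh resolution, so the default coefficient of 1 only needs to be changed for unusually noisy or unusually clean meshes.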
After all the patches have been traversed, many patches that did not join any plane region have degenerated into triangles. These facets generally appear as slender strips or debris, but they cannot be ignored because they matter for expressing the structure, so it is necessary to carry out a secondary growth of triangular facets. A seed facet is randomly selected from these facets, and the growth is carried out according to a distance threshold and a normal direction threshold until all the degenerated facets have been added to plane regions, each of which must contain enough faces (size threshold = 10).
The growth of patches is the main step of mesh segmentation, and the growth of triangular facets is a necessary supplement to it. After the two-step growth, the main plane regions of the building are extracted as sets of patches, and the connecting regions between the planes are precisely distinguished in the form of triangles. It should be noted that, to balance the accuracy and efficiency of segmentation, the triangular facets that did not degenerate from patches are never used as initial seeds, although they may be added to a plane region during triangle growth.
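The two tests used during growth, plane initialization by principal component analysis and the combined distance and normal direction check, can be sketched as follows. The sketch uses NumPy; the function names `fit_plane_pca` and `can_join` and the use of a reference normal to orient the fitted plane are assumptions, not details taken from the paper.

```python
import numpy as np


def fit_plane_pca(points, reference_normal):
    """Fit a plane to 3D points and return (centroid, unit normal).

    The normal is the direction of least variance; its sign is chosen to agree
    with reference_normal (for example the normal of the seed patch).
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    if np.dot(normal, reference_normal) < 0:
        normal = -normal
    return centroid, normal


def can_join(points, normal, plane_point, plane_normal,
             dist_threshold, angle_threshold_deg):
    """Check whether a primitive (patch or facet) may join the growing plane."""
    points = np.asarray(points, dtype=float)
    normal = np.asarray(normal, dtype=float)
    # Largest point-to-plane distance of the primitive's vertices.
    distances = np.abs((points - plane_point) @ plane_normal)
    # Angle between the primitive normal and the (directed) plane normal.
    cosine = np.dot(normal, plane_normal) / (
        np.linalg.norm(normal) * np.linalg.norm(plane_normal)
    )
    angle = np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))
    return bool(distances.max() <= dist_threshold and angle <= angle_threshold_deg)
```

In this reading, the same `can_join` test would be called with anglepatch for patches and with angletri for the triangular facets of a degenerated patch.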
2.3. Topological Optimization
Since the neighborhood relations of 1-ring patches have been established during the remodeling process, the topology graph of the building plane structure can be easily constructed according to the plane region extracted from the regional growth. Since the mesh may be noisy and the topological relations of all the building planes cannot be guaranteed to be fully correct, we need to optimize the initial topology further. Topology optimization includes the deletion of redundant planes, the addition of missing planes, and the modification of the adjacency relation, which is mainly divided into three stages: (1) ground extraction, (2) the merging of coplanar plane regions, and (3) the modification of adjacent parallel plane regions.
2.3.1. Ground Extraction
Since our work focuses on the reconstruction of a single building model, extracting a single building from a large scene is beyond the scope of this paper. Much research has been done on building extraction, but it is still difficult to extract clean buildings in many cases [37,38,39]. In addition, the polygonization algorithm requires as input a closed model or a model surrounded by the ground to ensure that the result is two-manifold. To make our approach more versatile, a step that clusters the regions around buildings into a ground plane region is specifically designed for our polygonization approach.
In our research scenario, the whole mesh can be simply summarized into three main types: the ground, the building façade, and the building roof surface. The building façades are generally perpendicular to the ground, and in some cases, they will also be at a certain degree of an angle with the vertical direction. Roof surfaces of the building are generally divided into two categories, horizontal and oblique. In this paper, we focused on the common architectural forms, and some relatively unique architectural designs were not in our consideration. Therefore, according to the angle between the plane and the horizontal plane, they can be roughly divided into two parts, building façades and horizontal planes. Moreover, the candidate ground planes are extracted from the horizontal planes, according to their elevation difference with the centroid of the adjacent building façade (less than disground).
The candidate ground plane with the lowest elevation is taken as the seed plane, and other non-building planes are extracted through a standard region-growing process. In this paper, the planes that need to be taken as the ground plane can be divided into three main categories, and corresponding different strategies are designed to merge them with the seed plane:
Candidate ground plane. If the height difference between the seed plane and the candidate ground plane is less than the common floor height (hf), it will be labeled as the ground and merged with the seed plane to update its elevation.
Non-façade neighborhood plane. A normal direction threshold anglenonbd is used to determine whether the neighborhood plane was a non-façade plane. The non-façade plane region with a limited area (area < areanonbd) should be included in the seed plane. Since these planes are generally non-ground features, they do not participate in the update of ground elevation during the growing process.
Sub-ground plane. Planes that are lower than the seed plane or less than half a floor height (hf) above the ground are merged with the seed plane. These planes also update the neighborhood of the ground plane without recalculating the ground elevation.
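The three merging rules above can be summarized in a small decision function. This is a sketch under stated assumptions: the `Plane` record, the function names, the order in which the rules are tested, and the interpretation of anglenonbd as a limit on the angle between the plane normal and the vertical axis are all hypothetical, and no threshold values are implied.

```python
import math
from dataclasses import dataclass


@dataclass
class Plane:
    elevation: float            # elevation of the plane region
    normal: tuple               # (nx, ny, nz), not necessarily unit length
    area: float
    is_candidate_ground: bool   # result of the earlier candidate extraction


def tilt_from_vertical_deg(normal):
    """Angle between the plane normal and the vertical axis, in degrees."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return math.degrees(math.acos(min(1.0, abs(nz) / length)))


def ground_merge_decision(plane, seed_elevation, h_f, angle_nonbd, area_nonbd):
    """Return (merge_with_seed, update_ground_elevation) for one neighbor plane."""
    dz = plane.elevation - seed_elevation
    # Candidate ground plane: within one floor height of the seed plane.
    if plane.is_candidate_ground and abs(dz) < h_f:
        return True, True
    # Non-facade neighborhood plane of limited area: merged, elevation unchanged.
    if tilt_from_vertical_deg(plane.normal) < angle_nonbd and plane.area < area_nonbd:
        return True, False
    # Sub-ground plane: below the seed, or less than half a floor height above it.
    if dz < 0.5 * h_f:
        return True, False
    return False, False
```

Only the first rule updates the ground elevation, which matches the statement that non-ground features and sub-ground planes extend the ground region without changing its height.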
As shown in Figure 6, the ground region is extracted, and the number of plane regions to be polygonized decreases from 264 to 190, which could make the subsequent optimization simpler and more efficient.
2.3.2. The Merging of Coplanar Plane Regions
Since region growing is performed according to adjacency and manually set thresholds, some extracted planes are inevitably coplanar but not merged. The generation of the polyhedron model easily fails in this case, so regions that are supposed to lie in the same plane should be merged. SABMP [23] traverses all plane regions and merges approximately parallel plane regions that are separated by a small spatial distance into a new plane region, and the entire traversal is repeated until no new region is created. To improve the processing efficiency, a “merging the biggest” strategy is adopted so that the entire traversal only needs to be executed once. The process is executed as follows:
- (a) Sort the plane regions according to the number of patches, choosing the biggest plane region as the seed region to annex other regions.
- (b) Traverse all the regions smaller than the seed region and calculate the angle and distance between the two plane regions. The angle threshold anglemerge and distance threshold dismerge are used to determine whether a pair of plane regions are coplanar and should be merged.
- (c) For a pair of coplanar regions, merge the smaller region into the seed region; otherwise, choose the next largest plane region as the seed region.
- (d) Repeat steps (b) and (c) until all the regions are traversed.
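Steps (a) to (d) can be sketched as a single pass over the regions sorted by size. The region representation (a dictionary with a unit `normal`, a `point` on the plane, and a list of `patches`), the function names, and the use of a point-to-plane distance as the distance between two plane regions are assumptions made for illustration; the seed plane parameters are also kept fixed after merging, which the paper does not specify.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_coplanar(seed, other, angle_merge_deg, dis_merge):
    """Compare two directed planes by normal angle and point-to-plane distance."""
    cosine = max(-1.0, min(1.0, _dot(seed["normal"], other["normal"])))
    angle = math.degrees(math.acos(cosine))
    offset = [o - s for o, s in zip(other["point"], seed["point"])]
    distance = abs(_dot(offset, seed["normal"]))
    return angle <= angle_merge_deg and distance <= dis_merge


def merge_biggest(regions, angle_merge_deg, dis_merge):
    """Merge coplanar regions into the largest one in a single traversal."""
    # (a) Sort by patch count so that the biggest region is the first seed.
    order = sorted(regions, key=lambda r: len(r["patches"]), reverse=True)
    absorbed = [False] * len(order)
    merged = []
    for i, region in enumerate(order):
        if absorbed[i]:
            continue
        seed = dict(region, patches=list(region["patches"]))
        # (b), (c) Annex every smaller region that is coplanar with the seed.
        for j in range(i + 1, len(order)):
            if not absorbed[j] and is_coplanar(seed, order[j], angle_merge_deg, dis_merge):
                seed["patches"].extend(order[j]["patches"])
                absorbed[j] = True
        merged.append(seed)
        # (d) The next unabsorbed region in size order becomes the new seed.
    return merged
```

Because every region is either a seed or absorbed exactly once, the traversal does not need to be restarted after a merge, which is the efficiency gain claimed over repeating the whole traversal.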
2.3.3. The Modification of Adjacent Parallel Planes
Since each plane obtained by our region growing is a directed plane, that is, its normal always points to the outside of the model, it is necessary to optimize the topological relations according to the orientation of the normals of parallel planes. For adjacent parallel planes with the same normal direction, our method deletes their edge in the topology graph. Adjacent parallel planes with opposite normal vectors generally belong to long and narrow independent structures on the building, such as parapets, billboards, and fences. They occupy a small proportion of the space but are very important for restoring the real appearance of the building.
In a normal topology, there should be no adjacency between parallel planes. However, for parallel planes separated by a small spatial distance, region growing can hardly guarantee that no adjacency relationship arises between their patches or triangular facets. When such planes are not coplanar, their topological relations must be modified to avoid structural degradation of the polyhedron model. In this case, an algorithm is proposed to correct the topological relationship by rebuilding the missing connection planes between parallel planes.
First, we selected the interconnected patches in the two parallel planes for connectivity analysis, divided them into several independent regions, and chose the region with the largest number of patches as the candidate region. Then, we selected the plane region with the smaller number of patches from the two parallel plane regions and calculated its bounding box. The patches of the candidate region were sorted from high to low, and a number of patches determined by the bounding box ranges in the x and y directions were selected successively as the face elements of the connection plane. After the construction of the horizontal connection plane, its adjacency relationship is added to the topology graph.
The vertical connection plane may also need to be rebuilt depending on the neighborhood of the horizontal connection plane. In addition to the two parallel planes, the adjacency of the horizontal connection plane may also include the neighborhood of the two parallel planes. If the normal vector of a neighborhood plane is horizontal and the plane is not inside the bounding box of the parallel planes, it is also considered to be a neighborhood of the horizontal connection plane. The number of vertical connection planes to be constructed depends on the number of neighborhoods of the horizontal connection plane. As for the horizontal connection plane, a number of patches determined by the bounding box range in the height direction are selected in order along the horizontal direction to construct the vertical connection plane. The normal vector of the vertical connection plane is equal to the cross product of the normal vector of the horizontal connection plane and that of one of the two parallel planes. In addition to the two parallel planes and the horizontal connection plane, its neighborhood should include the adjacent horizontal plane of the smaller parallel plane. As shown in Figure 7, the direct connection of parallel planes can be avoided after the rebuilding of the connection plane. At the current stage, our work does not deliberately pursue the authenticity of the segmentation results, but the rationality of the topology structure is guaranteed.
In addition to the above important optimization measures, our work also performed other topology optimization work, such as deleting the region with less than three adjacent planes that cannot build a normal polygonal model.
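The deletion of regions with fewer than three adjacent planes can be sketched as a pruning of the topology graph. Whether the deletion is applied once or repeated until no such region remains is not stated; the sketch below assumes it is repeated, since removing a region lowers the number of adjacent planes of its neighbors. The function name `prune_low_degree` is hypothetical.

```python
def prune_low_degree(graph, min_degree=3):
    """Remove plane regions with fewer than min_degree adjacent planes.

    graph: dict mapping a plane id to the set of adjacent plane ids.
    Returns a new graph; the input is left unchanged.
    """
    graph = {v: set(neighbors) for v, neighbors in graph.items()}
    changed = True
    while changed:
        changed = False
        weak = [v for v, neighbors in graph.items() if len(neighbors) < min_degree]
        for v in weak:
            # Detach the region from every plane that is still present.
            for u in graph.pop(v):
                if u in graph:
                    graph[u].discard(v)
            changed = True
    return graph
```

A plane with fewer than three neighbors cannot be bounded by corners, each of which needs three intersecting support planes, which is why such regions cannot contribute to a closed polygonal model.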
2.4. Generation of Polyhedron Model
After the topology optimization, the number of nodes and edges of the topology graph is reduced considerably, and the topology quality is improved. In the next step, taking the optimized topology graph as input, a polygonization process of improved efficiency is implemented that takes into account the independence between plane regions. The building scaffold refers to the connection graph made up of the plane edges formed by the intersection of two support planes and the corners formed by the intersection of three support planes [19]. According to the adjacency relations of the plane regions in the topology graph, it is easy to calculate the intersecting edges and points. However, because some adjacency relations cannot be restored correctly in the topology graph, some corners cannot be found. To obtain the correct building scaffold, SABMP traverses all line segments and checks the coplanarity and intersection of every pair. For edges that meet the intersection conditions but lack intersection points, the line segments are split at their intersection points, and the intersection points are added as new corners. Afterwards, all line segments are traversed again. The schematic diagram is shown in Figure 8.
A line segment has more than one support plane, and a line segment that intersects it lies on one of these support planes. In Figure 8a, the blue face is the support plane of the black segment, the green one is the support plane of the gray segment, and the orange one is the common support plane of the pair of line segments. Splitting a line segment therefore only splits the plane region in which the line segment lies and does not affect the line segments of other plane regions.
Therefore, a divide-and-conquer split strategy was adopted for each plane region. In the first step, this strategy established a subordinate relationship between the plane region and the line segment and then carried out a relatively independent split process for each plane region. In addition, all the checked line segment pairs were marked so that they were not checked again. The pseudo-code of our strategy is shown in Algorithm 1. Under our improvement, the time complexity of the algorithm, which is a function of the number of splits, the number of line segments, and the number of plane regions, was reduced.
Algorithm 1. Refine edges.
Input: The set of edges of each plane
Output: The refined edges and the final corners
1 while … do
2   …
3   for … do
4     for … do
5       for … do
6         if … then
7           …
8           if … then
9             …
10            …
11            if …
12            then
13              …
14              …
15              split …, split …
16              …
17              …
18              go to line 1
19            end if
20          end if
21        end if
22      end for
23    end for
24  end for
25 end while
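The divide-and-conquer split described in the text can be illustrated with the following sketch, which is a reconstruction from the prose rather than a transcription of Algorithm 1. Each plane keeps its own list of line segments, pairs that were already checked are marked, and the scan restarts after every split. All function names, the tolerance `TOL`, and the representation of a segment as a pair of 3D points are assumptions.

```python
from itertools import combinations

TOL = 1e-9  # squared-distance tolerance (an illustrative value)


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _close(a, b):
    d = _sub(a, b)
    return _dot(d, d) <= TOL


def intersect(seg_a, seg_b):
    """Return the intersection point of two 3D segments, or None."""
    (p1, p2), (q1, q2) = seg_a, seg_b
    d1, d2, r = _sub(p2, p1), _sub(q2, q1), _sub(q1, p1)
    n = _cross(d1, d2)
    denom = _dot(n, n)
    if denom <= TOL:  # parallel or degenerate segments
        return None
    t = _dot(_cross(r, d2), n) / denom
    u = _dot(_cross(r, d1), n) / denom
    if not (-1e-9 <= t <= 1 + 1e-9 and -1e-9 <= u <= 1 + 1e-9):
        return None
    point = tuple(p1[i] + t * d1[i] for i in range(3))
    other = tuple(q1[i] + u * d2[i] for i in range(3))
    return point if _close(point, other) else None  # None for skew lines


def refine_plane(segments):
    """Split the segments of one plane region at their mutual intersections."""
    segs = {i: (tuple(map(float, a)), tuple(map(float, b)))
            for i, (a, b) in enumerate(segments)}
    next_id, checked, corners = len(segs), set(), []
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(segs), 2):
            if (a, b) in checked:  # marked pairs are never tested twice
                continue
            checked.add((a, b))
            x = intersect(segs[a], segs[b])
            if x is None:
                continue
            for sid in (a, b):
                p, q = segs[sid]
                if not _close(x, p) and not _close(x, q):  # interior point
                    del segs[sid]
                    segs[next_id], segs[next_id + 1] = (p, x), (x, q)
                    next_id += 2
                    changed = True
            if changed:
                corners.append(x)  # new corner; restart the scan
                break
    return list(segs.values()), corners


def refine_edges(plane_edges):
    """Process every plane region independently (divide and conquer)."""
    return {plane: refine_plane(segs) for plane, segs in plane_edges.items()}
```

Because each plane region only scans its own segments, a split in one region never forces the segments of another region to be rechecked, which is where the saving over a global traversal comes from.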
By projecting the building scaffold into the planar regions, we could obtain the candidate faces for the simplified mesh, as shown in Figure 8e. Considering the influence of false segmentation and the requirement of watertightness, a further step that determines whether a candidate face should be preserved was adopted using the method in SABMP.
After the optimization process, the simplification result of the model could be obtained, and the watertight and manifold characteristics of the polyhedron model were ensured.