1. Introduction
Three-dimensional (3D) façade models of urban buildings play a crucial role in many fields, including urban planning, solar radiation calculation, noise emission simulation, virtual reality, sustainable development research, and disaster simulation [1,2,3]. The automatic reconstruction of building façades has long been a significant research topic in photogrammetry and remote sensing, as well as in computer vision and computer graphics; nevertheless, due to the intricacy of urban scenes, the automatic reconstruction of urban building façades remains a challenging task.
In past decades, a number of researchers have worked on the (semi-)automatic reconstruction of façade models for generating LoD3 building models [4]. Images and LiDAR (Light Detection and Ranging) point clouds are two common data sources used for façade model reconstruction, and several methodologies aiming at the automatic reconstruction of 3D façade models have been established. Xiao et al. [5] proposed a semi-automatic method to generate façade models along a street from multi-view street images. For this purpose, an ortho-rectified image was first decomposed and structured into a directed acyclic graph of rectilinear elementary patches by considering architectural bilateral symmetry and repetitive patterns; each patch was then enhanced with depth from point clouds derived from structure-from-motion. Müller et al. [6] suggested an image-based façade reconstruction approach that uses an image analysis algorithm to divide the façade into meaningful segments and combines it with the procedural modeling pipeline of shape grammars to ensure the regularity of the final reconstructed model. Sadeghi et al. [7] presented a grammar-based approach for façade reconstruction from hand-held laser scanner data. The method starts by using the RANSAC method to extract façade points; protrusion, indentation, and wall points are then detected using a density histogram, and façade elements are finally modeled by applying a set of rules. Edum-Fotwe et al. [8] proposed a façade reconstruction method from LiDAR data; the algorithm employs a top-down strategy to split the point cloud into surface-element rails in a signed distance field and then completes the façade model reconstruction. Pu and Vosselman [9] contributed an approach that integrates terrestrial laser points and images for façade reconstruction: the general structure of the building façade is first derived from the planes in the LiDAR point cloud, and the line features in the images are then employed to refine the model and generate textures. These methods obtained promising results, but they rely on images or point clouds acquired from the ground. However, the lower part of a façade is commonly occluded by various types of vegetation, street signs, cars, and pedestrians, so the obtained point clouds usually suffer from a large amount of missing data [10]. This issue may hinder the reconstruction of building façades. It is also worth mentioning that Terrestrial Laser Scanning (TLS) often acquires data only on the street-facing side of buildings; data for the other façades cannot be readily obtained, making it hard to establish a comprehensive building façade model.
With the development of Unmanned Aerial Vehicles (UAVs) and aerial oblique photogrammetry, it is possible to obtain high-resolution façade images from a UAV equipped with an oblique camera system; a multi-view dense matching methodology is then applied to reconstruct and update the 3D models of urban buildings, expressed as photogrammetric mesh models. The automatically reconstructed model often contains millions of triangles, which imposes an onerous burden on storage, web transfer, and visualization. Moreover, due to occlusion, repetitive textures, and transparent objects, there are also defects in the automatically generated mesh model that degrade its visual quality. Hence, mesh editors such as DP-Modeler and OSketch [11,12] have been developed to improve the mesh model through manual work. In this context, the main objective of this research is to develop a method to reconstruct a regular façade model from the photogrammetric mesh model such that the structure of a single building is preserved. The reconstructed model can potentially be employed for visual navigation, online visualization, solar energy estimation, etc.
Current façade modeling methods can be generally categorized into two major types: data-driven methods [13,14,15,16] and model-driven methods [17,18,19]. Several data-driven approaches have been proposed to reconstruct façade models from Airborne Laser Scanning (ALS) data [20]. In these approaches, the reconstruction is completed by vertically extruding the roof outline to the ground; thereby, the key problem is generating the roof outline, which can be realized by edge-based methods [21], data clustering [22], region growing [23], model fitting [24], etc. Edge-based methodologies are susceptible to outliers and incomplete edges. Data clustering relies on the predefined number of classes and the cluster centers. Region-growing approaches are usually influenced by the selection of seed points. Model fitting is often implemented with the RANSAC method, which frequently produces unwanted false planes. Additionally, the accuracy of a façade model reconstructed from the roof boundary is susceptible to eaves. Wu et al. [25] proposed a graph-based method to reconstruct urban building models from ALS data. The method is based on a hierarchical analysis of contours to obtain the building structure; a bipartite graph matching method is then employed to establish the correspondence between consecutive contours for subsequent surface modeling. The final model heavily relies on the quality of the contours: if the point cloud contains noise or artifacts, as the photogrammetric mesh model does, the matching and surface modeling process of Ref. [25] would degrade the quality of the final model. Thus, it cannot be directly adapted to the photogrammetric mesh models studied here. For data-driven methods based on ground data, regularities such as symmetry are often detected in the source data and then exploited to regularize the final model.
Façades usually exhibit strong structural regularities, such as piecewise planar segments, parallelism, and orthogonality of lines. Generally, model-driven methods employ this prior information about the façade structure to constrain façade modeling. Nan et al. [26,27,28,29] generated building details by automatically fitting 3D templates onto coarse, textured building models; the 3D templates were produced through a semi-automatic procedure using a template construction tool. Boxes have also been fitted directly to imperfect point clouds based on the Manhattan-World hypothesis, with the best subset then selected to achieve the reconstruction. Lafarge et al. [17] proposed an urban building reconstruction method by detecting and optimizing 3D blocks on a Digital Surface Model (DSM).
Since mesh models derived from aerial oblique images often contain noise, a model-driven approach is proposed herein. The façade of the building under study is assumed to be composed of several cuboids. The photogrammetric mesh model is iteratively divided into components from bottom to top by the segmented contour groups. Subsequently, each component is fitted by several levels of cuboids, which yields the final façade model.
The organization of the paper is as follows: In Section 2, the proposed method for façade modeling is described in detail. In Section 3, the performance of the proposed method is evaluated on a scene of a photogrammetric mesh model. In Section 4, some discussions are provided. Finally, the main conclusions are given in Section 5.
2. Methods
2.1. Overview of the Approach
Generally, a given scene of a photogrammetric mesh model can be classified into façade mesh models of individual buildings and other objects. The main goal of the proposed method is to automatically produce a regular 3D building façade from the photogrammetric façade mesh model (hereafter referred to as the photogrammetric façade mesh). The workflow of the proposed approach is displayed in Figure 1. It mainly includes the following three parts:
(1) Firstly, the photogrammetric mesh model is decomposed into components based on contour lines. Closed contours on the triangulated irregular network are tracked, and local contour trees are exploited to find the segmented contour groups by analyzing the topological relationships between the contours of the photogrammetric mesh model. Subsequently, the model is segmented from bottom to top into various components through an iterative process.
(2) The components of the photogrammetric mesh model are iteratively approximated by minimum circumscribed cuboids.
(3) The parameters of the cuboid model are adjusted by means of a least squares algorithm to ensure the accuracy of the façade model.
2.2. Component Decomposition Based on Contours Analysis
The building façade is assumed to be composed of several cuboids; hence, the first step is to recognize each façade component that can be abstracted by a cuboid. To this end, the photogrammetric mesh model is divided into parts by analyzing the topological relationships of the contours, and each component is then reconstructed separately. Generally, the photogrammetric mesh model is segmented from bottom to top by the segmented contour pairs.
2.2.1. Segmented Contour Pair Generation
If point clouds are used, as in Ref. [25], a linear Triangulated Irregular Network (TIN) interpolation must first be performed to obtain the TIN. In contrast, the photogrammetric mesh model in the present study is already represented by a continuous TIN, so contour line tracking is performed directly on the TIN of the original data, avoiding the loss of accuracy caused by data interpolation [30]. For contour line tracking, the initial elevation Z is set to the lowest elevation of the photogrammetric mesh model of each building, while the contour interval D is set according to the vertical accuracy of the photogrammetric mesh model. Subsequently, each contour line is tracked. In general, two types of contours arise, open and closed; only closed contours are retained for subsequent processing.
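For illustration, one possible realization of this tracking step is sketched below in Python with NumPy; the function names, rounding precision, and mesh layout (an n-by-3 vertex array and an m-by-3 face index array) are assumptions made for the sketch rather than the authors' implementation. Each triangle is intersected with a horizontal plane at the current elevation, and the resulting segments are chained, retaining only closed loops.

import numpy as np
from collections import defaultdict

def contour_segments(vertices, faces, z_level):
    # Intersect every triangle with the horizontal plane z = z_level and return
    # the resulting 2D segments as pairs of rounded (x, y) tuples.
    segments = []
    for tri in faces:
        p = vertices[tri]                              # 3 x 3 corner coordinates
        below = p[:, 2] < z_level
        if below.all() or not below.any():
            continue                                   # triangle does not cross the plane
        pts = []
        for i in range(3):
            a, b = p[i], p[(i + 1) % 3]
            if (a[2] < z_level) != (b[2] < z_level):
                t = (z_level - a[2]) / (b[2] - a[2])
                pts.append(tuple(np.round(a[:2] + t * (b[:2] - a[:2]), 4)))
        if len(pts) == 2 and pts[0] != pts[1]:
            segments.append((pts[0], pts[1]))
    return segments

def closed_contours(segments):
    # Chain segments that share endpoints into polylines; keep only closed loops.
    neighbours = defaultdict(set)
    for a, b in segments:
        neighbours[a].add(b)
        neighbours[b].add(a)
    visited, loops = set(), []
    for start in neighbours:
        if start in visited:
            continue
        loop, prev, cur = [start], None, start
        while True:
            visited.add(cur)
            nxt = [n for n in neighbours[cur] if n != prev]
            if not nxt:
                break                                  # open contour: discarded
            prev, cur = cur, nxt[0]
            if cur == start:
                loops.append(loop)                     # closed contour retained
                break
            if cur in visited:
                break                                  # guard against degenerate branching
            loop.append(cur)
    return loops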
After producing the contour lines, a building can be represented by contour clusters that are abstracted by cuboids. To split the contour lines into separate parts, the contours are organized into a graph-based localized contour tree [25,31]. The tree consists of a root node, several internal nodes (branches), and several terminal nodes (leaves). Each closed contour is represented as a node, while the relationship between contours is denoted by an edge between nodes in the tree structure.
The local contour trees are constructed from bottom to top based on the contour elevations. For instance, consider the complex building demonstrated in Figure 2a. The local contour tree (Figure 2b) is initialized with contour A1, which has the lowest elevation, as the root node. Then, the adjacent contour A2 is identified and added as the child node of contour A1. These steps are iterated until the highest contour B6 is included. During this process, when n (n > 1) contours are met at a given height level, n branches are constructed. Figure 2a shows that there are two contours (B1 and C1) at the fourth height level; thus, two subtrees are generated from A3. In these trees, only contours whose topological relations have not changed belong to the same structure, and such contours are represented by a subtree in the contour tree. Finally, the contour tree illustrated in Figure 2b is obtained, where parts of the same color indicate the same structure of the photogrammetric mesh model. Node A3 has two sub-nodes, B1 and C1, and node C3 has a sub-node D1, indicating a separation relationship in the topological representation. After producing the contour tree, the segmented contour pairs are obtained between subtrees. Therefore, the segmented contour pairs of the photogrammetric mesh model in Figure 2a are A3–B1, A3–C1, and C3–D1.
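A minimal sketch of the tree construction and pair extraction is given below, assuming the shapely library for polygon containment and area tests; the area-ratio test is an assumed stand-in for the subtree-separation criterion described above and is not taken from the paper.

from shapely.geometry import Polygon  # assumed dependency for containment/area tests

class ContourNode:
    def __init__(self, polygon, elevation):
        self.polygon = polygon            # closed contour as a Polygon
        self.elevation = elevation
        self.children = []

def build_local_contour_tree(contours):
    # contours: list of (elevation, Polygon) tracked bottom-to-top. Every contour
    # becomes the child of the contour enclosing it on the level directly below.
    levels = sorted({z for z, _ in contours})
    nodes = {z: [ContourNode(p, z) for zz, p in contours if zz == z] for z in levels}
    for low, high in zip(levels, levels[1:]):
        for child in nodes[high]:
            for parent in nodes[low]:
                if parent.polygon.contains(child.polygon.representative_point()):
                    parent.children.append(child)
                    break
    return nodes[levels[0]]               # root node(s) at the lowest elevation

def segmented_contour_pairs(roots, area_ratio=0.8):
    # Collect parent-child pairs marking a separation: the parent splits into
    # several children, or the child's footprint shrinks sharply (the area_ratio
    # threshold is an assumed substitute for the paper's subtree criterion).
    pairs, stack = [], list(roots)
    while stack:
        node = stack.pop()
        for child in node.children:
            if len(node.children) > 1 or child.polygon.area < area_ratio * node.polygon.area:
                pairs.append((node, child))
            stack.append(child)
    return pairs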
2.2.2. Decomposition of Components
After generating the contour trees, the photogrammetric mesh model is subdivided into mesh clusters based on the obtained segmented contour pairs. For the local contour tree shown in Figure 2b, the lowest-elevation contour pair A3–B1 (Figure 3a) is first exploited to remove the triangles located between contours A3 and B1. The remaining triangles are then clustered into three components of the photogrammetric mesh model. As demonstrated in Figure 3a, the gray part of the model, which is lower than the elevation of the segmented contour pair A3–B1, is successfully segmented. Thereafter, the components of the photogrammetric mesh model that are higher than the elevation of the segmented contour pair A3–B1 are further subdivided. Since the lower contour of the next segmented contour pair with the lowest elevation, A3–C1, is the same as that of A3, this segmented contour pair (A3–C1) is skipped. Subsequently, the remaining cluster is subdivided by the next segmented contour pair C3–D1, and the yellow component of the photogrammetric mesh model (see Figure 3a) is successfully segmented. This process is repeated until no segmented contour group remains, at which point the original photogrammetric mesh model has been subdivided into basic components. The final results are illustrated in Figure 3a.
During component decomposition, the triangles between different trees are removed, resulting in gaps between the subsequently generated models (e.g., the gap between A3 and C1 in Figure 3a). To resolve this issue, the elevations of the points in each photogrammetric mesh model component that are closest to the segmented contour pair are adjusted to the average elevation of the segmented contour pair. The photogrammetric mesh model components after this point modification are presented in Figure 3b.
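A simplified sketch of this decomposition and gap-closing step follows; the centroid-based triangle test and the snapping tolerance are assumptions made for the illustration.

import numpy as np
from collections import deque

def decompose_by_pair(vertices, faces, z_low, z_high):
    # Remove triangles whose centroids lie between the elevations of a segmented
    # contour pair and label the remaining triangles into edge-connected components.
    cz = vertices[faces][:, :, 2].mean(axis=1)          # triangle centroid elevations
    keep = np.where((cz < z_low) | (cz > z_high))[0]
    edge_faces = {}
    for f in keep:                                      # adjacency over shared edges
        a, b, c = faces[f]
        for e in ((a, b), (b, c), (c, a)):
            edge_faces.setdefault(tuple(sorted(e)), []).append(f)
    labels, comp = {}, 0
    for seed in keep:
        if seed in labels:
            continue
        labels[seed] = comp
        queue = deque([seed])
        while queue:
            f = queue.popleft()
            a, b, c = faces[f]
            for e in ((a, b), (b, c), (c, a)):
                for nb in edge_faces[tuple(sorted(e))]:
                    if nb not in labels:
                        labels[nb] = comp
                        queue.append(nb)
        comp += 1
    return labels                                       # face index -> component id

def close_gap(vertices, z_low, z_high, tol=0.2):
    # Snap vertices lying near the removed band onto the mean elevation of the
    # segmented contour pair so adjacent components meet (tol is assumed).
    z_mean = 0.5 * (z_low + z_high)
    near = (np.abs(vertices[:, 2] - z_low) < tol) | (np.abs(vertices[:, 2] - z_high) < tol)
    vertices[near, 2] = z_mean
    return vertices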
2.3. Cuboid Abstraction
After decomposing the photogrammetric mesh model into separate components, a set of cuboids is exploited to fit each component. First, a region growing method is applied to the current component mesh to produce super-facets. Then, a least squares algorithm is used to fit the normal vector of the largest super-facet. After that, the coordinate axes are aligned with the calculated normal vector, and the coordinates of the mesh vertexes are centralized to simplify the subsequent iterative minimum circumscribed cuboid fitting.
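The normal fitting and coordinate alignment can be sketched as follows; the region growing to super-facets is omitted, and the SVD-based plane fit is one common way to realize the least squares fit, used here as an assumption.

import numpy as np

def plane_normal_lsq(points):
    # Least squares plane normal of a point set: the singular vector associated
    # with the smallest singular value of the centered coordinates.
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]

def centralize_and_align(points, normal):
    # Centralize the vertex coordinates and rotate them so that the horizontal
    # direction of the façade normal becomes the x-axis; the later minimum
    # circumscribed cuboid then becomes axis-aligned.
    n = normal / np.linalg.norm(normal)
    z = np.array([0.0, 0.0, 1.0])
    x = n - np.dot(n, z) * z                 # horizontal component of the normal
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    rot = np.vstack([x, y, z])               # rows: new orthonormal basis
    return (points - points.mean(axis=0)) @ rot.T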
To reconstruct a complex building model, the mesh model components are abstracted into several levels of minimum circumscribed cuboids. The cuboid abstraction is performed iteratively, with the corresponding workflow shown in Figure 4.
The abstraction starts by popping one component from the set of separated components. If no components are left, the abstraction result is exported for the following processing; otherwise, an iterative robust cuboid fitting process is performed. For the current component, the first-level circumscribed cuboid is fitted to the outermost extent of the component. Since there may be some noise in the original photogrammetric mesh model, if all points are used to fit the façade model, the façade parameters may be biased. A robust fitting strategy is therefore proposed to eliminate possible noise points. Firstly, the distance between each point and the closest plane of the fitted cuboid is calculated. When the distance is larger than a given threshold Td (experimentally set to 0.2 m in the present study), the points are removed, and the remaining ones are used to fit a new plane for the corresponding side of the cuboid. Taking Figure 5 as an example, it can be seen in Figure 5a (the top view of the model component) that there is a protuberance on the north side. The original cuboid, shown as the green rectangle, does not fit the point cloud well. After removing the possible noise part, a new cuboid is fitted, marked in yellow in Figure 5b, and the fitted model snaps to the point cloud well.
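One possible concretization of this robust side fitting in the transformed (axis-aligned) frame is sketched below; the sliding-band criterion and the min_support fraction are assumptions, while Td = 0.2 m follows the text.

import numpy as np

def robust_upper_bound(coord, td=0.2, min_support=0.05):
    # Robust position of one cuboid side along an axis: slide a band of width td
    # inward from the outermost value until it captures at least min_support of
    # the points, so a small protuberance no longer drags the side outward.
    order = np.sort(coord)[::-1]                 # candidate positions, outermost first
    n = len(order)
    for bound in order:
        inside = np.count_nonzero((coord > bound - td) & (coord <= bound))
        if inside >= min_support * n:
            return bound
    return order[0]

def robust_rectangle(points_xy, td=0.2):
    # Robust axis-aligned rectangle (top view of the circumscribed cuboid).
    x, y = points_xy[:, 0], points_xy[:, 1]
    return (-robust_upper_bound(-x, td), robust_upper_bound(x, td),
            -robust_upper_bound(-y, td), robust_upper_bound(y, td))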
After generating the first level of cuboid, the average distance from the vertexes of each triangle to the nearest plane of the circumscribed cuboid is evaluated. The points whose distances are larger than a given threshold T2 (experimentally set to 0.2 m) are grown to gather the non-overlapping regions into region groups. Regions whose vertex number or area falls below predefined values are discarded; these predefined values are determined according to the target level of detail of the model. For the remaining groups, robust cuboid fitting is performed to derive the next level of cuboids. After generating the next level of cuboid from a remaining non-overlapping region, there will be a slight bias with respect to the previous level of cuboid. Figure 6a illustrates this in a top view, which suffices since the façade is vertical to the ground: the corner of the current cuboid does not lie on the first-level cuboid. To avoid this problem, as shown in Figure 6b, the sides of the current-level cuboid are extended to intersect the nearby cuboid sides, and the new intersection points replace the original cuboid corners to guarantee the closure of the model.
The same procedure is repeated until no non-overlapping region remains in the current component, and the component fitting process is carried out repeatedly until no component is left.
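The iterative, multi-level abstraction can be sketched as follows; for brevity the plain enclosing rectangle is used in place of the robust fit, and the grid-based region grouping and the min_points threshold are assumed substitutes for the region growing and the vertex-count/area criteria, while T2 = 0.2 m follows the text.

import numpy as np
from collections import deque

def enclosing_rect(xy):
    # Axis-aligned rectangle (xmin, xmax, ymin, ymax) enclosing the points.
    return xy[:, 0].min(), xy[:, 0].max(), xy[:, 1].min(), xy[:, 1].max()

def depth_from_sides(xy, rect):
    # Distance from each interior point to the nearest side of the rectangle.
    xmin, xmax, ymin, ymax = rect
    return np.minimum.reduce([xy[:, 0] - xmin, xmax - xy[:, 0],
                              xy[:, 1] - ymin, ymax - xy[:, 1]])

def grid_regions(xy, cell=0.5):
    # Group points into connected regions with a coarse grid flood fill
    # (the cell size is an assumed substitute for the paper's region growing).
    if len(xy) == 0:
        return []
    cells = [tuple(c) for c in np.floor(xy / cell).astype(int)]
    occupied = set(cells)
    label, labels = 0, {}
    for start in occupied:
        if start in labels:
            continue
        labels[start] = label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in occupied and nb not in labels:
                    labels[nb] = label
                    queue.append(nb)
        label += 1
    point_labels = np.array([labels[c] for c in cells])
    return [xy[point_labels == r] for r in range(label)]

def abstract_component(xy, t2=0.2, min_points=50):
    # Iterative cuboid abstraction of one component (top view): fit an enclosing
    # rectangle, collect points deeper than t2 from every side, grow them into
    # regions, and recurse on regions with enough support.
    rects, queue = [], deque([xy])
    while queue:
        pts = queue.popleft()
        rect = enclosing_rect(pts)
        rects.append(rect)
        deep = pts[depth_from_sides(pts, rect) > t2]
        for region in grid_regions(deep):
            if len(region) >= min_points:
                queue.append(region)
    return rects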
Taking Figure 7a as the input component, Figure 7a–e illustrates the iterative cuboid fitting process step by step. If a minimum circumscribed cuboid is directly fitted to the original model component, the first-level circumscribed cuboid is produced (the green cuboid in Figure 7b); this cuboid does not fit the original mesh very well. By removing possible noise points or small objects on the façade, the remaining points yield the fit displayed in Figure 7c, in which the cuboid is closer to the original model. For the non-overlapping regions in Figure 7c, two second-level circumscribed cuboids are derived, as demonstrated in Figure 7d. All the cuboids are joined together once no non-overlapping regions remain, as presented in Figure 7e.
2.4. Parameter Adjustment of the Cuboid Model Based on the Least Squares Method
As the initial cuboid is obtained from the extents of the transformed coordinates, the resulting cuboid may not fit the initial photogrammetric mesh model very well because of noise in the data and errors in the coordinate transform parameters. Thus, the least squares method is employed to adjust the cuboid so that it fits the initial photogrammetric mesh model. Each cuboid can be specified by six parameters; since the façade adjustment only considers the planimetric coordinates, the vertical parameters (including the height H) are kept fixed throughout the adjustment process. The model parameters are adjusted by minimizing the distance between the initial model (i.e., the result of the cuboid abstraction process) and the photogrammetric mesh model with a least squares algorithm. The adjustment model is defined as Equation (1), in which the adjusted cuboid parameters, the initial cuboid parameters, and the planimetric coordinates of the vertexes of the involved triangles are related.
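Purely as an illustration, assuming the cuboid is axis-aligned in the transformed frame and its planimetric extent is described by $X_{\min}$, $X_{\max}$, $Y_{\min}$, and $Y_{\max}$, the error equation for a mesh vertex $(x_i, y_i)$ associated with, for example, the $X_{\min}$ side could take the form
$$v_i = \left(X_{\min}^{0} + \mathrm{d}X_{\min}\right) - x_i,$$
with analogous equations for the other three sides; these symbols are illustrative assumptions and not necessarily those used in the paper.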
After the error equations are obtained, they can be solved with the traditional least squares approach. The error equations associated with Equation (1) are written in matrix form as Equation (2), from which the solution for the unknowns follows in the standard least squares form.
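In generic matrix notation (the symbols here are assumptions, not necessarily those of the paper), writing the error equations of Equation (2) as $V = A\hat{X} - L$, the traditional least squares solution for the unknown corrections is
$$\hat{X} = \left(A^{\mathrm{T}} A\right)^{-1} A^{\mathrm{T}} L.$$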
For the first level of the cuboid, all four planimetric model parameters are adjusted. For the other levels of cuboids, only the error equations pertaining to sides that overlap with the initial photogrammetric mesh model are used, while the other parameters are kept fixed. After adjusting the cuboid parameters, some gaps may remain between a cuboid and the cuboid of the previous level; the lower-level cuboid is therefore shifted to the nearest side of the higher-level cuboid.
After performing the adjustment process, the existing planes are selected from the cuboids and employed to produce the final façade model.