A Flexible Inference Machine for Global Alignment of Wall Openings

Openings such as windows and doors are essential components of architectural wall surfaces. Reconstructing them robustly from unstructured 3D point clouds remains a challenge because of occlusions, noise and non-uniformly distributed points. Current research primarily focuses on improving the robustness of detection and pays little attention to geometric correctness. To improve reconstruction quality, assumptions on the opening layout are usually applied as rules that support the reconstruction algorithm. The commonly used assumptions, such as a strict grid or a symmetric pattern, are however unsuitable in many cases. In this paper, we propose a novel approach, named an inference machine, to identify and use flexible rules in wall opening modelling. Our method first detects and models openings through a data-driven method and then refines the opening boundaries by global, flexible rules. The key is to identify these global flexible rules, composed of various combinations of alignments, from the detected openings. As our method is agnostic to the type of architectural layout, it can be applied to both interior wall surfaces and exterior building facades. We demonstrate the flexibility of our approach in both outdoor and indoor scenes with a variety of opening layouts. The qualitative and quantitative evaluation results indicate the potential of the approach as a general method for opening detection and modelling. However, as a data-driven method, it remains sensitive to occlusions and non-planar wall surfaces.


Introduction
In the creation of 3D building models, various techniques are in use, mainly based on images and active scanning point clouds [1][2][3][4][5]. Windows and doors, often referred to collectively as "opening structures", define building facades and interior wall surfaces, yet are often missing from the 3D models created by existing automatic modelling methods. With the rapid growth of smart cities, digital twins and building information models (BIMs), the need for opening detection and modelling is also increasing, in order to enrich the exterior and interior wall surfaces of 3D building models. Recently, under the Virtual Singapore project, efficient facade modelling was investigated as one of the focal points [6,7]. The integration of openings not only improves reality-based visualization and simulation through augmented reality (AR) or virtual reality (VR) for decision making, but also supports analyses of building energy performance [8,9] and many other applications. A reliable reconstruction of openings offers a higher level of realism in digital models, a superior experience to users, and convincing data for 3D analyses, e.g., glazing ratio determination, wayfinding, 5G signal simulation [10], etc.
Opening detection generally relies on an exploitation of the facade layout. Facades with a regular layout, presented by a strong structural pattern such as a strict grid or an otherwise symmetric pattern [11], can be reconstructed with predefined libraries of templates following predefined or observed regularity constraints [12,13]. However, windows and doors are not always regular and do not always follow specific architectural layouts. Diverse needs in natural lighting, energy consumption and considerations of construction costs influence the allocation of openings, resulting in various structures on wall surfaces. Nevertheless, most designs do follow certain rules: wall openings are aligned to each other in certain patterns. A grid pattern can be deemed a combination of such flexible rules, including left, right and center alignments. There is, however, a lack of generality in formulating such rules [14].
In this work, we propose a flexible inference machine to model building openings with various layout patterns. The method can model grid patterns under far less strict constraints. As shown in Figure 1, the pipeline starts from detected wall points extracted from Light Detection and Ranging (LiDAR) point clouds. The openings are detected by an α-shape algorithm for each wall, and their initial reconstruction is done by fitting four boundary points allocated at the edges of a rectangle. The initial opening boundaries are used to infer their associations and alignment rules, which are then applied to globally refine the opening geometry. The main contribution of this work is a set of flexible rules for refining the opening geometries globally, so that our method adapts to more general situations than the current state-of-the-art methods.

Related Works
Wall openings usually appear as holes in LiDAR data. The laser beam returns from the concrete wall, but passes through open spaces and glass surfaces, such as doors and windows. Based on this feature, a lot of work has been done on opening detection and modelling [15].

Hole Detection
The detection of holes in high-dimensional space is typically executed with a transformation to a lower-dimensional domain in order to exploit successful algorithms such as feature detection and classification in the 2D domain [16]. A 2D binary occupancy raster map is frequently used to differentiate empty and occupied space [17,18]. Relying on a given planar surface, an analysis of the occupancy around points leads to the extraction of boundary points. Lines estimated from the boundary points are used for the 3D decomposition of a facade, with a later transformation into a binary map. By employing a classification method, the shape of openings can be approximated [19].
However, pixel-wise classification needs several parameters and may lose geometric information during voxelization and reprojection. Working directly on points, void areas can be extracted through the endpoints of projection lines from slices of a facade [20]. To geometrically reconstruct facade elements, Pu et al. [21] proposed to extract long Triangular Irregular Network (TIN) edges, which have a higher probability of representing the boundary of openings, from the triangulated surface. In addition, using the information from a Delaunay triangulation, the α-shape [22] can approximate the shapes of holes from unstructured point cloud data. It introduces α-extreme points and α-neighbors of a point set according to an α value, which determines the boundary points and inner polygons of void areas [23]. Nonetheless, shapes detected by these approaches are vulnerable to noisy points and outliers in complex indoor and outdoor scenes, as commonly encountered in real-world cases. Moreover, although holes are detected individually, their spatial relationship is still missing. Our approach detects holes as initial openings with shape simplification and fitting. It then infers their associations and refines the geometry of the openings accordingly.

Rule and Structure Based Reconstruction
Rules play an important role in building facade element detection and reconstruction. They are also indications of the spatial relationship between objects. One of the typical assumptions is that facade elements are arranged in a rectilinear grid. Müller et al. [12] propose a grammar that splits a facade into floors and encodes repetitive elements according to the facade structure on an image. Becker [24] proposes an approach to automatically generate a formal grammar for facade modelling based on 3D cell decomposition. Wu et al. [25] introduce inverse procedural modelling that extracts splitting grammars to explain a given facade layout. However, these techniques are based on a global rectilinear grid [26]. They may fail when the layout is not arranged in rows and columns.
Several approaches mitigate this problem based on the detection of symmetry and structural regularity [11,27]. Shen et al. [28] propose to partition the facade with an adaptive determination of the splitting direction and of the number and location of splitting planes according to an interlaced grid. Mesolongitis and Stamos [29] propose a voting scheme to detect windows under the assumption that they are arranged in periodic structures. Li et al. [8] use a sliding window method to analyze the structural regularity of a facade for detecting the corners of openings. Similarly, Nan et al. [30] detect repetition in facade images to guide the assembly of 3D templates and a coarse model for detailed building model reconstruction. Furthermore, an interactive approach named SmartBoxes [31] allows users to symmetrically generate detailed facade models with templates and pre-specified alignment rules. However, these assumption-based methods mostly address symmetric and repetitive patterns.
To increase the flexibility of object detection and pattern recognition, Lafarge et al. [32] propose a marked point process with the reversible jump Markov Chain Monte Carlo (rjMCMC) framework on 2D images. It estimates the maximum a posteriori configuration of object shapes from a given geometric library including rectangles, circles and so forth [33]. Considering the relative neighborhood of graphs and pairwise constraints, regular patterns on a street-level facade image can be recognized [34]. Structures such as lattices and grids are proposed to the MCMC sampler as constraints on the detections [14]. However, this pixel-wise framework can hardly be applied to 3D point clouds and does not cover the global layout of the detections.
In 3D space, semantic classification, partitioning, and architectural rules such as alignment and symmetry compose a comprehensive framework for semantic facade analysis [35]. However, assuming architectural principles such as a grid to restrict the location of windows still fails for asymmetric layouts of wall surfaces, particularly building facades. Recent rule-based methods rely on a certain type of explicit knowledge which can hardly adapt to general cases. In contrast to the use of strict rules, our approach detects flexible rules which can be used to reconstruct the original layout of openings.

Overview of the Proposed Approach
Active sensing technologies, such as LiDAR and integrated mobile mapping systems, have been successfully adopted for the fast acquisition of detailed information on both interior and exterior walls. Generally, from the representation of 3D point clouds, a common way to detect architectural openings such as windows and doors is to identify holes. These holes are caused either by the laser piercing through penetrable materials, i.e., glass, or by segmentation at a certain depth from a geometric primitive, such as a planar surface. To identify these areas, a detection method usually uses prior knowledge of the patterns on wall surfaces. In the current literature, the repetition of a facade layout (e.g., a strict grid) is frequently employed in various detection methods [8,30]. However, there is still a lack of generality in layout recognition for patterns less strong than grids, such as the facade in Figure 1.
We want not only to detect the openings but also to reconstruct their geometry according to the original layout, which corresponds to identifying the internal laws of the global layout. We first detect and reconstruct openings. Then, by analyzing their distribution rules, the corresponding openings are correlated to infer the overall layout. Based on the rules and layout, we gain more flexibility to optimize the geometry of openings. A simple prototype of this inference machine is summarized as Algorithm 1. The openings are extracted by the α-shape algorithm on planar surfaces and simplified to rectangles as initial models. Opening associations are identified based on the initial opening models and applied as alignment rules to refine the initial models. The association and alignment are processed recursively until no new association can be found. The remaining parts of this contribution are organized as follows: Section 4 elaborates our approach for the detection and initial reconstruction of openings. In Section 5, we discuss the mechanism of grouping and optimization of openings by the inference machine: we first detect the alignment rules from the initial openings; by assembling these flexible rules, the associated openings are grouped and their geometry is adjusted accordingly. In Section 6, we illustrate the qualitative and quantitative results of the refined reconstruction of openings on synthetic and laser scanning point cloud data. We draw conclusions and discuss future work in Section 7.
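The recursive association-and-alignment loop of the inference machine can be read as a fixed-point iteration: detect rules, apply them, and stop once no new rule appears. A minimal sketch, assuming hypothetical `find_rules` and `apply_rules` callbacks standing in for rule detection and snapping:

```python
def refine_until_stable(openings, find_rules, apply_rules):
    """Recursive association/alignment loop (sketch of the prototype):
    keep detecting alignment rules and refining the openings until no
    rule beyond the already-seen ones is found."""
    seen = set()
    while True:
        new_rules = {r for r in find_rules(openings) if r not in seen}
        if not new_rules:          # fixed point: no new association
            return openings
        seen |= new_rules
        openings = apply_rules(openings, new_rules)
```

With toy callbacks that treat the mean coordinate as the single "rule" and snap every opening onto it, the loop converges after one refinement pass.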

Wall Surface Detection
We assume that every wall surface can be represented by a plane. To reconstruct wall openings, we first determine the wall plane from the unstructured wall points. Planar surfaces are detected by a region growing algorithm proposed by Vosselman [36]. A simple classification algorithm is performed to identify wall surfaces among the detected planar surfaces. In the classification step, a support vector machine (SVM) classifier is applied, using the area, width, length and zenith angle of the planar surface as features. Based on the wall planes, we define relative horizontal and vertical reference directions by embedding the plane in the Cartesian coordinate system. The reference directions change according to the parameters of the input plane. We define a geometric plane N as the input wall surface data, using a standard plane parameterization, together with the corresponding relative coordinate system S_r. They are expressed as:

N = (n̂, d),  S_r = (n̂_h, n̂_v),

where n̂ is the unit normal of the plane N and d is the scalar representing the closest approach of the plane to the origin. The plane N, parameterized in the space R^4, has 4 degrees of freedom. n̂_f is the unit normal of the detected wall surface. n̂_h denotes the horizontal direction of the relative coordinate system S_r, defined as the cross product of the normal vectors of the wall and the horizontal plane, n̂_h = n̂_f × (0, 0, 1). n̂_v = n̂_f × n̂_h represents the relative vertical direction of S_r.
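The two reference directions follow directly from the wall normal via cross products. A small self-contained sketch of these definitions (pure Python, no dependencies; the function names are illustrative):

```python
def cross(a, b):
    """3D cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def reference_axes(n_f):
    """Axes of the relative coordinate system S_r from the wall normal:
    n_h = n_f x (0, 0, 1) (normalized), n_v = n_f x n_h."""
    n_h = cross(n_f, (0.0, 0.0, 1.0))
    s = (n_h[0] ** 2 + n_h[1] ** 2 + n_h[2] ** 2) ** 0.5
    n_h = (n_h[0] / s, n_h[1] / s, n_h[2] / s)
    n_v = cross(n_f, n_h)
    return n_h, n_v
```

Note that for a vertical wall the resulting n_h lies in the horizontal plane, while n_v is parallel to the z-axis, exactly as the relative directions are used in the following steps.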

Boundary Extreme Point Detection
Given the above definition, we transform points from the 3D space R^3 onto the plane of the wall surface in the 2D space R^2 by projecting all points onto the basis plane with the projection function π_f(·) : R^3 → R^2. Then, we exploit the α-shape algorithm to extract candidate boundary points of each opening. In general, it is a method for abstracting the outline of a point set. Assume a finite set of discrete points S. In the point set S, we generate a circle with a given radius, the α value, through any two points p_1 and p_2, {p_1, p_2} ∈ S. If there are no other points in this circle, the points p_1 and p_2 are considered boundary points, and the line connecting p_1 and p_2 is a boundary line segment. The outline is extracted by iteratively detecting such circles. The resulting 2D polygon Ω is given by n anticlockwise vertices, Ω = {v_0, v_1, ..., v_{n−1}, v_n = v_0}. Due to noise and point uncertainties, the initial detection of border points results in an irregular shape (as shown in Figure 2) that can hardly be used for the inference process. We use a rectangular shape to represent an opening; therefore, we need to determine the four boundary lines of the rectangle from the border points. This can be formulated as a bounding box search problem. To solve it, we identify extreme points of the point set in four directions, i.e., we search for the maximum and minimum points along the horizontal and vertical directions. The projection from the 2D space R^2 to a line l is defined as π_d(·) : R^2 → l. Since the size of the polygon Ω differs for each opening, we use binary search to locate the maximum and minimum points on each projection line. The extreme points M of the polygon are formulated as:

M = { argmin_{v ∈ Ω} π_d(v), argmax_{v ∈ Ω} π_d(v) | d ∈ {n̂_h, n̂_v} },

where d denotes the direction of the projection line and v represents the vertices of the 2D polygon Ω.
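For an axis-aligned frame, the extreme-point search reduces to finding the min/max vertices of the polygon along each axis; a minimal sketch (a linear scan stands in for the binary search described above, and the dict keys are illustrative):

```python
def extreme_points(vertices):
    """Extreme points M of the alpha-shape polygon: the vertices with
    minimum/maximum projection onto the horizontal and vertical axes
    of S_r, given 2D polygon vertices (u, v)."""
    return {
        "left":   min(vertices, key=lambda p: p[0]),
        "right":  max(vertices, key=lambda p: p[0]),
        "bottom": min(vertices, key=lambda p: p[1]),
        "top":    max(vertices, key=lambda p: p[1]),
    }
```

The four returned vertices bound the polygon and serve as the initial rectangle frame for the fitting step.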

Opening Edge Fitting
Due to slight occlusions or noise, the initial extreme point set P_m = {p_0^m, p_1^m, p_2^m, p_3^m} ⊂ M, where p_i^m, i = {0, 1, 2, 3}, represents the initial extreme points, can hardly be used for a proper representation of an opening. To alleviate this, we introduce an objective function E_f for opening edge fitting.
The objective of this function is to optimize the extreme point set P_m into an optimal point set P'_m that fits the edges of the initial opening based on the raw point cloud. Since the initial extreme points p_i^m ∈ P_m, i = {0, 1, 2, 3}, are supposed to be the maximum bounding points of the corresponding opening, the optimum shape should be equal to or lie within the current configuration. To locate the optimum border points p'_i^m ∈ P'_m, we shrink the parametric frame towards the geometric center of the opening. The shrinking process is an adjustment of the extreme points p_i^m, expressed as p'_i^m = p_i^m + t_i n̂_i, where n̂_i is the moving direction, which depends on p_i^m, and t_i is the moving distance, with i = {0, 1, 2, 3} corresponding to the four extreme points. The fitting process is subject to the function E_f, composed of a point density term E_p and a distance term E_l. It is expressed as:

E_f = Σ_{i=0}^{3} ( E_p(p'_i^m) + E_l(t_i) ),

where p'_i^m denotes the adjusted extreme point and ξ(·) is a function measuring the neighborhood point density of the edge on which p'_i^m lies.
The first term E_p measures the neighborhood point density ξ(·) = P of the edge defined by the adjusted extreme point p'_i^m and the corresponding reference direction n̂_i, n̂_i ∈ {n̂_h, n̂_v}. The density P is subject to a constraint on the neighboring point distance, d < ε, where ε is an empirical value set to half of the α value. The second term E_l is similar to the polygon smoothing in O-Snap [37], which prevents the bounding edge from deviating excessively from its initial position. It measures the square of the moving distance t_i from the points p_i^m to the adjusted points p'_i^m, i = {0, 1, 2, 3}. This term avoids over-shrinking from the original configuration of objects. For the edges of an opening to fit the raw data, each extreme point p'_i^m should lie on an edge that contains an optimal number of neighborhood points while moving minimally from the initial extreme point p_i^m. As the number of openings and models is generally small, we minimize the objective function via best-first search [38].
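For a single edge, the trade-off can be sketched as choosing the shrink distance t that minimizes -ξ(t) + t², i.e., rewarding edge point density while penalizing the squared shift. This is a simplified sketch under that sign convention; the `density` callback is a hypothetical stand-in for ξ:

```python
def fit_edge_offset(candidates, density):
    """Choose the shrink distance t minimizing E_p + E_l per edge,
    with E_p = -density(t) (denser edge support is better) and
    E_l = t**2 (stay close to the initial extreme point)."""
    return min(candidates, key=lambda t: -density(t) + t * t)
```

For example, if the raw points form a dense edge 0.2 m inside the initial frame, the search picks that offset despite the movement penalty.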

Opening Reconstruction
Once we finalize the optimal set of extreme points P'_m = {p'_0^m, p'_1^m, p'_2^m, p'_3^m}, the configuration of the object is essentially approximated, i.e., its location, orientation and initial topology. As we reconstruct the object as a rectangular shape, the next step is to determine the corners of the rectangle. Since the positions of the bounding points are optimized, this is simply done by intersecting the lines defined by the reference directions (as discussed in Section 4.1) and the optimized extreme points p'_i^m. Please note that the reconstruction proceeds anticlockwise, starting from the lower-left corner (i.e., the intersection of the left and bottom edges) of the rectangle.
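In the axis-aligned frame of S_r, these line intersections reduce to combining the horizontal coordinates of the left/right points with the vertical coordinates of the bottom/top points. A minimal sketch, assuming the optimized extreme points are given as a dict with `left`/`right`/`bottom`/`top` entries (an illustrative representation):

```python
def rectangle_corners(extremes):
    """Corners of the opening rectangle as intersections of the four
    boundary lines through the optimized extreme points, listed
    anticlockwise starting from the lower-left corner."""
    x0, x1 = extremes["left"][0], extremes["right"][0]
    y0, y1 = extremes["bottom"][1], extremes["top"][1]
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
```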

Global Alignment by Flexible Rules
As discussed in the previous sections, wall openings are initially reconstructed as rectangular shapes. However, their geometric models are likely to deviate from the global layout of openings because of spurious objects, noise, occlusions and the non-uniform density of the input point clouds. The initial reconstruction results should therefore be adjusted by alignments according to context information.
Here we introduce a set of flexible alignment rules to globally adjust the wall openings. The opening layout can be decomposed into a set of more rudimentary patterns, each containing alignment rules (as shown in Figure 3). Therefore, the global alignment consists of two steps: opening association and geometry refinement. We first infer the alignment rules from the initial models of wall openings and then use these rules to globally refine the model geometry. The pseudo-code for the pipeline is shown in Algorithm 2; it briefly outlines how the pipeline works, and the details are given in the following sections.

Algorithm 2: Opening Association and Alignment
Input: initial reconstructed openings. Output: refined openings.

Flexible Rules
A layout of wall openings can be described as a composition of a set of flexible rules. As a simple example, a strict layout structure such as a grid pattern can be decomposed into arranged rows and columns. These rows and columns are the flexible rules governing the opening layout. We demonstrate this idea with a non-grid layout as shown in Figure 3a. Such a layout can hardly be represented by a simple grid. In contrast, the listed flexible rules show higher flexibility in reconstructing the global layout. For simplicity and clarity, we only use alignment rules and do not consider symmetry.
The alignment rules include top, middle, bottom, top-bottom, top-middle and bottom-middle alignment in the horizontal aspect. As shown in Figure 3b, the initial openings labeled 1, 2 and 3 are top-aligned while also satisfying the middle and bottom alignment rules. Apart from these, we also take the frequent occurrences of top-bottom, top-middle and bottom-middle alignment into account. Vertically, the alignment rules include left, right, center, left-right, left-center and right-center alignment (see Figure 3c). For example, openings 2 and 5 obey the left, center and right alignment rules. The same holds for openings 3 and 6, 9 and 15, and 12 and 18. If we look at openings 9 and 12, the right edge of opening 9 aligns with the left edge of opening 12. This is known as left-right alignment, and vice versa. Left-center represents the alignment of the left edge of one opening with the center of another, e.g., openings 3, 6, 12 and 18. Similarly, openings 3, 6, 9 and 15 follow the right-center alignment rule. By such analysis, the layout in Figure 3a is represented by a set of flexible rules as shown in Figure 3d. Openings are allowed to appear in different rules. For instance, openings 2 and 5 are under the left, center and right alignment rules while being in left-center and right alignment with openings 8, 11, 14 and 17 (see the red frame in Figure 3d). Since openings are associated through rules, the assemblage of these rules shows strong flexibility in representing various facade patterns. As illustrated in Figure 3e, grid patterns and symmetry are compositions of flexible rules. These rules are detected automatically, as discussed in the next step.
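The vertical rule set can be expressed as a handful of tolerance checks between two rectangles; a minimal sketch, assuming openings are given as axis-aligned boxes (xmin, ymin, xmax, ymax) and an illustrative tolerance:

```python
def vertical_alignments(a, b, tol=0.05):
    """Vertical alignment rules satisfied by two openings a and b,
    each an axis-aligned rectangle (xmin, ymin, xmax, ymax);
    rule names follow the left/right/center families of Figure 3c."""
    cx = lambda r: 0.5 * (r[0] + r[2])   # horizontal center
    checks = {
        "left":         abs(a[0] - b[0]) < tol,
        "right":        abs(a[2] - b[2]) < tol,
        "center":       abs(cx(a) - cx(b)) < tol,
        "left-right":   abs(a[0] - b[2]) < tol or abs(a[2] - b[0]) < tol,
        "left-center":  abs(a[0] - cx(b)) < tol or abs(cx(a) - b[0]) < tol,
        "right-center": abs(a[2] - cx(b)) < tol or abs(cx(a) - b[2]) < tol,
    }
    return {name for name, ok in checks.items() if ok}
```

Two identical windows stacked vertically satisfy left, right and center alignment simultaneously, mirroring the example of openings 2 and 5 above.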

Opening Association Detection
We use the nodes of openings for rule detection, including the vertices Ψ = {v_j | j = 0, 1, 2, 3} and the center point c. These nodes lie on the geometric surface defined in the reference coordinate system S_r. Let X = {x_0, x_1, x_2, ..., x_n} be the configuration set of initial openings, where n is the number of openings. For each opening x_i, Ψ_i = {v_j^i | j = 0, 1, 2, 3} denotes the vertices of x_i, where i is the index of the opening. We first index potential alignment clusters from the points in X. The nodes of the set of openings are projected to arbitrary horizontal and vertical lines using the projection function π_d(·) : R^2 → l, where d is the direction of the projection line. The projected points in the point set P_d show a clustered distribution on the projection lines, because the initial openings are simplified to rectangles with a certain width and spacing. These clusters roughly indicate that the internal points are collinear (see Figure 4). To cluster the projections, we use Euclidean clustering [39]: if the minimum distance between a set of points p_a ∈ P_d and another set p_b ∈ P_d is larger than a given distance value d_th, then the points in p_a are labeled as one point cluster π_d(X)_a and the ones in p_b as another distinct point cluster π_d(X)_b, where a and b are the labels of the clusters. We set the distance threshold d_th to double the mean point spacing. The initial alignment rules can be inferred from the retraced points in each cluster. Let L = {l_0, l_1, l_2, ..., l_m} be the set of flexible rules detected from the nodes of openings belonging to the clusters, where m is the number of detected rules. Each element l ∈ L contains a candidate group of openings. The candidate group G is preliminarily formed according to the distance d_th between the fitted rule and the vertices v_x of the openings x that fall into the same cluster. However, the group G is a set of openings which is likely to mix non-homologous alignments.
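On a single projection line this clustering is one-dimensional; a minimal sketch, assuming the node coordinates have already been projected with π_d and that the input is non-empty:

```python
def euclidean_cluster_1d(projections, d_th):
    """1D Euclidean clustering of projected node coordinates:
    a gap larger than d_th on the projection line starts a new
    cluster (assumes at least one projected value)."""
    values = sorted(projections)
    clusters = [[values[0]]]
    for v in values[1:]:
        if v - clusters[-1][-1] > d_th:
            clusters.append([v])      # gap exceeds d_th: new cluster
        else:
            clusters[-1].append(v)
    return clusters
```

Each resulting cluster gathers the nearly collinear nodes from which one candidate alignment rule is fitted.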
Nodes on a line do not imply that the corresponding openings are aligned by the same rule, as the line might contain several types of alignment, such as center-left and right-left alignment (as shown in Figure 5b,c). We preserve the current association and subdivide the group into several subsets on this basis. The search for alignments is done by best-first search with a grouping function E_g(G):

E_g(G) = λ_1 E_c(G) + λ_2 E_r(G),

where λ_1 and λ_2 are weight parameters balancing the consistency term E_c(G) and the reliability term E_r(G). The first term E_c(G) introduces the consistency of neighboring openings. It is determined by the vectors of the edges that point to the center of the opening. If the orientations of the vectors are similar, the corresponding openings are likely to align by the same rule. We use dot products to determine the consistency of the vector n_i (resp. n_j) pointing from the edge e_i (resp. e_j) to the center c_i (resp. c_j), where i, j denote the indices of the openings, x_i, x_j ∈ l_m. This is formulated as:

E_c(G) = (1 / k) Σ_{i,j} n_i · n_j,

where k is the number of openings in the candidate group. The second term E_r(G) measures the reliability of the subgroup. We measure the distance dist between the candidate edge e_i and the rule l (see Figure 5, left), which indicates the grouping tolerance. e_i is the candidate edge of an opening in the candidate group on the rule l, e_i ∈ x_i, x_i ∈ l. The distance dist is measured as the average distance of the vertices of the edge e_i to the rule l, where i = {0, 1, ..., k} denotes the index of the opening. This is expressed as:

E_r(G) = (1 / k) Σ_{i=0}^{k} dist(e_i, l).

We aim to search for openings with similar attribution. As demonstrated in Figure 5, if the dot product of two vectors n_i and n_j is positive and approaches 1, meaning two edges are aligned, then E_c(G) is set to 1, and the interrelated objects are classified into a preliminary group. In contrast, it becomes −1 and splits the entities into two different sets.
For the center alignment, the value of E_c(G) is close to 0 when center-left or center-right, as well as center-top or center-bottom, alignments occur. In such cases, a new provisional group is created for the center alignment. The weight parameters λ_1 and λ_2 are set empirically. The openings in each subset associated by a rule are homologically aligned; however, these openings may be repeated in other rules. Taking the vertical direction as an example, a rule may include three alignment conditions: left, right and center. Among the openings with left alignment, some simultaneously fulfil the center or right alignment conditions in other rules. These repetitions are fused to extract highly reliable groups of openings. If two or more openings satisfy more than two sets of homological alignments, these openings are reliably associated. By matching the elements in the subgroups under different rules, we obtain a series of related openings that meet this condition. We use this set of stable spatial relationships in the next step to refine the associations and the allocation of openings.
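The subdivision driven by the consistency term can be sketched as follows, assuming each group member carries a unit edge-to-center vector (this per-member representation is illustrative, not the paper's exact data structure):

```python
def split_by_consistency(group):
    """Subdivide a candidate group on a rule using the consistency
    term: members whose edge-to-center unit vectors agree with the
    first member (dot product near +1) stay together, opposite ones
    (near -1) form a second subset; near-zero products, the center
    alignment case, are left for a separate provisional group.
    group: list of (opening_id, (nx, ny)) pairs."""
    if not group:
        return []
    rx, ry = group[0][1]
    same  = [g for g in group if g[1][0] * rx + g[1][1] * ry > 0.5]
    other = [g for g in group if g[1][0] * rx + g[1][1] * ry < -0.5]
    return [s for s in (same, other) if s]
```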

Snapping
After the associations are detected, we use the inferred rules to refine the opening geometry by snapping openings. The current opening association should be inclusive in the initial set of groups on the rules. Nevertheless, there might be errors in the subgroups under a rule, possibly caused by wrong point clusters or by the sectioning of subsets. The extracted opening associations are more reliable, since they are composed of at least two rules. Therefore, we use the reliable associations to differentiate missing or redundant openings in each rule. Redundant openings in a subgroup are separated and traversed through the reliable association. If no match can be found, such an opening is labeled as an individual element but remains associated with the rule. On the other hand, missing openings are detected and adopted by the subgroup. Thereby, the subgroups are unified and maintained according to the alignment rule.
The flexible rules for globally refining the geometry of openings are more elastic and simpler than refinements with strict rules, because the flexible rules are finer-grained than strict rules such as the grid pattern. Geometrically, the distance between the corner points of associated openings and a rule should be as small as possible. Hence, we can simply adjust the geometry of the openings in an association by calculating the deviation between each of them and a selected representative opening. The selection is based on the frequency of occurrence of an opening over all the rules: the more frequently an opening occurs, the more stable it is considered. Then, we use the horizontal and vertical orientations as moving directions and compute the moving distances between the edges of the openings in a subset and those of the most stable opening. After that, the edges of the openings are snapped using the moving direction and distance. Finally, the reconstruction satisfies the global layout while properly describing the inherent geometry of the wall openings.
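For one edge coordinate in one association, the snapping step reduces to picking the representative and computing per-opening offsets. A minimal sketch under illustrative inputs (per-opening edge coordinates and the number of rules each opening appears in):

```python
def snap_association(edge_coords, rule_counts):
    """Snap the edges of an associated group to the representative
    opening, chosen as the one occurring in the most rules.
    edge_coords: {opening_id: edge coordinate along one axis},
    rule_counts: {opening_id: number of rules it appears in}.
    Returns the snapped coordinate and each opening's moving distance."""
    rep = max(rule_counts, key=rule_counts.get)   # most stable opening
    target = edge_coords[rep]
    moves = {oid: target - c for oid, c in edge_coords.items()}
    return target, moves
```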

Datasets
The proposed method is tested on both outdoor and indoor point cloud data acquired by several techniques, including mobile and static terrestrial laser scanning. Table 1 lists the specifications of the point clouds used in this paper. To evaluate the flexibility and adaptability of the proposed approach, we first test it on synthetic facade samples. Since we want the test examples to be as close as possible to real-world acquisitions, we manually create four facade point clouds based on real captures from a mobile mapping system (MMS) and distinctive facade layouts from reference images (see Figures 6 and 7). These synthetic facades are 13 m long and 1 m wide. Their heights (synthetic (a), (b), (c) and (d) as listed in Table 1) are 32, 24, 14 and 17 m, respectively. The level of clutter in these point clouds is low because there are few occlusions and the points mostly belong to the facade.

Figure 6. Reference images of distinctive facade layouts for the synthetic data (a-d). Refer to Table 1 for more detailed information about the synthetic facade point clouds.

Moreover, we apply the approach to two captured data sets. A facade of a residential building on the National University of Singapore (NUS) campus, denoted KFH in Table 1, is captured by a mobile mapping system. It contains 161,470 points. Its length, width and height are 34 m, 26 m and 30 m, respectively. The level of clutter is moderate. OKA denotes the main building of the Old Kallang Airport in Singapore, comprising about 1.2 million points acquired from a single station by a static terrestrial laser scanner. The dimensions of the data are 77 × 53 × 26 m. The level of clutter is high due to the presence of people during the survey.

Indoor Data
The ISPRS benchmark data [40] is adopted to evaluate the proposed method on indoor data. The indoor point clouds known as TUB1, TUB2, UVigo and UoM were acquired by a mobile mapping system, with dimensions of 42 × 15 × 3 m, 41 × 16 × 11 m, 42 × 30 × 12 m and 27 × 18 × 15 m, respectively. The Fire Brigade data set is captured using terrestrial laser scanning with a Leica C10. Its dimensions are 54 × 14 × 10 m. The complexity of the data is defined by the level of clutter: TUB1 and TUB2 are at a low level, while UVigo and UoM are at a moderate level. The most complex scene in this benchmark is the Fire Brigade, which contains furniture, curtain walls and other occlusions. It has 14.1 × 10^6 points and a mean point spacing of 0.011 m.

Results
We apply the flexible inference machine to both outdoor (as shown in Figures 7-9) and indoor scenes (shown in Figures 10 and 11). There are three main parameters used in the approach: the α value of the α-shape method in the initial opening reconstruction stage, the distance threshold dt of the projection point cluster in the detection of rules, and the distance threshold d in the opening association. We determine the parameters based on the point cloud density and the mean point spacing (MPS) of the point cloud data. For the clustering distance threshold, we set dt = 2 × MPS. The distance threshold d in the association is equal to the MPS. We initially set the α value to 6 times the MPS; empirically, we find that α = 0.18 m applies to the test samples in this work.

The machine infers an association for those openings that share at least two homologous alignment rules, e.g., both are top and bottom aligned, as in the most realistic cases. Generally, many openings in a horizontal association are also members of a vertical set. These shared elements provide connections between multiple associations. These relationships are marked with red lines.

Figure 8 shows the facades of a residential building captured by a mobile mapping system. The data is incomplete, as realistic MMS data is typically occluded by trees and other surrounding objects. We observe that the basic structure of the facade, i.e., the boundary points of openings, remains of satisfactory quality for opening extraction. Since the point density of this data is sparse, we need to relax the neighboring point distance for the algorithm to obtain proper extreme points. Furthermore, each red line shown in the figure threads an association of openings inferred by the proposed flexible rules. For instance, the first row of openings belongs to one group, and the third column of openings forms another. This illustrates that our flexible alignment rules can reconstruct the layout and structure of wall openings while simultaneously retaining the spatial relationships of the objects.
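The parameter derivation above can be summarized in a small helper. This is a minimal sketch; the function name and dictionary layout are illustrative, while the rules dt = 2 × MPS, d = MPS and α = 6 × MPS come from the text:

```python
def derive_parameters(mps):
    """Derive the three main thresholds from the mean point spacing (MPS),
    following the rules stated in the text."""
    return {
        "dt": 2.0 * mps,     # clustering distance threshold for rule detection
        "d": mps,            # distance threshold for the opening association
        "alpha": 6.0 * mps,  # alpha value for the alpha-shape boundary extraction
    }
```

For example, with MPS = 0.03 m this yields α = 0.18 m, the empirical value used for the test samples in this work.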
The facade openings in static terrestrial LiDAR point clouds in Figure 9 are successfully reconstructed. Please note that the machine infers associations and accordingly updates the geometry of openings (see the upper left of Figure 9), whereas the occlusion in the raw data influences the final alignment (as shown in the lower right of Figure 9).
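The association criterion described above (at least two homologous alignments between two openings) can be sketched as a grouping step. This is a hedged sketch: the representation of an opening as an axis-aligned rectangle (left, bottom, right, top) on the wall plane and the union-find grouping are our assumptions, while the "at least two homologous alignments" criterion and the role of the distance threshold d come from the text:

```python
def associate_openings(openings, tol):
    """Group openings that share at least two homologous edge alignments.

    openings: list of (left, bottom, right, top) rectangles on the wall plane.
    tol: plays the role of the association distance threshold d."""
    n = len(openings)

    def aligned(a, b):
        # count homologous edges (left/bottom/right/top) matching within tol
        return sum(abs(a[k] - b[k]) <= tol for k in range(4)) >= 2

    # union-find over pairwise associations
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if aligned(openings[i], openings[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

For instance, two windows in the same row share top and bottom alignment and therefore fall into one group, while an isolated opening forms a group of its own.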

Indoor Scenes
Our pipeline is also applicable to indoor scenes. We demonstrate this on the ISPRS datasets shown in Figures 10 and 11. For the sake of clear visual representation, the raw point clouds in the following figures are subsampled. Figure 10 demonstrates the flexibility and adaptability of the proposed inference machine. The approach is applied to every vertical face. Please note that for the indoor datasets we additionally compensate for deficiencies in each segmented wall surface. This includes automatically complementing hole areas, such as the upper and lower boundaries of doors, by region growing from the original data. It ensures the robustness of the detection on arbitrary planar surfaces.
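The hole-compensation step can be sketched as follows. This is a hedged sketch under assumptions: the wall plane is given as n·p = offset with unit normal n, growth is restricted to near-coplanar points connected to the segment, and a naive loop over the cloud is used for clarity; the actual region growing in the pipeline may differ:

```python
import math

def grow_wall_points(seed_pts, cloud, normal, offset, plane_tol, nbr_tol):
    """Complement a segmented wall with nearby coplanar points from the
    original cloud (sketch of the indoor hole-compensation step)."""
    def on_plane(p):
        # signed distance to the plane n . p = offset (unit normal assumed)
        return abs(sum(a * b for a, b in zip(normal, p)) - offset) <= plane_tol

    accepted = list(seed_pts)
    candidates = [p for p in cloud if on_plane(p) and p not in seed_pts]
    changed = True
    while changed:
        changed = False
        remaining = []
        for p in candidates:
            # grow: accept points connected to the current segment
            if any(math.dist(p, q) <= nbr_tol for q in accepted):
                accepted.append(p)
                changed = True
            else:
                remaining.append(p)
        candidates = remaining
    return accepted
```

Points far from the plane (e.g. furniture in front of the wall) and coplanar points not connected to the segment are both rejected, so only the wall surface itself is completed.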
We also test our approach on more sophisticated point clouds. As shown in Figure 11, the approach can automatically detect opening layouts and model them with fine geometry. Figure 11 shows that the TUB2 point cloud has sufficient point density and covers the wall surfaces completely. The homogeneous point density of the walls in the rooms and corridors contributes to the opening reconstruction. We examine three areas (b, c and d) in detail, marked in Figure 11a. The upper part of Figure 11b shows the original point cloud of a corridor wall, and the lower part shows the reconstruction result. One hole structure is incomplete due to missing ground points, resulting in an opening that is not detected. Figure 11c shows a window that is extracted correctly. There are other openings under each window, but they are occluded and their boundaries are incomplete. These areas are either not detected by our approach or are misdetected as small rectangles. Such small detections can be filtered by reasonable thresholds, such as a minimum size and a maximum aspect ratio; empirically, we set them to 0.35 and 3.5, respectively. The point cloud shown in Figure 11d is relatively simple, with only one door in the plane. Therefore, this opening is independent and not associated with other openings. However, occlusions and noisy points remain significant factors that influence the detection and reconstruction.
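The post-filtering step can be sketched as follows. This is a minimal sketch; the function name is illustrative, and we assume the size threshold of 0.35 is an area (in m², consistent with the 0.2 m² minimum size used for the synthetic data) and the aspect ratio is the long-to-short side ratio:

```python
def keep_opening(width, height, min_area=0.35, max_aspect=3.5):
    """Filter out small misdetections by size and aspect-ratio thresholds
    (the default thresholds are the empirical values from the text)."""
    area = width * height
    aspect = max(width, height) / min(width, height)
    return area >= min_area and aspect <= max_aspect
```

A regular 1.0 × 1.2 m window passes, while a tiny 0.3 × 0.4 m rectangle or a very elongated 0.4 × 2.0 m sliver is rejected.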

Quantitative Evaluation
To quantitatively evaluate the aligned openings, we use the following two methods together with manually generated benchmark datasets. We manually label openings on the detected surfaces. To maintain the consistency of the benchmark openings, we ensure that the edges of the openings coincide with the defined horizontal and vertical directions.
We evaluate the detection outcomes of the data sets, i.e., the reconstructed openings, against manually labeled reference data with respect to Completeness (Comp), Correctness (Corr) and F1 score [41,42], which are listed in Table 2. The metrics are defined in Equations (8)-(10):

Comp = |TP| / (|TP| + |FN|), (8)

Corr = |TP| / (|TP| + |FP|), (9)

F1 = 2 · Comp · Corr / (Comp + Corr), (10)

where a True Positive (TP) stands for a detected object that corresponds to an object in the reference. An entity corresponding to the background in the reference is classified as a False Negative (FN). In addition, a False Positive (FP) describes a detection that does not correspond to an object in the reference [43]. |·| denotes the number of entities assigned to the above classes.

We evaluate the accuracy of an opening by measuring the root-mean-square error (RMSE) of the distances between the corresponding four edges of the reconstructed opening and the reference model (as shown in Figure 12). Since the reconstructed opening is coplanar with the reference model and the corresponding edges are parallel to each other, we measure the perpendicular distance between corresponding edges. To evaluate the overall accuracy D of the entire scene, we measure the RMSE over all reconstructed models. D is expressed as Equation (11):

D = sqrt( (1/(4n)) Σ_i Σ_j d_{i,j}² ), (11)

The evaluation is divided into two parts. We first evaluate the overall accuracy of the preliminary openings (denoted D (bef.) in Table 3) and then that of the openings aligned through spatial relationships (denoted D (aft.) in Table 3). The improvement rate is formulated as rate = 1 − D_aft / D_bef. The MPS of each dataset, the number of openings (NoO) and the improvement rate are listed in Table 3.
Here, n is the total number of reconstructed openings, i ∈ {1, ..., n} denotes the label of an opening, and d_{i,j} denotes the distance between the j-th pair of corresponding edges of opening i.

As shown in Table 2, we can precisely detect openings with distinctive layouts on 3D wall surfaces, regardless of the configuration. The proposed method obtains good results on the synthetic datasets, denoted Syn. (a)-(d) in Tables 2 and 3, which are relatively simple cases. The dataset Syn. (b) has inhomogeneous point density and noisy points. They affect the initial opening detection, especially for the narrow openings on both sides (detailed limitations are discussed in the following section). Therefore, such openings are likely to be reconstructed with incorrect geometry. In Syn. (b), four openings are erroneously detected out of the total number of 62. These unreasonable openings are filtered out by a loose criterion: we set the minimum object size to 0.2 m² for the synthetic datasets. Real data often contains occlusions, noise and inhomogeneous point density, which affect the detection and reconstruction of openings. For the outdoor data KFH, with a moderate level of clutter, 14 out of 16 openings are accurately extracted. Due to the sparse density of the point cloud and the influence of noise, two windows are not extracted correctly. Although the clutter level of the OKA data is high, 168 detections are consistent with the reference openings, and 7 are detected incorrectly. For the indoor datasets, TUB1 and TUB2, with relatively simple scenes, record F1 scores higher than 0.85. Only one of the 31 openings in TUB1 is not detected. Moreover, 60 openings in the TUB2 data are correctly extracted, and there are 11 redundant detections caused by the sparse point density. Home furnishings in the UVigo data (F1 score: 0.76) block part of the glass windows, so 10 of the 30 doors and windows cannot be reconstructed correctly. The UoM data (F1 score: 0.7) also contains noise (such as people and other occlusions).
However, 8 of the 15 openings we reconstructed are consistent with the reference. The most complex indoor data is the Fire Brigade, denoted FB. in Tables 2 and 3. A large amount of furniture and curtain walls increases the difficulty of opening extraction and reconstruction. We count all the openings (253 in total) in the scene, including the first and second floors. There are 171 true positives, 28 false negatives and 16 false positives. Therefore, its F1 score is 0.89.
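These counts plug directly into Equations (8)-(10); a minimal sketch (the overall-accuracy helper assumes Equation (11) is the RMSE over the 4n edge distances, as the text describes):

```python
import math

def detection_metrics(tp, fn, fp):
    """Completeness, Correctness and F1 score from entity counts
    (Equations (8)-(10); standard definitions)."""
    comp = tp / (tp + fn)
    corr = tp / (tp + fp)
    f1 = 2 * comp * corr / (comp + corr)
    return comp, corr, f1

def overall_accuracy(edge_dists):
    """Overall accuracy D (Equation (11)): RMSE over the four edge
    distances of each of the n reconstructed openings."""
    flat = [d for opening in edge_dists for d in opening]
    return math.sqrt(sum(d * d for d in flat) / len(flat))
```

With the Fire Brigade counts (TP = 171, FN = 28, FP = 16), this reproduces the reported F1 score of 0.89.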
In Table 3, we analyze the overall accuracy of the aligned openings after the inference of spatial relationships and compare it to that of the initial model. For the synthetic data, the accuracy of the aligned model is improved by over 7% relative to the accuracy of the initial openings. Among them, the dataset Syn. (c) records the highest improvement: with an average point spacing of 0.03 m, its overall accuracy is improved by 20%, from 0.075 m to 0.060 m. However, the largest number of adjusted openings is recorded for the dataset Syn. (b). Since the number of reconstructed openings is relatively large, the accuracy tends to stabilize on average; its improvement rate is 11%. Similarly, in the real data, the datasets OKA (204 openings in total) and Fire Brigade (253 openings in total), with a high level of clutter, obtain a 2% improvement in overall accuracy. Although a small number of openings might receive large adjustments (capped at 0.5 m) after the determination of the spatial relationships, reasonable adjustments do not result in a large change in the overall accuracy when the sample size is large. The accuracies of the OKA and Fire Brigade datasets are improved by 0.002 m and 0.003 m, respectively. When the number of samples is small, however, even a few geometric model changes may be reflected in the overall accuracy: the accuracy of the KFH dataset is improved the most, by 29% (from 0.062 m to 0.044 m). For the indoor data TUB1, we observe that the accuracy of the openings before and after the inference process is similar, both being 0.023 m, with an average point spacing of 0.005 m. This shows that our approach retains the original model structure after inferring the spatial relationships of openings when the original data quality is good and the scene is simple. The mean point spacings of TUB2, UVigo and UoM are similar, being 0.008 m, 0.010 m and 0.007 m, respectively. Their overall accuracies are 0.021 m, 0.033 m and 0.030 m, improvements of 19%, 13% and 14% over their initial accuracies.
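The improvement rates reported here follow directly from the definition rate = 1 − D_aft/D_bef; a one-line sketch:

```python
def improvement_rate(d_before, d_after):
    """Improvement rate = 1 - D_aft / D_bef, as defined in the text."""
    return 1.0 - d_after / d_before
```

For example, Syn. (c) (0.075 m to 0.060 m) gives 20%, and KFH (0.062 m to 0.044 m) gives 29%, matching the values above.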

Generality
The experiments in the previous section demonstrate that our method, based on flexible rules, can accurately extract openings. It can extract openings with both grid and non-grid layouts, and compared to existing approaches, it adapts better to general situations. Take the sliding window method [8] as an example. This is a typical method based on grid assumptions. It uses horizontal and vertical sliding windows to analyze the point density of the wall, and then divides the space and extracts the openings. This is fairly effective for openings arranged in rows and columns. For non-gridded layouts, however, this strict constraint causes over-segmentation. As shown in Figure 13a, the windows in the middle column are divided into small cells. These small cells need to be analyzed and fused to restore the original model. The method proposed in this paper directly extracts independent openings based on the geometric information of the data, infers their spatial relationships, and achieves higher model accuracy. Therefore, our method adapts to more general situations than the state of the art.

Occlusions and Limitations
In this section, we discuss the impact of several different types of occlusion on the reconstruction results. In real point cloud data, windows do not always maintain a clear hole structure. Decorations such as curtains cause the presence of points on windows (see Figure 14). These points change the original geometry of the opening in the point cloud, which leads to an inaccurately reconstructed model. If a window is completely covered by points, it can hardly be detected. However, we find that an opening can still be correctly reconstructed under light occlusion, even when the border points are slightly missing (as shown in the lower left of Figure 14). Large furniture and other objects protruding from the wall often appear in indoor spaces. Due to the occlusion by these objects, points are missing on the wall and ground surfaces. Our approach does not detect such areas, because the hole is incomplete on the detected plane. If there is a small object such as a cabinet on the wall, the corresponding hole with a complete border is likely to be detected as an opening. Similarly, when using an MMS for the acquisition of outdoor data, objects such as building facades behind trees may be occluded (as shown in Figure 8). In such void areas, it is hard to detect openings. Solutions for opening restoration and simulation are not considered in this work. Figure 15 shows a section of the indoor dataset UVigo. We can see that the structure of the window behind the tree is complete, and the opening can be accurately reconstructed. This might be because the MMS acquires points from multiple locations and complements the occluded parts during the stitching process. The point density also affects the accuracy of the openings. As shown in Figure 16a, the point density at the intersection of the openings and the perpendicular wall is sparse, which causes a deviation of the reconstructed openings from the reference model.
As shown in Figure 16b, the sparse point density between two adjacent openings affects the accuracy of the model. We believe that our method can flexibly and accurately reconstruct openings when the component structure is complete. However, due to the various types of occlusion, the danger of misdetection also exists.

Figure 14. A complex example selected from the high clutter dataset (Fire Brigade). From left to right: the sample of wall points; reconstructed openings (blue frames) with associations (red lines); the comparison to manually labeled openings (red frames).

Conclusions and Future Work
Openings such as windows and doors are essential components of building facades as well as interior walls, yet their reconstruction from point clouds remains a challenge. Starting with the simple idea of detecting holes as openings, we introduce a flexible approach for the reconstruction of openings of arbitrary size and location. Generally, openings are not isolated but sit within a global layout representing the design of a wall. We therefore propose a flexible inference machine that associates detections while updating the corresponding geometry with alignment rules. The structure of the wall layout can be described by the connections of opening associations. It is used in the geometric adjustment of reconstructed openings, which mitigates the generality problem caused by strict rules. The overall accuracy of the updated openings achieves 0.059 m on average. In our view, the idea of an inference machine is applicable not only to point clouds but also to images, in both indoor and outdoor scenes.
Although our focus is on using flexible rules for a global alignment of openings, the initial detection may suffer from several challenges. Occlusions, inhomogeneous point density due to the scanning range, spurious objects, outliers and noise in the acquired data all influence the detection of openings. In consequence, the inference of associations might be inadequate. Moreover, the method depends on the planarity of surfaces, as we assume in this work that walls and facades are planar. For a clear demonstration of the idea of our resilient inference machine, we currently exploit flexible rules based on straight-line features in two directions. This shows clear potential for adopting more general rules, such as curves and diagonals, towards a robust understanding of structure and spatial relationships.
Our approach infers the spatial relationships between openings and refines their geometry accordingly. First, the pipeline starts with opening detection and reconstruction from 3D point clouds. We demonstrate that it is adaptable not only to building facades but also to walls in interior spaces. We envision that our approach can be exploited in the fast updating of large-scale city models with detailed geometric elements on wall surfaces for various applications in the scheme of smart cities, GeoBIM and digital twins. For indoor modelling, this offers enriched and precise information for scene analysis. Next, the flexible inference machine associates the openings and rebuilds the relations between groups. In our view, the group information of openings could be used for procedural modelling and interactive editing. Moreover, it could be used for an analysis of exterior and interior wall surfaces that contributes to investigating a building's fundamental energy performance. Finally, the current approach could be significantly upgraded if the geometry updated by inference were used to iteratively refine the α value and the alignment rules until an optimum is achieved.