RegARD: Symmetry-Based Coarse Registration of Smartphone’s Colorful Point Clouds with CAD Drawings for Low-Cost Digital Twin Buildings

Coarse registration of 3D point clouds plays an indispensable role for parametric, semantically rich, and realistic digital twin buildings (DTBs) in the practice of GIScience, manufacturing, robotics, architecture, engineering, and construction. However, the existing methods have prominently been challenged by (i) the high cost of data collection for numerous existing buildings and (ii) the computational complexity from self-similar layout patterns. This paper studies the registration of two low-cost data sets, i.e., colorful 3D point clouds captured by smartphones and 2D CAD drawings, for resolving the first challenge. We propose a novel method named ‘Registration based on Architectural Reflection Detection’ (RegARD) for transforming the self-symmetries in the second challenge from a barrier of coarse registration to a facilitator. First, RegARD detects the innate architectural reflection symmetries to constrain the rotations and reduce degrees of freedom. Then, a nonlinear optimization formulation together with advanced optimization algorithms can overcome the second challenge. As a result, high-quality coarse registration and subsequent low-cost DTBs can be created with semantic components and realistic appearances. Experiments showed that the proposed method outperformed existing methods considerably in both effectiveness and efficiency, i.e., 49.88% less error and 73.13% less time, on average. The RegARD presented in this paper first contributes to coarse registration theories and exploitation of symmetries and textures in 3D point clouds and 2D CAD drawings. For practitioners in the industries, RegARD offers a new automatic solution to utilize ubiquitous smartphone sensors for massive low-cost DTBs.


Introduction
Digital twin buildings (DTBs), as real-time 'as-is' 3D building models, have attracted great attention from both industry and academia due to their promised applications in GIScience [1], manufacturing, robotics, mapping, the architecture, engineering, construction, and operation (AECO) industries [2][3][4][5], and heritage documentation [6,7]. A DTB is a virtual representation of a physical building "across its lifecycle, using real-time data to enable understanding, learning, and reasoning" [8,9]. Three long-standing requirements on the roadmap of DTB, i.e., parametric geometry, rich semantics, and realistic appearances, are thus indispensable for fulfilling the functions of "understanding, learning, and reasoning" [10][11][12][13]. For example, in the design and operation phases, digital models with parametric geometry, rich semantics, and realistic appearances bring practitioners a more comprehensive and thorough understanding of the built environment, as well as solid data support for automation and analytics [4,8].
The coarse registration problem is typically solved by finding the optimal transformation, i.e., rotation, translation, and scaling, which often has high degrees of freedom (DoFs). Many existing coarse registration methods thus rely on initial alignments that are derived from unique local features and lie close to the global minimum, in order to escape from the local optima in such high-DoF problems [20]. However, the self-similar layout patterns in building interiors make it problematic for conventional methods to find promising initial alignments [15,21]. In summary, the high DoFs and self-similarities of indoor layouts challenge the local feature-based 'shortcuts' in conventional registration methods. Inspired by symmetric cross-sections as robust global, rather than local, features [5], we propose a novel coarse registration method, named Registration based on Architectural Reflection Detection (RegARD), that transforms the self-symmetries in the challenge from a barrier to a facilitator. RegARD first detects the symmetry axes in point clouds and CAD drawings to reduce the DoFs by constraining the rotation to be parallel or perpendicular to the axes. Then, RegARD employs advanced nonlinear optimization algorithms, such as CMAES [22], DIRECT [23], and Nelder-Mead [24], to obtain the optimum registration. In other words, RegARD splits the high-DoF optimization of rotation, translation, and scaling into a lower-DoF subproblem on rotation and another lower-DoF subproblem on translation and scaling. As a result, both the efficiency and accuracy of coarse registration can be improved notably. Based on RegARD's results, DTBs can be generated in the Industry Foundation Classes (IFC) format automatically with realistic textures mapped from point clouds. Overall, RegARD can facilitate the creation of DTBs in several aspects: parametric geometry, rich semantics, realistic appearances, and processing time.
The remainder of this paper is organized as follows. Section 2 summarizes the point cloud registration methods and the point cloud processing with architectural regularities. Section 3 presents the details of RegARD. Section 4 reports the experimental registration results on our large-scale dataset of seven stories and presents the generated IFC models as the resulting DTBs. We then discuss our digital twinning solution and the RegARD method in Section 5 and conclude the study in Section 6.

Point Cloud Registration
Pairwise point set registration is a fundamental yet challenging task in geometry processing and digital twinning. The registration problem is often decomposed into two sub-problems: (1) feature detection and (2) correspondence and transformation estimation. In the literature, many related studies focusing on the second sub-problem are termed 'fine' registration. A fine registration assumes high-quality initial correspondences with correct and accurate features. Thus, the main task of fine registration is improving the initial transformation in a closed form with given correspondences. However, the optimal correspondences are ultimately determined by the optimal transformation; there exists a 'chicken-egg' dilemma between determining high-quality correspondences and the optimal transformation. Therefore, 'coarse' registration emerged for resolving the dilemma by solving both sub-problems.
For resolving the first sub-problem of coarse registration, hand-crafted features with explicit meanings of point sets were proposed first to detect and describe key points for establishing correspondences. Examples of explicit features are Spin Image [25], Point Signature [26], and FPFH [27]. However, it is extremely difficult and burdensome to craft robust and comprehensive local features for the wide diversity and deficiency of points [28]. Recently, implicit features have been exploited extensively for registration using deep neural networks [28][29][30][31]. For example, FCGF integrates a fully convolutional network, 3D sparse representation, and contrastive loss, and achieved better feature detection than its hand-crafted counterparts [29].
For resolving the second sub-problem, the Iterative Closest Point (ICP) [32,33] is a classical mechanism. Yet, ICP can easily become stuck in local minima without high-quality initial alignments [34]. Variants of ICP were thus proposed for escaping from the local minima; examples are CPD [35], Go-ICP [36], and GMMTree [37]. Besides, researchers also formulated the whole registration problem into end-to-end formulations, e.g., supervised learning pipelines and nonlinear mathematical programming [38]. Representative studies include the deep learning versions of ICP, of which one is coined as Deep Iterative Point [39] and another is Deep Global Registration [40].
However, the coarse registration of point sets is not fully addressed for DTBs, especially in complex and repetitive indoor settings. There are two possible reasons for the limited effectiveness. First, the common self-similarity at multiple scales of building interiors hinders traditional feature detection [15]. Second, building interiors' 2D and 3D manifolds without complex inner structures also create massive local optima "traps" for correspondence and transformation optimization.

Point Cloud Processing with Architectural Regularities
Researchers in remote sensing, construction, and computer vision have utilized architectural regularities in processing point clouds for modeling 3D buildings. For example, the vertical and horizontal planes in urban and indoor spaces can be useful a priori primitives for registration and reconstruction. Xu et al. [41] and Bueno et al. [15] introduced a feature descriptor, i.e., four-plane congruent set (4PNCS), for urban and building scenarios, respectively; 4PNCS significantly reduced the number of correspondences and improved the matching efficiency. Similarly, plane sets representing the main structures of the built environment are also applied in determining the structure-level correspondences robustly and efficiently [42]. Besides, Zolanvari et al. [43] proposed an improved slicing model (ISM) to segment facades based on some geometric characteristics of planes. Polewski and Yao [44] used the intersections of adjacent planar surfaces as line correspondences and the roof symmetry axes to co-register multimodal data such as LIDAR point clouds and digital surface models. Chen et al. [45] regularized rooftop elements from LIDAR points onto 2.5D block-building models and validated them in a high-density high-rise area.
Moreover, repetition and similarity are also prevalent in heuristic rules for urban and building reconstruction. For example, Wang et al. [46] incorporated the local symmetries into a nonlinear least-squares optimization for reconstructing the contours of building roofs. Ceylan et al. [47] recognized the windows on facades by the repetitions in photogrammetric point clouds. Cheng et al. [48] applied rotational symmetry and slicing to register buildings such as towers. Although the idea is very inspiring and similar to RegARD, the targeted data of [48] is building exteriors with rotational symmetry, while ours are building interiors with reflection symmetry and different self-similarities. For detecting the reflection symmetry, Xue et al. [49] integrated state-of-the-art derivative-free optimization (DFO) algorithms from applied mathematics and computer science for detecting the architectural reflection effectively and efficiently. Xue et al. [5,50] demonstrated that the reflection detection of points successfully improved indoor DTB and cross-section features for clustering unknown objects in LIDAR clouds for a digital twin city. Therefore, it will be intriguing to apply the architectural reflection symmetry to the challenging registration problem of large-scale as-built 3D scans with as-designed 2D drawings.

Digital Twinning of Building Interiors
LIDAR and photogrammetric point clouds have been widely used for digital twins of building interiors. Three groups of interior components received the most attention: (i) the physical building components and structures, e.g., walls and slabs, (ii) indoor spaces and partitions, e.g., stories and rooms, and (iii) indoor topology. For the physical components, plane detection and curved surface segmentation usually precede the digital twinning for candidate surfaces [51][52][53]. For the volumes, Bassier and Vergauwen [51] estimated walls' axes in different shapes; Nikoohemat et al. [53] designed heuristic rules to label plane segments based on the adjacency graph; Ochmann et al. [54] processed point clouds into 3D cell complexes and labeled them, as rooms and structures, using integer programming; Wang et al. [52] generated a semantic wireframe of permanent structures based on the boundaries of classified planes. For spaces and partitions, stories are usually extracted by z-coordinate clustering of the whole point clouds or classified planes [51,55]. Moreover, Ochmann et al. [54] cast rays onto the wall surfaces from unoccupied locations for detecting wall loops and rooms; Murali et al. [56] detected room cuboids from a wall graph; Jung et al. [55] projected a story's point clouds to a 2D plane and segmented the rooms' footprints.
For indoor topology, Bassier and Vergauwen [51] reconstructed four types of connections by heuristic rules; Wang et al. [52] regularized the geometry of planar boundaries and repaired incorrect connections by a conditional Generative Adversarial Network.
Apart from DTBs built with high-cost LIDAR systems, floor plans can serve as an affordable data source for an alternative solution. For example, vectorized floor plans contain semantic as-designed indoor layouts, which can be extracted via computer vision and optimization methods [57,58]. The as-designed indoor layouts can facilitate processing scanned point clouds. Wijmans and Furukawa [59] presented a Markov Random Field inference formulation for the scan placement problem over a given floor plan image; the formulation can guide the registration of multiple indoor scan fragments with as-designed data. In summary, it is promising to triangulate the mainstream as-built point clouds with as-designed floor plans in terms of reducing costs and improving the correctness and accuracy of the digital twinning procedures.
Figure 2 shows a flowchart of our automatic DTB creation pipeline. The inputs included a point cloud captured by a smartphone and the corresponding 2D CAD drawing of the floor plan. The pipeline output the registered point clouds together with the 3D semantic models generated from the CAD drawings. Finally, we could create DTBs with textures mapped from the registered point clouds.

[Figure 2. Flowchart of the automatic DTB creation pipeline: Preprocess (§3.2), comprising extraction of vertical structures and story segmentation, followed by Step 1 and Step 2 of RegARD.]
From the inputs to the outputs, there were two processing stages. The first stage (Section 3.2) of this pipeline was the preprocessing of the inputs: separating the point clouds into stories, extracting the semantic and vertical structures from CAD drawings with a method named Plan2Polygon, and finally sampling the 2D points from the processed point clouds and CAD drawings of each story, respectively. The second stage (Section 3.3) was the crucial registration, i.e., RegARD. The point cloud of each story here was the source geometry, while its paired CAD drawing was the target geometry. The registration transformed the source to match the target geometry. The first step of RegARD was the Architectural Reflection Detection (ARD) part. The reflection symmetry axes of the sampled 2D points from the point clouds and CAD drawings were detected. Next, an initial transformation was applied to align the symmetry axes of the paired point cloud and CAD drawing of the same story. By doing so, RegARD constrained the rotation to a set $\{k\pi/2 + \theta_{ard} \mid k = 0, 1, 2, 3\}$ of four values, down from $[0, 2\pi)$, which reduced one DoF because $4 \ll |[0, 2\pi)| = 2^{\aleph_0}$, where $\theta_{ard}$ denotes the result of ARD and $|\cdot|$ indicates set cardinality. Next, RegARD solved the remaining four DoFs, i.e., translation and scaling along the x and y axes, by DFO algorithms and output the final transformation parameters to register the story point cloud with its corresponding CAD drawing. Besides, we also briefly introduce the process to generate textured DTBs (Section 3.4) after the registration.

Preprocessing
This stage extracted and sampled the vertical structures from CAD drawings for the coarse registration. The structure polygons and openings were parsed out from the drawings, while stories were separated from the original point clouds. Finally, the vertical structures from both the drawings and the point clouds were sampled with a range of proper densities for registration. Note that the 3D scans and the 2D CAD drawings were sampled into 2D points to reduce the computational cost.

Parsing CAD Drawings (Plan2Polygon)
The preprocessing employed a new in-house developed schema, called Plan2Polygon, for parsing CAD drawings: using polygon-based rules to extract vertical structures, i.e., walls, pillars, and windows. A preliminary pipeline of the preprocessing schema consisted of the following three parts: (i) CAD filtering, (ii) structure polygonization, and (iii) openings simplification.
First, unnecessary elements and semantics were removed from CAD drawings (Figure 3a). Annotations of names, dimensions, and furniture, isolated doors, and other unnecessary elements could be quickly filtered out using CAD software.
Then, the building interior was polygonized based on the structure lines. We first triangulated the planes with the structure lines as constraint segments (Figure 3b). These triangles were merged into polygons by a region-growing process [60]: pick one triangle as a seed and expand to its neighboring triangles until meeting the structure lines (Figure 3c). Next, the polygons could be further classified into objects, such as walls, windows, and staircases. This study focused on the vertical structures, especially walls, for the following texturing process. Therefore, the approach extracted these 'thin' vertical structures by a thickness index:

$$I_{thickness} = \frac{a(b(g, t))}{a(g)}, \quad (1)$$

where $a(g)$ is the area function of a given polygon $g$ whose external boundary is constructed by an ordered list of $n$ points $(p_1, p_2, \ldots, p_n)$. Note that $g$ may contain holes, i.e., internal boundaries that are also constructed by ordered lists of points. $b(g, t)$ is a buffer function which offsets the boundary (or boundaries) of a given polygon $g$ to its interior by a distance of $t$. Polygons processed here are assumed not to be self-intersecting. The longer and narrower the polygon is, the smaller $I_{thickness}$ is. Therefore, we could extract the polygons with an $I_{thickness}$ less than a given threshold as vertical structures (Figure 3d).
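The thickness filter can be illustrated with a minimal Python sketch. It assumes that the index is the ratio of the inward-buffered area to the original area, which matches the description above, and it handles only hole-free axis-aligned rectangles so that the buffer has a closed form. The function names are hypothetical; a real implementation would rely on a polygon library for general shapes. The default values of t and the threshold follow those reported in the implementation details.

```python
def rect_area(width, length):
    """Area a(g) of an axis-aligned rectangle; zero if a side is non-positive."""
    if width <= 0 or length <= 0:
        return 0.0
    return width * length


def thickness_index(width, length, t=0.01):
    """Ratio of the inward-buffered area a(b(g, t)) to the original area a(g).

    Assumption: g is a hole-free rectangle, so buffering inward by t
    shortens each side by 2 * t.
    """
    original = rect_area(width, length)
    if original == 0.0:
        raise ValueError("degenerate polygon")
    buffered = rect_area(width - 2 * t, length - 2 * t)
    return buffered / original


def is_vertical_structure(width, length, t=0.01, threshold=0.95):
    """True when the footprint is 'thin' enough to be treated as a wall."""
    return thickness_index(width, length, t) < threshold


wall = thickness_index(0.2, 6.0)   # 0.2 m thick, 6 m long wall footprint
room = thickness_index(5.0, 6.0)   # 5 m x 6 m room footprint
```

In this toy case the wall footprint yields an index of about 0.897 and is kept, while the room footprint yields about 0.993 and is rejected.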
The values of $t$ and the threshold of $I_{thickness}$ in our experiments will be introduced in Section 4.2. This filter was simple yet effective for extracting the vertical structures of our test data, which contained very few square wall parts or columns. Furthermore, one could easily extend Plan2Polygon with more advanced classifiers to recognize square wall parts and columns. Openings, such as doors in this study, were always drawn as curves and/or multiple segments in CAD drawings. We simplified them into 2D segments for creating their 3D blocks in the following stages. This simplification was adopted from [58]. A convex hull was generated for each door symbol and intersected with its adjacent vertical structure polygons. The shortest path connecting the two intersections was the final simplified segment for each door symbol (Figure 3).

Segmentation of Point Clouds by Stories
The preprocessing also extracted each story in the as-built point cloud $P$, using a process similar to [61]. We first calculated the distribution histogram of the heights, i.e., the z-components of all points in $P$, as shown in Figure 4. We set the bin interval of the histogram as 0.3 m according to our experiments. Next, the peaks of the histogram were detected as floors and ceilings. To remove some noisy peaks caused by facilities or staircases, we also set two height-difference thresholds for consecutive peaks, i.e., the wall height of a story and the height between a ceiling and the floor of the upper story.
Consequently, the filtered peaks were in a repeated 'floor-ceiling' pattern. Note that this process was based on the assumption that the z-components of points only concentrated on the floors and ceilings, which would be problematic when there were stepped floors and other large z-components clusters in the middle of the stories. However, this assumption holds in many standard buildings, including the test data in this study.
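The histogram-based story segmentation can be sketched as follows. This is a simplified illustration rather than the implementation used in the study: the peak rule, the `min_count` parameter, and the wall-height range are assumptions, and only the first of the two height-difference thresholds (the wall height) is applied.

```python
from collections import Counter


def height_histogram(z_values, bin_size=0.3):
    """Count points per height bin; the key is the bin index floor(z / bin_size)."""
    return Counter(int(z // bin_size) for z in z_values)


def detect_peaks(hist, min_count):
    """Return bin indices that are local maxima with at least min_count points."""
    peaks = []
    for b, count in sorted(hist.items()):
        if count < min_count:
            continue
        if count >= hist.get(b - 1, 0) and count > hist.get(b + 1, 0):
            peaks.append(b)
    return peaks


def pair_floor_ceiling(peaks, bin_size=0.3, min_wall=2.4, max_wall=4.5):
    """Greedily pair consecutive peaks whose gap looks like a wall height (m)."""
    stories = []
    i = 0
    while i + 1 < len(peaks):
        gap = (peaks[i + 1] - peaks[i]) * bin_size
        if min_wall <= gap <= max_wall:
            stories.append((peaks[i] * bin_size, peaks[i + 1] * bin_size))
            i += 2      # the next peak starts a new floor
        else:
            i += 1      # treat peaks[i] as noise (e.g., staircase or facility)
    return stories


# Synthetic heights: two stories plus a few noisy points at mid-height.
z = [0.1] * 100 + [1.6] * 5 + [3.1] * 100 + [3.7] * 100 + [6.7] * 100
stories = pair_floor_ceiling(detect_peaks(height_histogram(z), min_count=50))
```

Each returned pair gives the lower edges of the floor and ceiling bins, so the resolution of the story boundaries is limited by the bin interval.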

Sampling 2D Points
Finally, we sampled points from the footprints of the parsed structures of CAD drawings and the story-separated point clouds. To sample CAD structures, we extracted points along the boundaries of structure polygons or opening lines. To sample point clouds, we removed the floors and ceilings of the story-separated point clouds, projected the middle part of a story onto a 2D horizontal plane, then randomly sampled the projected points.
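A minimal sketch of the two sampling routines is given below. The function names, the fixed random seed, and the per-point Bernoulli subsampling are assumptions made for illustration; the default interval (10 cm) and sampling rate (0.001) follow the values used in the benchmarking experiments.

```python
import math
import random


def sample_boundary(vertices, interval=0.1):
    """Sample points roughly every `interval` metres along a closed polygon."""
    samples = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        steps = max(1, int(round(length / interval)))
        for k in range(steps):      # the end vertex is emitted by the next edge
            f = k / steps
            samples.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return samples


def sample_projected_cloud(points_3d, z_min, z_max, rate=0.001, seed=0):
    """Keep points between floor and ceiling, drop z, then subsample randomly."""
    rng = random.Random(seed)
    middle = [(x, y) for x, y, z in points_3d if z_min < z < z_max]
    return [p for p in middle if rng.random() < rate]


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
boundary_points = sample_boundary(square)   # 10 samples per 1 m edge
```

Dropping the z-component after removing the floor and ceiling bands is what turns the story point cloud into a 2D footprint comparable with the CAD samples.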

The Proposed RegARD Method
The pseudo-code of the proposed RegARD is described in Figure 2. RegARD solved the coarse registration in two manageable steps. The first step of RegARD was the architectural reflection detection (ARD) of both the source and destination point clouds, e.g., C and D, as shown in Figure 2. If both symmetry axes were found, C and D were aligned to the axes to constrain the search space of the transformation to a four-DoF problem. Otherwise, RegARD treated the coarse registration as a five-DoF problem, which had an extra DoF for the relative rotation. The second step of RegARD optimized the transformation iteratively for the four-/five-DoF problem.

Symmetry as a Global Feature
Registering point clouds to CAD drawings was challenging due to the omnipresent self-similarity of building interiors and the computational complexity of solving the transformation. Figure 5 shows an example of the Root-Mean-Square Distance (RMSD) curve of registration with different rotations, while the translation and scaling DoFs were 'frozen' at the fittest values.
The curve was 'rugged.' In other words, many local minima of rotations could trap the conventional methods. However, the rugged RMSD curve in Figure 5 also inspired the idea of using the architectural reflection symmetry as a global feature. That is, the minimal RMSD was associated with the optimum rotation that moves the reflection symmetry axes of the source point cloud to the destination. Therefore, RegARD detected the reflection symmetry axes of both the source and target geometry. By doing so, the fittest rotation could be almost solved, and only four DoFs were left for the optimization to solve, which significantly eased the registration optimization.

DoFs in RegARD
Different types of symmetry have different parameterizations [49,62]. This study focuses on the detection of reflection symmetry, one of the most fundamental layout patterns in architecture for aesthetic and practical reasons. For a point cloud $C$, the reflection symmetry axis $s$ can be parameterized by $(r_s, \theta_s)$, $r_s \in \mathbb{R}^+ \cup \{0\}$, $\theta_s \in [0, 2\pi)$, as shown in Figure 2, where $r_s$ is the distance from the origin $O(0, 0)$ to the symmetry axis and $\theta_s$ is the angle from the positive x-axis direction to the line perpendicular to $s$ through $O$. The polar function of $s$ can be written as $r = r_s \sec(\theta - \theta_s)$, where $(r, \theta)$ are the polar coordinates of points on $s$. The polar coordinate $(r, \theta)$ can also be converted into a Cartesian coordinate as $(r\cos\theta, r\sin\theta)$. Based on this parameterization, the reflection symmetry transformation $T_s$ applied on a 2D point, $p' = T_s\,p$, is:

$$T_s = \begin{bmatrix} -\cos 2\theta_s & -\sin 2\theta_s & 2 r_s \cos\theta_s \\ -\sin 2\theta_s & \cos 2\theta_s & 2 r_s \sin\theta_s \\ 0 & 0 & 1 \end{bmatrix}, \quad (2)$$

where $p'$ and $p$ are the 3D homogeneous coordinates, i.e., $(x', y', 1)$ and $(x, y, 1)$ respectively, of their 2D counterparts.
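As a minimal sketch, the reflection across an axis parameterized by $(r_s, \theta_s)$ can be built as a 3x3 homogeneous matrix. The matrix follows from reflecting a point $q$ across the line $n \cdot q = r_s$ with unit normal $n = (\cos\theta_s, \sin\theta_s)$, i.e., $q' = q - 2(n \cdot q - r_s)\,n$; the function names are hypothetical.

```python
import math


def reflection_matrix(r_s, theta_s):
    """3x3 homogeneous matrix reflecting 2D points across the axis (r_s, theta_s).

    The axis is the line {q : n . q = r_s} with unit normal
    n = (cos(theta_s), sin(theta_s)).
    """
    c2, s2 = math.cos(2 * theta_s), math.sin(2 * theta_s)
    return [
        [-c2, -s2, 2 * r_s * math.cos(theta_s)],
        [-s2, c2, 2 * r_s * math.sin(theta_s)],
        [0.0, 0.0, 1.0],
    ]


def apply(T, point):
    """Apply a 3x3 homogeneous transform to a 2D point (x, y)."""
    x, y = point
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])


# Reflect (5, 1) across the vertical axis x = 2, i.e., r_s = 2 and theta_s = 0.
mirrored = apply(reflection_matrix(2.0, 0.0), (5.0, 1.0))
```

Two quick sanity checks for any such matrix are that applying it twice returns the original point and that points lying on the axis are left unchanged.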
To register the sampled point cloud to the 2D CAD drawing, the transformation $T_{reg}$ has five DoFs, i.e., the rotation angle $\theta_{reg}$, the 2D scaling factors $s = (s_x, s_y)$, and the 2D translation offsets $t = (t_x, t_y)$, and is applied on a 2D point as $p' = T_{reg}\,p$. Similar to Equation (2), $p'$ and $p$ are 3D homogeneous coordinates. We used individual scaling factors for the x and y axes because different drifting rates along the different axes of 3D scanning by SLAM caused different scaling factors along the x and y axes of the registration. Yet, if the rotation $\theta_{reg}$ was constrained to a constant (or to one of four constants), $T_{reg}$ became a four-DoF problem.

Step 1: Two-DoF Architectural Reflection Detection
Formally, a subset $S$ of $n$-dimensional Euclidean space is called symmetric with respect to a transformation $T$ if $T(S) = S$ [62]. Moreover, since the symmetry of a building may not be rigorous in every detail, the transformation $T$ can be determined by approximate descriptors. For example, to determine the symmetry transformation $T$ from a point cloud, we can maximize the point correspondence rate (PCR) and minimize the RMSD, which are defined as follows [49,63]:

$$d(p, T, C) = \| T p - N(T p, C) \|, \quad (4)$$

$$\mathrm{PCR}(T, C) = \frac{|\{ p \in C \mid d(p, T, C) \le t_c \}|}{|C|}, \quad (5)$$

$$\mathrm{RMSD}(T, C) = \sqrt{\frac{1}{|C|} \sum_{p \in C} d(p, T, C)^2}, \quad (6)$$

where $C = \{p_1, p_2, \ldots, p_n\}$ is a cloud of $n$ points; $N(p, C) \in C$ denotes the nearest point of $p$; $\|p_i - p_j\|$ is the Euclidean distance between two points $p_i$ and $p_j$; $d(p, T, C)$ is the distance of $p$ to $C$ after transformation $T$; $t_c$ is a default distance threshold of correspondence [49]; and $|C|$ is the cardinality, i.e., the number of points, of $C$. Both PCR and RMSD measure the degree of shape preservation after transformation $T$. The difference is that PCR counts the preserved points and RMSD indicates the distance errors.
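The two descriptors can be sketched in a few lines of Python. The sketch assumes that PCR is the fraction of points whose nearest-neighbour distance after the transformation is within $t_c$, and that RMSD is the root mean square of those distances over all points. The brute-force nearest-neighbour search stands in for the spatial index (e.g., an octree) that a practical implementation would use, and the function names are hypothetical.

```python
import math


def nearest_distance(p, cloud):
    """Distance from p to its nearest point N(p, C), by brute force in O(|C|)."""
    return min(math.dist(p, q) for q in cloud)


def pcr_and_rmsd(transform, cloud, t_c):
    """Return (PCR, RMSD) of `cloud` against itself after `transform`.

    `transform` is any callable mapping a 2D point to a 2D point.
    """
    dists = [nearest_distance(transform(p), cloud) for p in cloud]
    pcr = sum(d <= t_c for d in dists) / len(cloud)
    rmsd = math.sqrt(sum(d * d for d in dists) / len(cloud))
    return pcr, rmsd


# A toy cloud that is mirror-symmetric about the y-axis.
cloud = [(-1.0, 0.0), (1.0, 0.0), (-2.0, 1.0), (2.0, 1.0), (0.0, 3.0)]
mirror = lambda p: (-p[0], p[1])        # reflection across x = 0
shift = lambda p: (p[0] + 0.5, p[1])    # a non-symmetry, for contrast
```

For this toy cloud, the mirror preserves every point (PCR of 1 and RMSD of 0), whereas the half-metre shift preserves none within a 0.1 m threshold and gives an RMSD of 0.5.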
In this study, we defined the symmetry detection function as in ODAS [49], i.e., Equation (7). The problem to optimize in Equation (7) had two DoFs. The ARD determined the reflection symmetry axis of a point cloud $C$; the reflection transformation based on this axis should maximize the PCR and minimize the RMSD as defined in Section 3.3.3. Therefore, ODAS minimized the objective in Equation (7) to solve $r_s$ and $\theta_s$. Note that the RMSD was divided by the diagonal length of $C$, denoted as $diag_C$, to avoid the scale impact of the point cloud.
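The two-DoF search can be illustrated with a toy Python sketch. Two simplifications are assumptions and not the ODAS formulation: the objective is taken here as the diagonal-normalised RMSD minus the PCR, and an exhaustive grid over $(r_s, \theta_s)$ replaces the DFO solver (e.g., DIRECT). The grid also assumes that the axis lies within one bounding-box diagonal of the origin. All names and default values are hypothetical.

```python
import math


def reflect(p, r_s, theta_s):
    """Reflect p across the line n . q = r_s, n = (cos theta_s, sin theta_s)."""
    nx, ny = math.cos(theta_s), math.sin(theta_s)
    d = p[0] * nx + p[1] * ny - r_s     # signed distance to the axis
    return (p[0] - 2 * d * nx, p[1] - 2 * d * ny)


def objective(cloud, r_s, theta_s, t_c, diag):
    """ASSUMED objective: normalised RMSD minus PCR (lower is better)."""
    dists = [min(math.dist(reflect(p, r_s, theta_s), q) for q in cloud)
             for p in cloud]
    pcr = sum(d <= t_c for d in dists) / len(cloud)
    rmsd = math.sqrt(sum(d * d for d in dists) / len(cloud))
    return rmsd / diag - pcr


def detect_axis(cloud, t_c=0.05, r_step=0.1, theta_steps=36):
    """Exhaustive grid search over (r_s, theta_s); a stand-in for a DFO solver."""
    xs, ys = [p[0] for p in cloud], [p[1] for p in cloud]
    diag = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    best = None
    for i in range(int(diag / r_step) + 1):
        r_s = i * r_step
        for j in range(theta_steps):
            theta_s = 2 * math.pi * j / theta_steps
            score = objective(cloud, r_s, theta_s, t_c, diag)
            if best is None or score < best[0]:
                best = (score, r_s, theta_s)
    return best[1], best[2]


# Toy footprint that is symmetric about the vertical line x = 1.
cloud = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
r_s, theta_s = detect_axis(cloud)
```

A grid search scales poorly with resolution, which is one reason a DFO solver is preferable for real point clouds.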

Step 2: Four-DoF Transformation Optimization
The optimization problem of the coarse registration was thus simplified to:

$$\arg\min_{\theta_{reg},\, s_x,\, s_y,\, t_x,\, t_y} \mathrm{RMSD}(T_{reg}, C', D), \quad (8)$$

where $C'$ is the source geometry after the alignment of the reflection axes, and $\theta_{reg} = \theta_{ard} + \Delta\theta$ with $\Delta\theta \in \{0, \pi/2, \pi, 3\pi/2\}$. Moreover, $s_x$ and $s_y$ are bounded in $[1/b_s, b_s]$. Because $\Delta\theta$ takes only four discrete values, Equation (8) is equivalent to a four-DoF problem. Note that the registration of RegARD in Equation (8) is a partial-to-full matching. Therefore, most points in $C'$ can find their correspondences in $D$, and the correspondences are the nearest points when the transformation is close to the global minimum. Thus, it is not necessary to build a correspondence set that filters out non-overlapping geometry in this study. RegARD employs a long list of up-to-date DFO solvers for computing the optimum transformation in Equation (8) iteratively. Notably, the algorithms benchmarked in this study included CMAES [22], DIRECT [23], MLSL, MMA, COBYLA, Nelder-Mead [24], SBPLX [64], AUGLAG, and BOBYQA.
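The second step can be sketched as a loop over the four candidate rotations, each followed by a derivative-free refinement of the remaining four parameters. Several parts of this sketch are assumptions: the composition order of $T_{reg}$ (scale, then rotate, then translate), the compass (pattern) search used as a simple stand-in for the benchmarked solvers such as Nelder-Mead, and all function names.

```python
import math


def make_transform(theta, sx, sy, tx, ty):
    """ASSUMED composition: scale, then rotate, then translate a 2D point."""
    c, s = math.cos(theta), math.sin(theta)

    def transform(p):
        x, y = sx * p[0], sy * p[1]
        return (c * x - s * y + tx, s * x + c * y + ty)

    return transform


def rmsd(transform, source, target):
    """Root-mean-square nearest-neighbour distance from source to target."""
    total = sum(min(math.dist(transform(p), q) for q in target) ** 2
                for p in source)
    return math.sqrt(total / len(source))


def compass_search(f, x0, step=0.5, tol=1e-3, max_iter=200):
    """Derivative-free pattern search; accepts a move only if f decreases."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                cand = list(x)
                cand[i] += delta
                fc = f(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step /= 2
            if step < tol:
                break
    return x, fx


def register(source, target, theta_ard=0.0, b_s=1.0):
    """Try theta_ard + k*pi/2 for k = 0..3; refine (sx, sy, tx, ty) for each."""
    def clamp(s):
        return min(max(s, 1.0 / b_s), b_s)   # keep scales in [1/b_s, b_s]

    results = []
    for k in range(4):
        theta = theta_ard + k * math.pi / 2

        def cost(x, theta=theta):
            return rmsd(make_transform(theta, clamp(x[0]), clamp(x[1]),
                                       x[2], x[3]), source, target)

        params, err = compass_search(cost, [1.0, 1.0, 0.0, 0.0])
        params = [clamp(params[0]), clamp(params[1]), params[2], params[3]]
        results.append((err, k, params))
    return min(results)


# L-shaped target; the source is the target rotated by -90 degrees.
target = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2)]
source = [(y, -x) for x, y in target]
err, k, params = register(source, target)
```

In this toy case the candidate with k = 1 recovers the rotation, so the search over a continuous angle is replaced by four discrete trials.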

Texturing for Digital Twin Buildings
After the registration, a post-processing step cropped the point cloud and created texture images for every major building element in the CAD drawing, to form the final textured DTB. The workflow is presented in Figure 6. The parsed 2D footprint polygons of the indoor structures were vertically extruded into 3D blocks. The extrusion length was the height of the corresponding story point cloud. Moreover, the blocks were organized in a boundary representation schema, so that the faces of the blocks could be textured separately.
To texture a face, we buffered it into a box along the face normal in both the positive and negative directions, called the face box here. This box was used to crop the corresponding 3D region in the registered point cloud. However, repeatedly cropping thousands of relatively small boxes from the whole point cloud with almost 10 million points was very time-consuming. Therefore, to accelerate the cropping, the point cloud was first cropped into caching slices that covered the coplanar faces or parallel faces with small distances to each other. Next, the corresponding 3D point cloud region of a face was cropped by the face box from its corresponding caching slice. These caching slices shrank the extent of the point cloud to crop from. In our experiments, the slicing reduced the cropping time by about 80%. The points cropped out were then projected onto the face, i.e., converting their 3D coordinates (x, y, z) into the 2D plane coordinates (u, v). The u-v coordinates were then discretized into the column and row indices on the texture image of this face. Then the corresponding image pixels were colorized with the projected point colors. Since not all image pixels could be projected by points, there were 'holes' on this image. Therefore, we applied inpainting [65] and default colorizing to generate the final complete texture images. Once generated, the texture images could be mapped onto the face by aligning the image corners to the face vertices.
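The projection and discretization step can be sketched as follows. The face is described by a corner point and two orthonormal in-plane directions, and the texture is stored as a dictionary from (row, column) to colour. The function names, the pixel size, and the overwrite rule for pixels hit by several points are assumptions; the inpainting step itself is not shown.

```python
def project_to_pixels(points, origin, u_axis, v_axis, pixel_size):
    """Map coloured 3D points to a {(row, col): colour} texture on a planar face.

    origin: 3D corner of the face; u_axis, v_axis: orthonormal in-plane axes.
    """
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    texture = {}
    for (x, y, z), colour in points:
        rel = (x - origin[0], y - origin[1], z - origin[2])
        u, v = dot(rel, u_axis), dot(rel, v_axis)   # 2D plane coordinates
        if u < 0 or v < 0:
            continue                                # outside the face
        col, row = int(u // pixel_size), int(v // pixel_size)
        texture[(row, col)] = colour                # later points overwrite
    return texture


def find_holes(texture, n_rows, n_cols):
    """Pixels that received no point and would need inpainting."""
    return [(r, c) for r in range(n_rows) for c in range(n_cols)
            if (r, c) not in texture]


# A wall face on the plane y = 0: u runs along x, v runs along z (height).
points = [((0.2, 0.01, 0.2), "red"), ((0.7, -0.02, 1.2), "blue")]
texture = project_to_pixels(points, (0.0, 0.0, 0.0),
                            (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), pixel_size=0.5)
holes = find_holes(texture, n_rows=3, n_cols=2)
```

Because the small offsets along the face normal are discarded by the projection, points on either side of the face within the face box contribute to the same texture.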

Test Data
The proposed approach was tested on the point clouds and CAD drawings of seven stories, from the second to the eighth floor, in the Knowles Building at the Main Campus of The University of Hong Kong. The target building had considerably different layouts from the second to the eighth floor, as listed in Table 1 and Figure 7. Each story's area was >2000 m², yet every story had different dimensions and topologies for indoor spaces and networks.
The boundary of F5 was considerably larger than others because of a footbridge connecting a nearby building. Thus, the CAD drawing's center x-coordinate was 12.238 m away from the reflection symmetry axis. Besides, the y-axis of the F3 drawing was perpendicular to its symmetry axis, which created a large rotation compared with the corresponding point clouds.
The colorful point clouds were scanned by a Google Tango AR phone (model: Lenovo Phab 2), which is reproducible with current mainstream ARCore or ARKit devices (e.g., iPhone X and above and iPad Pro) [66]. The geometric error was less than 5 cm initially, but gradually 'drifted' over the scanning course. Except for F5, the as-built scans covered the public areas (e.g., corridors and stair networks) located in the center of the CAD drawings' spatial bounding boxes. Note that the corridor scans can be incomplete for some stories.
We highlight the experimental results on F3 and F5, which validate the RegARD method with a large rotation offset and central differences, respectively. All seven stories were used for benchmarking the average performance in terms of accuracy and computational time.

Implementation Details
In the preprocessing, the buffering distance $t$ in Equation (1) was set as 0.01 m, while the threshold of $I_{thickness}$ was set as 0.95. The two parameters of ODAS (i.e., $t_c$ in Equation (5) and the depth of the octree) were set to default values, while the solver was the default DIRECT, all as suggested in [49]. The default bounding parameter $b_s$ was set as 1.

Generated Digital Twin Buildings
The resulting DTBs were created in the Industry Foundation Classes (IFC) format. As introduced in Section 3.2.1, we extracted the vertical structures, i.e., walls and openings, from CAD drawings and generated them as IfcWall and IfcDoor instances, along with the generated floors and ceilings as IfcSlab instances. The elevations and heights of IfcWall, IfcDoor, and IfcSlab were set according to the floor and ceiling elevations estimated as in Section 3.2.2. Next, the RegARD method registered the scanned point clouds to the design models, followed by the texturing process. To enable the instance texturing in IFC, the geometry of instances was stored as IfcFacetedBrep with faces as IfcFace. Therefore, the external texture images could be linked to IfcFace by IfcImageTexture, IfcSurfaceStyleWithTextures, IfcSurfaceStyle, and IfcStyledItem; the coordinate mappings between the texture images and the IfcFace were given as IfcTextureMap and IfcTextureVertex.
As shown in Figure 8 (visualized in FZKViewer), those vertical and horizontal surfaces scanned in the color point cloud were successfully textured, with an average density of 1403 points/m². Coarse appearance was attached to the walls, doors, floors, and ceilings. The corresponding IFC instances could link the real material color as well as the pasted posters through the texture images. Through this tight integration between the as-designed CAD models and the as-built point clouds, our approach enabled the digital twinning of buildings with geometry, semantics, and appearance simultaneously at a much more affordable cost.

Registration Quantitative Analysis
The coarse registration was the core problem to resolve. Hence, we evaluated and compared the registration RMSD of RegARD with other state-of-the-art coarse registration methods, including CPD [35], Go-ICP [36], and GMMTree [37]. To further demonstrate the improvement by ARD, we compared the registration quality and efficiency with and without ARD. Besides, the DFO solver is also a crucial factor for the final registration efficiency; we therefore tested the convergence of different DFO solvers to identify the appropriate solver for our registration.
Table 2 reports the benchmarking results of RegARD and other registration algorithms. To control the total processing time, we set the number of iterations for all the algorithms to 100, the sampling interval of CAD drawings to 10 cm, and the sampling rate of point clouds to 0.001. The DFO solver of RegARD applied in this comparison was Nelder-Mead. Besides, the computational time of RegARD included the time of ARD, and that of Go-ICP included the time for building distance transformation structures. Moreover, the CPD algorithm required the outlier ratio of the source point cloud over the target. The ratios were set to 1 − Ovl_p, where the Ovl_p values are given in Table 1.
Notes to Table 2: (a) CPD and GMMTree were from Tanaka et al. [69], while Go-ICP was from [70]. (b) The %imp is the improvement over the best of the baseline results. The lowest RMSDs and shortest running times of each story are highlighted in bold.
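As an illustration of the evaluation metric, the following minimal sketch computes an RMSD between a transformed source point set and a target point set. The exact RMSD definition is not restated in this section, so the sketch assumes the root mean square of the distance from each source point to its nearest target point, found by brute force; the toy coordinates are hypothetical.

```python
import math

def transform_2d(points, theta, tx, ty, scale=1.0):
    """Apply a 2D similarity transform: rotate by theta, scale, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in points]

def rmsd_nearest(source, target):
    """RMS of the distances from each source point to its nearest target point."""
    total = 0.0
    for sx, sy in source:
        total += min((sx - tx) ** 2 + (sy - ty) ** 2 for tx, ty in target)
    return math.sqrt(total / len(source))

# Hypothetical toy data: points sampled from a drawing and a scan shifted by 0.1 m.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
scan = [(0.1, 0.0), (1.1, 0.0), (2.1, 0.0), (2.1, 1.0)]

before = rmsd_nearest(scan, target)                                # about 0.1
after = rmsd_nearest(transform_2d(scan, 0.0, -0.1, 0.0), target)   # about 0.0
```

A brute-force nearest-neighbor search is quadratic in the number of points; a spatial index would be the usual choice for full-size scans.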

Registration Benchmarking
As shown in Table 2 and Figure 9, our algorithm showed significant advances over the best baseline methods in both registration accuracy and computational time. RegARD reduced the RMSD by 24.87% to 60.17% compared with the baseline methods. Figure 9 shows the visual results of F2, F3, and F5. The results of CPD, Go-ICP, and GMMTree showed obvious rotation or translation errors, while RegARD successfully registered the as-designed and as-built data. Meanwhile, the computational time of RegARD on all the tested stories was 36.02% to 83.06% shorter than that of the most efficient baseline, i.e., GMMTree. On average, RegARD's RMSD was 49.9% lower than the best baseline results, while RegARD also saved 73.1% of the time cost. RegARD's results in Table 2 were very close to the global minima. By decomposing the problem into two sub-problems, RegARD achieved this with only 100 iterations. In contrast, other coarse registration methods such as Go-ICP and GMMTree had to search a large solution space to approach the global minima gradually. Besides, since the asymptotic time complexity of RegARD was proportional to the number of source points, it took longer on F2, which had more points than the other stories, as shown in Table 1.
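The improvement percentages quoted above are relative reductions with respect to the best baseline on each story. A minimal sketch of that arithmetic follows; the RMSD values are hypothetical placeholders and are not taken from Table 2.

```python
def improvement_over_best(baselines, ours):
    """Percentage reduction of `ours` relative to the best (lowest) baseline value."""
    best = min(baselines.values())
    return 100.0 * (best - ours) / best

# Hypothetical RMSD values in metres; NOT the values reported in Table 2.
baseline_rmsd = {"CPD": 0.80, "Go-ICP": 0.50, "GMMTree": 0.60}
regard_rmsd = 0.25

imp = improvement_over_best(baseline_rmsd, regard_rmsd)  # 50.0 for these inputs
```

The same function applies to running time, where the best baseline is the fastest one.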

RegARD Component Analysis
We further measured the effects of ARD by comparing the registration solved by DFO with and without ARD. The sampling interval of CAD drawings was 10 cm and the sampling rate of the point clouds was 0.001. The DFO solver of RegARD was Nelder-Mead. The resulting metrics are reported in Table 3. The visual comparisons on F3 and F5 are presented in Figure 10. The results can be summarized into three situations based on the metrics and the initial poses of point clouds and CAD models:

1.
The initial poses were already close to the best-fitting transformation, e.g., all stories except F3 and F5. The transformation between the sampled points of drawings and point clouds could be solved in a limited number of iterations. The results with 100 iterations were close to the converged solutions.

2.
There was a large translation between the initial poses of the two inputs, e.g., F5. The DFO solvers could optimize these translations to optima or sub-optima, with or without ARD. This could be verified by F5's RMSDs in Table 3, the visual results in Figure 10, and the RMSD curves in Figure 11.

3.
There was a large rotation between the initial poses of the two inputs, e.g., F3. This was the most challenging situation. As shown in Table 3, the registration without ARD recorded an RMSD 4.5 times that of RegARD. The corresponding visual results of F3 and the RMSD convergence curves are presented in Figure 10 and Figure 11a, respectively. The curve comparison in Figure 11a demonstrated considerably faster convergence with ARD. This result supported the argument in Section 3.3: rotation was a crucial DoF that could trap optimization algorithms in problems with strong self-similarities (e.g., building interiors). By decoupling the optimization of rotation from that of the other DoFs, RegARD enabled the problem to be solved in around 100 iterations.
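The benefit of decoupling rotation from the other DoFs in the third situation can be illustrated with a toy 2D example. The sketch below is not the RegARD implementation: it assumes each input has one detected reflection axis given as an angle, that aligning the two undirected axes leaves only two rotation candidates (a half turn apart), and it replaces Nelder-Mead with a simple compass (pattern) search over translation only. All function names and data are hypothetical.

```python
import math

def rotate(points, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def rmsd(source, target):
    """RMS distance from each source point to its nearest target point."""
    total = sum(min((sx - tx) ** 2 + (sy - ty) ** 2 for tx, ty in target)
                for sx, sy in source)
    return math.sqrt(total / len(source))

def candidate_rotations(axis_src, axis_tgt):
    """Assumption: an undirected axis fixes the rotation up to a half turn."""
    base = axis_tgt - axis_src
    return [base, base + math.pi]

def compass_search(cost, start, step=1.0, tol=1e-4, max_iter=100):
    """Derivative-free pattern search over (tx, ty); a stand-in for Nelder-Mead."""
    best, best_cost = start, cost(start)
    for _ in range(max_iter):
        if step < tol:
            break
        moved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            trial = (best[0] + dx, best[1] + dy)
            c = cost(trial)
            if c < best_cost:
                best, best_cost, moved = trial, c, True
        if not moved:
            step *= 0.5  # shrink the pattern when no direction improves
    return best, best_cost

def register(source, target, axis_src, axis_tgt):
    """Try each rotation candidate, optimize translation only, keep the best."""
    results = []
    for theta in candidate_rotations(axis_src, axis_tgt):
        rotated = rotate(source, theta)
        cost = lambda t, r=rotated: rmsd([(x + t[0], y + t[1]) for x, y in r], target)
        t, err = compass_search(cost, (0.0, 0.0))
        results.append((err, theta, t))
    return min(results)

# Hypothetical T-shaped layout: symmetric about a vertical axis (angle pi/2).
target = [(-1.0, 2.0), (0.0, 2.0), (1.0, 2.0), (0.0, 1.0), (0.0, 0.0)]
# The same shape rotated by -90 degrees and shifted; its axis is horizontal (angle 0).
source = [(5.0, -1.0), (5.0, -2.0), (5.0, -3.0), (4.0, -2.0), (3.0, -2.0)]

err, theta, shift = register(source, target, axis_src=0.0, axis_tgt=math.pi / 2)
```

With the rotation restricted to two candidates, the remaining search is over translation only, which is why a simple local search is enough in this toy case despite the 90-degree initial offset.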
Moreover, to select appropriate DFO solvers for building interior registration, we compared different DFO solvers, as shown in Figure 11b-e. The sampling interval of drawings was 1 cm, while the point clouds' sampling rate was 0.001. All results were the average values of 10 independent runs. SBPLX, BOBYQA, and Nelder-Mead showed a high degree of efficiency and robustness when ARD was applied. Besides, both DIRECT and CMAES converged to the best objective value if the number k of iterations was large enough. We also noticed that DIRECT and CMAES performed well at solving problems with many local minima, e.g., the F3 case with a large rotation and the F5 case with a large translation. Besides, MMA and MLSL were inappropriate or unstable regardless of ARD. Therefore, SBPLX, BOBYQA, and Nelder-Mead were the appropriate choices for RegARD with architectural reflection symmetries, while DIRECT and CMAES were more robust for registration without architectural reflection detection.
Table 3. Metrics of registration with and without ARD (100 iterations, average of 10 runs).

Discussion
In this study, we present a digital twinning approach based on low-cost data sources: point clouds captured by smartphones and available 2D CAD drawings for existing buildings. The critical part of our approach is the RegARD method to register point clouds with drawings. The theoretical breakthrough of RegARD is using the architectural reflection symmetry to isolate the optimization of rotation from that of the other DoFs. As a result, the challenging registration of noisy and self-similar indoor point clouds becomes solvable through a two-step process. Moreover, together with the preprocessing and the 3D generation and texturing, our approach can generate DTBs with parametric geometry, rich semantics, and realistic appearance in a short processing time.
However, there are still some limitations and possible improvement directions in our approach, including:

1.
Registration quality: RegARD is a rigid registration method that applies a single global transformation to align indoor point clouds and CAD drawings. However, there could be local misalignment as well as translation and rotation drifts that cannot be robustly registered with only one global rigid transformation. The top right of F3 and the top left of F5 shown in Figure 10 are examples. To resolve this issue, rigid alignment with non-rigid corrections, or piece-wise rigid registration [71], can be applied. Moreover, as-designed data, such as floor plans, can serve as a priori information to make proper assumptions on the deformations and guide the piece-wise segmentation of point clouds. For example, an indoor point cloud can be segmented into rooms and represented as a graph. Then, rigid transformations can be estimated on the nodes and edges to counteract the local misalignment or drifts.

2.
Semantics richness: in Section 3.2.1, this paper applies a thickness filter to extract wall instances from the CAD drawings. However, the filter has a limited capability in extracting vertical structures with square cross sections. Moreover, the vertical structures could be further classified, e.g., as external walls, inner walls, windows, and sliding doors. One possible way is to replace the thickness filter in this paper with supervised-learning-based classifiers, such as Decision Tree and Support-Vector Machine. Besides, it is also possible to perform object detection and semantic/instance segmentation on point clouds to attach more detailed semantics to IFC elements.

3.
Appearance quality: as shown in Figure 12, the resolution of the texture images is not high enough and defects such as blurring exist. This results from several factors, such as the limitations of the scanning sensors, the embedded Simultaneous Localization and Mapping (SLAM) algorithms, unavoidable dynamic objects, and the lack of texture in the scanned environment. This issue can be mitigated by using recent and upcoming generations of consumer-level scanning devices with advanced sensors or embedded SLAM algorithms for point cloud collection.

4.
Processing time: the speed of the whole pipeline can be improved. For example, the asymptotic time complexity of RegARD is proportional to the number of source points, meaning the processing time grows with the number of points. This issue can be mitigated by applying weighted sampling [49] to reduce the processing scale of point clouds.

5.
Availability of reflection symmetry: when a building is asymmetric or has other types of symmetry, e.g., rotation or translation, rather than reflection, we can directly optimize the transformation without reflection detection. Examples without reflection detection are given in Table 3 and Figure 10. Moreover, because asymmetric buildings have less self-similarity, there are fewer local minima to trap the optimization.

6.
Inconsistency detection: there could be inconsistencies between the as-built and as-designed data. For example, the two red circles in Figure 12a show the ongoing temporary construction work on F2 of the Knowles building. The temporary work covered one pathway between two soundproof curtains. These inconsistencies can cause a larger RMSD in registration or noise in texturing. Inconsistency detection should be further exploited to improve the registration and the final realistic models. At the same time, it is also desirable for maintenance and renovations.
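For the first limitation in the list above, the piece-wise rigid idea can be sketched as estimating one rigid transformation per room segment. The example below is only an illustration under strong assumptions: the room segmentation and the point correspondences between scan and plan are taken as given, the drift values are simulated, and the fit uses the standard closed-form 2D least-squares rotation (no scaling). It is not part of RegARD.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired 2D points
    src onto dst (closed-form 2D Procrustes fit, no scaling)."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy      # centered source point
        bx, by = dx - cdx, dy - cdy      # centered destination point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def apply_rigid(points, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Hypothetical room corners from the floor plan (metres).
rooms = {
    "room_a": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    "room_b": [(5.0, 0.0), (9.0, 0.0), (9.0, 3.0), (5.0, 3.0)],
}
# Simulated per-room drift of the scan: (rotation, tx, ty).
drift = {
    "room_a": (math.radians(1.0), 0.05, 0.00),
    "room_b": (math.radians(-2.0), 0.00, 0.10),
}

corrections = {}
for name, plan_pts in rooms.items():
    scan_pts = apply_rigid(plan_pts, *drift[name])        # drifted scan of this room
    corrections[name] = fit_rigid_2d(scan_pts, plan_pts)  # per-room correction
```

Each room receives its own correction, so drifts that differ between rooms are compensated independently; reconciling the corrections along shared walls or doors (the graph edges) is left out of this sketch.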

Conclusions
Digital twin buildings (DTBs) are increasingly demanded by the GIScience, manufacturing, robotics, mapping, and AECO industries. However, creating DTBs with parametric geometry, rich semantics, and realistic appearances at limited labor, device, and time cost is very challenging. This paper proposes a prototype method named 'Registration based on Architectural Reflection Detection' (RegARD) for high-quality and low-cost DTBs. Our approach exploits two low-cost data sources: 3D point clouds captured by ubiquitous mobile devices and widely available 2D drawings of existing buildings. Pilot experiments showed that RegARD can register smartphones' point clouds with CAD drawings with high quality and efficiency. Based on the results of RegARD, DTBs can be automatically generated with realistic textures mapped from point clouds as well as parametric geometry and rich semantics parsed from drawings.
Multiple directions of future work can be derived from the findings and limitations of this paper. First, since the initial coarse registration towards digital twin building creation is solved by the RegARD method, researchers can explore the seamless integration of indoor texture, objects, and topologies in the as-built 3D scans with the building structures and systems in the as-designed 2D drawings and 3D extrusions. Some researchers may be interested in adapting the first step of RegARD to other 3D registration or modeling methods. Another research opportunity is transforming the rigid indoor 3D point clouds into flexibly constrained networks of segmented rooms and spaces. One possible direction for practitioners is to migrate the RegARD method efficiently from single-threaded computer CPUs to smart devices' ARM architectures. The authors are optimistic that digital twin buildings in the future will be an approachable and functional reality rather than a rhetorical term.

Funding:
The work presented in this paper was financially supported by the Research Grants Council (RGC) of Hong Kong SAR under grant numbers 17200218 and 27200520. The APC was funded by the RGC.

Data Availability Statement:
The source code of the latest version of RegARD is available at https://github.com/eiiijiiiy/RegARD (Updated on 30 April 2021). The test dataset is available in the source code. Local data are available from the corresponding author upon reasonable request.