Automatic Point-Cloud Registration for Quality Control in Building Works

Total and automatic digitalization of indoor spaces in 3D represents a major advance for building maintenance and construction tasks, which currently require visits and manual work. Terrestrial laser scanners (TLS) have been widely used for these tasks, although the acquisition methodology with TLS systems is time consuming, and each point cloud is acquired in a different coordinate system, so the user has to post-process the data to clean them and obtain a single point cloud of the whole scenario. This paper presents a solution for the automatic data acquisition and registration of point clouds from indoor scenes, designed for point clouds acquired with a terrestrial laser scanner (TLS) mounted on an unmanned ground vehicle (UGV). The methodology developed allows the generation of one complete dense 3D point cloud consisting of the acquired point clouds registered in the same coordinate system, reaching an accuracy below 1 cm in section dimensions and below 1.5 cm in wall thickness, which makes it valid for quality control in building works. Two case studies corresponding to building works were chosen for the validation of the method, showing the applicability of the methodology developed for tasks related to the control of the evolution of the construction.


Introduction
The demand for 3D models of building interiors has grown in recent years due to (1) the need to have as-built 3D models available for tasks related to planning and maintenance in buildings and (2) the proliferation of the building information modelling (BIM) standard [1][2][3][4]. The most common techniques for the generation of 3D models have been photogrammetry [5][6][7] and laser scanning [8,9]. Regarding the first, evolution has come with advances in computer vision and structure-from-motion techniques. These made possible the computation of the orientation and self-calibration parameters of a high number of images regardless of any a priori knowledge of position and orientation. However, aspects such as the presence of homogeneous (i.e., textureless) surfaces, changes in illumination, the need to scale the model and the computational cost have been the main drawbacks when generating quality 3D models of indoor scenes with photogrammetry. With respect to the second technique, terrestrial laser scanners (TLS) allow the generation of high-precision 3D models [10][11][12][13], but the acquisition requires meticulous planning, together with a high number of acquisition sets, and the application of an efficient registration technique to merge all the acquired point clouds into a single point cloud within a unique coordinate system. This issue has been studied several times in recent years and several solutions have been developed. Table 1 shows a comparison between different approaches according to five aspects: (i) the need for artificial targets, (ii) the need for RGB images, (iii) the number of point clouds that can be aligned with each method (i.e., only two point clouds or several point clouds), (iv) the minimum overlap between scans and (v) the sensitivity to noise. For example, [14] developed a fully automatic registration method of point clouds using special targets attached to the objects.
The algorithm searches for the specific templates of these targets using radiometric and geometric information (shape, size and planarity) and evaluates invariant parameters between each scan to calculate the registration. Although this method works, it depends on the placement of artificial targets in the scene. Placing targets is not always possible, and it requires previous work and detailed studies of the environment. Other studies have evaluated the registration using shape patterns between scans, such as planes [15]. In this way, two point clouds are coarsely aligned using plane extraction to find tie points. This method is time-consuming and works only for pairs of point clouds, so it is not directly applicable to a set of scans. [16] implemented a registration method between scans based on linear and planar shapes to find homologous points and apply a coarse alignment. In [17], 2D images were used in combination with 3D models to improve point cloud registration. The two-dimensional images are used to add texture to the point clouds and to provide information about common features through their search with the SURF descriptor. With this, the transformation matrix is created and refined with different strategies such as RANSAC. This method reaches sufficient results only when the overlap between scans is good. To solve this problem, [18] presented a novel methodology to register outdoor and indoor point clouds using a two-step process: first, they extracted common features using the RGB information acquired by a digital camera, and then they refined the alignment using iterative closest point (ICP) or plane-matching techniques, which work properly even when the overlap between scans is poor. Both methods are robust and suitable for indoor and outdoor scenarios, but a hybrid calibrated system composed of an RGB camera and a laser scanner is mandatory.
Moreover, both methods are based on the use of RGB images and textures, so they are not applicable to scans without that information. A fully image-based methodology was proposed by [19]. In this work, a four-step process is performed: (1) creation of a scan network; (2) coarse alignment; (3) filtering of outliers and (4) fine alignment based on the minimum loop expansion (MLE) method. The authors use panoramic images associated with each scan to identify common features between them, so it is not possible to use this methodology for scans without these images or textures. Along similar lines, [20] proposed a methodology to organise the scans in a network based on the correspondences detected between pairs of scans. Once the scans are connected, a coarse registration is applied and, finally, a fine registration completes the alignment process. Although the process is effective, the accuracies achieved are between 9 and 32 mm, values similar to those currently achieved with mobile laser scanner (MLS) systems.
Table 1. Comparison between different registration approaches, including our proposed method.

                              Target Based [14]   Plane Based [15,16]   Image Based [17]   Image Based + Network [18][19][20]   Proposed Method
Target need                   X
Image need                                                              X                  X
Only pairwise registration                        X                     X
Minimum overlap required      X                   X                     X                  X                                    X
Noise sensitive               X                   X                     X                  X                                    X

MLS systems allow the generation of 3D models in a single acquisition, with the direct result of a single point cloud and, consequently, one coordinate system [21,22]. However, the guarantee of the metric quality of the 3D models depends on the knowledge of the trajectory of the MLS, which is a complicated task in indoor scenes due to the low coverage of the GNSS signal. Simultaneous localization and mapping (SLAM) techniques are an efficient alternative for the computation of the trajectory but do not guarantee the precision required in applications such as quality control during construction works. The reason is that most MLS systems are equipped with low-cost, lower-performance laser devices in order to cope with weight requirements [23], such as the Velodyne HDL-32 or the Hokuyo UTM in the Zeb-Revo system, which present maximum nominal precisions between 2 and 3 cm [24], not comparable with the millimetre precision of TLS systems [25]. Di Filippo et al. [21] used an MLS to create 3D models of existing buildings in a short time. However, the errors are between 1 and 3 cm, which is not acceptable for control tasks in building works, where the maximum deviation allowed for transversal sections in columns is ±10 mm [26].
Given the higher precision of TLS compared to MLS, TLS has been used on repeated occasions for the quality control of built environments, in combination with artificial referencing systems such as artificial targets for the semi-automatic registration of the different point clouds acquired [22]. This requires the intervention of the user in some tasks, especially in the planning of the acquisition and the registration of the point clouds into the same coordinate system.
Regarding user intervention with MLS, the systems on the market that are adequate for indoor mapping must be pushed or carried by the user [23]. Thus, their use is limited in scenarios with hazards such as obstacles and hanging elements, as well as in areas of difficult access. This limits their use in most "under construction" scenarios.
Trying to minimize user intervention and achieve better results with SLAM techniques and high-quality point clouds, [27] presented a SLAM solution for robotic indoor mapping registration in which static and dynamic scans are used together to achieve optimal results. Static scans provide high-quality point clouds, and dynamic scans provide the trajectory and the 3D point clouds of the hidden areas between static scans. This technique allows the generation of high-quality point clouds in an automatic way using SLAM algorithms, but the global accuracy of the process is between 25 and 50 mm, exceeding the limits allowed for the quality control of construction works [26]. Since only around 3 mm of this global error comes from the nominal accuracy of the LiDAR used (a SICK 2D laser scanner), it can be deduced that more than 20 mm of error is introduced by the data processing.
Thus, the generation of quality and complete 3D point clouds of building interiors requires the use of TLS, because of their high precision, together with a dynamic performance of the measurements in order to cover the whole scene. However, a methodology that exploits the advantages of this combination has not been established yet. For this reason, this paper focuses on two main goals: (i) minimizing user interaction during the in-field work and the processing tasks and (ii) establishing an automatic workflow to register all the scans without any target on the scene and without user intervention. Taking this into account, this paper presents a new methodology for the automatic registration of point clouds acquired with an unmanned ground vehicle (UGV). Nevertheless, this approach is applicable to point clouds coming from other photogrammetric or laser scanning sensors. The results are evaluated for their use in the quality control of the different phases of a construction work.
Regarding point cloud registration, different approaches have been developed in the scientific community.
The pre-alignment of point clouds is usually performed based on detectors and descriptors, that is, algorithms that detect points of interest in 3D and describe them according to their geometric or radiometric characteristics. There are several 3D detectors and descriptors, each focusing on different specific characteristics [28,29]. Thus, the optimal detectors and descriptors depend on the particularities of each case [30], and their selection affects the result.
The scale invariant feature transform (SIFT) detector [31] has been widely used in automatic registration tasks, since it is not affected by scale. Although it was initially developed to work with 2D information from images, it was adapted by [32] to operate in 3D, using the principal curvature of the points in the role of the intensity of the pixels. [33] used the SIFT detector to automatically register 3D point clouds without artificial targets, but the test was performed on a single building without surroundings. [34] developed a methodology in which the 3D point cloud and the associated RGB/intensity images were used together to compute the registration. Their method was highly sensitive to the size of the overlapping area and depended on the acquisition of images associated with the scans. Moreover, the SIFT detector is computationally expensive, and its performance is optimal for the registration of two point clouds [30] but not for registering a higher number of point clouds, as is the case in this study. Similar results are obtained with the normal aligned radial feature (NARF) detector, which takes advantage of key points in borders and significant geometric structures [30]. The set of resultant key points is poor, so it works for the alignment of pairs of point clouds individually.
The Harris 3D detector detects corners and edges through the analysis of the relation of each point with its neighbours. A set of point rings is generated for each group of points, and their centroid is determined. A paraboloid is fitted to the neighbouring points at each centroid, and PCA is applied to compute the mean square surface of the paraboloid. Finally, the evaluation of the changes in the surface is performed through the analysis of the derivatives of the surface, in combination with a continuous Gaussian function [35,36]. The Harris 3D detector is invariant to rotation, scale, variations in luminosity and the presence of noise.
Once the feature points have been detected, a describing process is performed to extract the corresponding points between the point clouds. The description process analyses each point in relation to its neighbours, focusing on its geometric or radiometric properties. Thus, the signature of histograms of orientations (SHOT) descriptor [37,38] splits a sphere centred at a feature point into several bins and collects the histogram of normal angles in each bin to build the descriptor. Other descriptors such as USC or 3DSC [39] also consider a sphere superimposed on the feature point and divide it into smaller segments. In these cases, the number of points contained is computed for each segment and weighted inversely by the segment density. These descriptors are not robust enough in environments with the presence of noise [39]. Ref. [40] developed the point feature histogram (PFH) descriptor, which, besides searching for the correspondences between points, classifies them into primitive groups such as corners, edges or planes [29].
It should be highlighted that construction-work scenarios present similarities and repetitive patterns, especially in the early stages of the works, where the predominant elements are pillars, walls, doors and windows, all presenting similar shapes. Moreover, the scans of these scenarios are affected by illumination changes. All these factors complicate the process of detection and characterization of corresponding points between point clouds, since two different points from different point clouds can be assigned very similar descriptions and be incorrectly considered as corresponding due to their similar characteristics. In order to avoid all the previous drawbacks and to develop a methodology that is easy to use, the detector selected was Harris 3D [41,42] in combination with the point feature histogram (PFH) descriptor.
The paper is structured as follows: after the Introduction, Section 2 presents the methodology developed and the technology used; Section 3 shows the results obtained after the application of the methodology to the quality control of the construction works based on statistical analysis with robust estimators. Section 4 describes the conclusions extracted during the work.

Materials
Unmanned Ground Vehicle (UGV) and Terrestrial Laser Scanner (TLS)
Data acquisition for this work was performed with a TLS, a Faro Focus 3D 330, mounted on a UGV controlled by an automatic route planner [43]. The path planner uses a 2D drawing of the building as input data and is able to calculate the number of stations needed and the position of each scan point according to the overlap and the accuracy established [43]. Using this approach, the UGV can work autonomously, without supervision, following the planned path and operating at night to avoid interruptions during the building tasks. The TLS Faro Focus 3D 330 (Figure 1) is used as a static scanning system through a Stop&Go process. In the Stop&Go mode [44], the TLS captures data when the UGV is stopped, making a conventional static TLS measurement. Once the scan is finished, the UGV continues its route to the next target point, where another scan is performed. Thus, the TLS is fully autonomous and acquires data with a precision of ±2 mm at 10 m.
The Faro Focus 3D 330 consists of an infrared laser scanner that measures coordinates directly using the phase difference of the laser ray in the range of 0.60-330 m at a wavelength of 1550 nm. The field of view covers 300° vertically and 360° horizontally, with a resolution of 0.009° and a measurement rate of 122,000-976,000 points per second, also registering radiometric information for each point. The nominal precision of the TLS is ±2 mm at 10 m distance in normal illumination conditions on a 90% reflective surface, with a ray divergence of 0.19 mrad. All these characteristics ensure that the quality standards in construction can be achieved and determined its selection as the scanning system. The Faro Focus system was mounted on a UGV, Guardian Robotnik, with a payload of 100 kg, a speed of 3 m/s, an autonomy between 3 and 10 h in normal performance and the capacity to detect occlusions and obstacles using two SICK S300 Expert laser scanners integrated at the front and the back of the UGV. Additionally, the path planning developed by [43] was incorporated to give the UGV full autonomy in the acquisition tasks.

Method
The method developed for the automatic registration follows an execution line structured in different steps (Figure 2). The laser data acquired with the robotic system (Step 1) are subjected to an automatic pre-processing (Step 2). The pre-processing consists of a coarse noise filtering based on the analysis of the coordinate histogram, followed by a homogenization of the resolution of the resultant point clouds through voxelization [45] and a filtering process. Then, the registration strategy developed is applied (Steps 3 and 4): (i) coarse registration of each point cloud based on 3D detectors and descriptors (Step 3) [46] and (ii) fine registration with the iterative closest point (ICP) technique (Step 4) [47]. The result is one 3D point cloud joining all the acquired point clouds into one unique coordinate system, obtained with no user interaction. The last step of the methodology is the quality control of the construction works focused on certain constructive elements (Step 5).
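To make the fine-registration step concrete, the following is a minimal point-to-point ICP sketch, not the implementation used in this work: correspondences are taken as nearest neighbours (a brute-force search stands in for the k-d tree used in practice), and the rigid update is solved with the SVD-based Kabsch method. Function names and parameters are illustrative.

```python
import numpy as np

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst (Kabsch/SVD)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(src, dst, iters=30, tol=1e-8):
    """Point-to-point ICP: nearest-neighbour matching plus rigid update."""
    cur = src.copy()
    prev_err = np.inf
    for _ in range(iters):
        # brute-force nearest neighbours (a k-d tree would be used in practice)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(cur)), nn]).mean()  # pre-update error
        R, t = best_fit_transform(cur, dst[nn])
        cur = cur @ R.T + t
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    # total transform recovered from the accumulated motion of src
    return best_fit_transform(src, cur)
```

As with any ICP variant, convergence to the correct minimum relies on the coarse registration of Step 3 having already brought the clouds close together.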

Path Planning
The UGV is equipped with an automatic path planner able to calculate the optimal route for the data acquisition. The workflow of the route planner [43] is shown in Figure 3.

This planner takes the 2D CAD files of each floor, extracted from the BIM model, as input data. Constructive elements such as walls, columns or floors are grouped in layers. Then, each layer is discretized into points with the same resolution (a configurable parameter), and a binary occupancy map is built. Each position is classified as non-navigable (e.g., holes and constructive elements) or navigable. The next step estimates the candidate positions where the scanner device can perform a scan by generating a grid distribution. The dimensions of the UGV are taken into consideration in this step, leaving a security margin of 1 m around each building element.
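The occupancy-map and candidate-position steps described above can be sketched as follows. This is an illustrative simplification, not the planner of [43]: function names, the resolution and the margin handling are assumptions.

```python
import numpy as np

def candidate_positions(walls, extent, res=0.25, margin=1.0):
    """Binary occupancy map from discretized wall/column points, then a grid
    of candidate scan positions at least `margin` metres away from any
    building element.
    `walls`: (N, 2) array of element points in metres (origin at (0, 0)).
    `extent`: (xmax, ymax) of the floor."""
    nx, ny = int(extent[0] / res), int(extent[1] / res)
    occ = np.zeros((nx, ny), dtype=bool)
    idx = (walls / res).astype(int)
    keep = (idx[:, 0] >= 0) & (idx[:, 0] < nx) & (idx[:, 1] >= 0) & (idx[:, 1] < ny)
    occ[idx[keep, 0], idx[keep, 1]] = True

    # keep grid cells whose distance to every occupied cell exceeds the margin
    xs, ys = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    centers = np.stack([xs, ys], axis=-1).reshape(-1, 2) * res + res / 2
    occ_pts = np.argwhere(occ) * res + res / 2
    d = np.sqrt(((centers[:, None, :] - occ_pts[None, :, :]) ** 2).sum(-1))
    free = d.min(axis=1) > margin
    return centers[free]
```

In the real planner, the navigable/non-navigable classification would also account for holes and the footprint of the UGV itself.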
The output of this step is a map with candidate positions, construction element points and non-candidate positions. Before designing the optimal route, a visibility analysis is performed on each candidate position in order to know the theoretical area of building elements that would be captured. The visibility analysis is made with a ray-tracing algorithm in combination with the maximum scanner range. The scanner range is a criterion to ensure good density and quality of the acquired data and is used as the limit of each candidate position analysis. For the study case, the maximum scanner range was 10 m. Finally, an optimization of the candidate positions is performed by using the back-tracking algorithm [48]. This algorithm minimizes the number of scan points while ensuring the minimum coverage of the elements selected. For the study case, the minimum coverage was 90%. Once the positions have been optimized, the optimal route is created by formulating the travelling salesman problem and solving it with the ant colony optimization algorithm [49]. The optimal route is the path with the minimum number of scans that can cover the whole scenario with the setup parameters defined.
During the execution of the route, the path planner is constantly running to ensure the correct performance of the route. In fact, if any scan point is not reachable due to an obstacle (very common in building works), the path planner re-calculates an alternative scan point and, thus, the resulting route. The calculation of this alternative trajectory is carried out on the fly, immediately when the impossibility of reaching a scan point is detected. As a result, the UGV can automatically acquire the data, guaranteeing the setup parameter conditions without user interaction.

Pre-Processing
Commonly, the point clouds acquired by 3D laser scanners contain noise. The noise is composed of points measured with positional errors. These errors are due to vibrations, reflections, far points or moving elements during scanning, so it is important to filter them out to prevent their influence on the subsequent computations.
The first operation is an automatic noise elimination using the statistical outlier removal (SOR) filter [50]. The SOR filter automatically eliminates the noise present in the point clouds by performing a proximity analysis between each point and its neighbouring points (Figure 4). In this case, the fast statistical outlier removal (FSOR) proposed by Balta et al. [51] was used to minimize the processing time and optimize resources.
After cleaning the point cloud, a passthrough filter [52,53] based on the coordinate histogram of the point clouds is applied. Because the Faro Focus captures data in the range 0.60-330 m, the system measures points that are too far from the scan station and consequently useless for the application (Figure 5a,c). This filter removes the farthest points by computing the coordinate distribution along the different axes (x, y, z). For each axis, the coordinates are converted to a scalar field, and its distribution histogram is filtered according to the position with the highest accumulation of points (Figure 5b,d).
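The SOR criterion can be sketched as follows: a point is rejected when its mean distance to its k nearest neighbours exceeds the global mean by more than a given number of standard deviations. This is the plain SOR rule, not the FSOR of [51], whose spatial-indexing speed-up is omitted; `k` and `alpha` are illustrative values.

```python
import numpy as np

def sor_filter(pts, k=8, alpha=1.0):
    """Statistical outlier removal: drop points whose mean distance to their
    k nearest neighbours exceeds the global mean by more than alpha std-devs."""
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # exclude the point itself
    knn = np.sort(np.sqrt(d2), axis=1)[:, :k]    # k smallest neighbour distances
    mean_d = knn.mean(axis=1)
    thresh = mean_d.mean() + alpha * mean_d.std()
    return pts[mean_d <= thresh]
```

The brute-force distance matrix keeps the sketch short; a production filter would query a k-d tree instead.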
Finally, a homogenization of the point cloud is performed. It is well known that 3D point clouds acquired with a laser scanner present heterogeneous point density: the further away from the station point, the lower the point density. The voxel-grid filter [53,54] allows the generation of a uniform dataset by dividing the space into cubes of custom size and replacing the points in each cube by the centroid of that set of points. The voxelization [45] is performed in an automatic way: the size of the voxel depends on the density of the point cloud, which is calculated through the study of the mean distance of each point to its closest neighbours. In this way, the process is completely automated, as well as adaptive to the data of each scenario.
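The adaptive voxelization might be sketched as below. The multiplier relating the mean nearest-neighbour distance to the voxel edge (`size_factor`) is a hypothetical choice, not a value taken from the paper.

```python
import numpy as np

def voxel_downsample(pts, size_factor=3.0):
    """Adaptive voxel-grid filter: the voxel edge is derived from the cloud's
    mean nearest-neighbour distance, and the points falling in each voxel are
    replaced by their centroid."""
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    voxel = size_factor * np.sqrt(d2.min(axis=1)).mean()  # adaptive edge length

    keys = np.floor(pts / voxel).astype(int)              # voxel index per point
    acc = {}
    for key, p in zip(map(tuple, keys), pts):
        s, n = acc.get(key, (np.zeros(3), 0))
        acc[key] = (s + p, n + 1)
    return np.array([s / n for s, n in acc.values()])
```

Because the edge length scales with the local sampling of each cloud, denser scans are reduced more aggressively, which is what makes the step self-tuning.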

Alignment
Once the point clouds are pre-processed, their pre-alignment is performed based on detectors and descriptors. In the proposed methodology, the detector selected was Harris 3D [41,42] in combination with the point feature histogram (PFH) descriptor. Finally, an ICP alignment was applied.
The Harris 3D detector presents good results in construction scenarios, since its characteristics are optimal for the identification of points in pillars, vertical walls and ceilings through their corners and edges. It is also fast and detects large sets of interest points with good correspondence [55]. The Harris 3D detector allows for several variations depending on the evaluation of the trace and the determinant (det) of the covariance matrix (Cov) [41]. In all the variations, the response of the key points r(x, y, z) is estimated, but different criteria are applied in each case. The original Harris 3D detector uses Equation (1), while Harris 3D Lowe follows Equation (2) and the Harris 3D Noble variant (Figure 6) applies Equation (3), where the trace of the covariance matrix is not squared.
r(x, y, z) = det(Cov(x, y, z)) − k·(trace(Cov(x, y, z)))²    (1)

r(x, y, z) = det(Cov(x, y, z)) / (trace(Cov(x, y, z)))²    (2)

r(x, y, z) = det(Cov(x, y, z)) / trace(Cov(x, y, z))    (3)

These changes among the Harris 3D variants modify the evaluation of the relation between the determinant and the trace of the covariance matrix, which affects the result of the point detection [42]. The Harris 3D method includes a constant k that depends on the point cloud data, while Harris 3D Lowe and Harris 3D Noble do not consider this factor, being independent of the input data [56]. This fact led to the selection of Harris 3D Noble as an optimal point detector in the methodology developed.
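For a given 3×3 neighbourhood covariance matrix, the three responses of Equations (1)-(3) can be computed directly, as sketched below; k = 0.04 is a typical illustrative value, not one prescribed by the paper.

```python
import numpy as np

def harris_responses(cov, k=0.04):
    """The three Harris 3D response variants of Equations (1)-(3) for a
    3x3 neighbourhood covariance matrix Cov."""
    det, tr = np.linalg.det(cov), np.trace(cov)
    harris = det - k * tr ** 2          # Eq. (1), original Harris 3D
    lowe = det / tr ** 2                # Eq. (2), Harris 3D Lowe
    noble = det / tr                    # Eq. (3), Harris 3D Noble
    return harris, lowe, noble
```

The Noble variant drops both the data-dependent constant k and the squared trace, which is why its response needs no per-cloud tuning.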
Since the detector used (Harris 3D) focuses mainly on detecting corners and edges, and because the PFH descriptor is more robust against changes in the viewpoints and delivers reliable information between the point clouds [30,41], the latter was chosen for this work, since it is common to have different points of view, illumination changes and good overlap between scans.
The 3D PFH descriptor describes each point together with its neighbours, generalizing the normal vectors (n) and the mean curvature through three angular values (α, φ, θ) and representing these values in a histogram. The histogram has a unique and invariant signature for each point. Given two points, p and q, a reference system can be created consisting of three unit vectors (u, v, w) [57]. Thus, the difference between the normal vectors at points p (n_p) and q (n_q) is determined with three angular variables (Equations (4)-(6)), where d represents the distance between p and q:

α = arccos(v · n_q)  (4)

φ = arccos(u · (q − p) / d)  (5)

θ = arctan(w · n_q, u · n_q)  (6)

being u = n_p, v = u × (q − p)/d and w = u × v. The three angles are coded with the distance, and together they form the final descriptor through the concatenation of the histograms of the four variables.
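The angular features for one point pair can be sketched as follows, assuming the standard PFH Darboux frame u = n_p, v = u × (q − p)/d, w = u × v; the function name is illustrative:

```python
import numpy as np

def pfh_features(p, n_p, q, n_q):
    """PFH angular features (alpha, phi, theta) and distance d for a pair
    of points p, q with unit normals n_p, n_q (standard PFH formulation)."""
    d_vec = q - p
    d = np.linalg.norm(d_vec)
    u = n_p                        # first frame axis: normal at p
    v = np.cross(u, d_vec / d)     # second axis, orthogonal to u
    w = np.cross(u, v)             # third axis completes the frame
    alpha = np.arccos(np.clip(np.dot(v, n_q), -1.0, 1.0))      # Eq. (4)
    phi = np.arccos(np.clip(np.dot(u, d_vec) / d, -1.0, 1.0))  # Eq. (5)
    theta = np.arctan2(np.dot(w, n_q), np.dot(u, n_q))         # Eq. (6)
    return alpha, phi, theta, d
```

In the full descriptor these four values are computed for every pair of points in the neighbourhood and accumulated into the concatenated histogram.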
The 3D PFH descriptor is applied to all points identified as feature points with the Harris 3D Noble detector, in such way that each point is associated with a description, and the coincidence of descriptions is used for the determination of correspondences between points from different point clouds. The identification of corresponding points allows the computation of the transformation between point clouds to the same coordinate reference system. In addition, the description of each point allows its classification into the different primitives such as edges, corners or planes.
The invariance to rotation, scale, variations in luminosity and the presence of noise is the reason for selecting the Harris 3D-PFH pair for the study of indoor building scenes. This results in the efficient detection of points of interest on corners and edges and the assignment of correspondences between feature points of different point clouds. However, in cases with repetitive elements such as pillars, some erroneous correspondences can appear. In order to minimize this effect, the methodology proposed includes a restriction on the maximum distance between corresponding points. This restriction avoids erroneous point correspondences produced by the similarity between different zones.
The filter that discriminates according to the distance between points is graphically explained in Figure 7a. Its performance is as follows: Corresponding points for each point of interest are searched within a distance threshold. This search, together with the initial results of the detector/descriptor combination, allows us to refine the registration between point clouds, thus improving the results. This step is reiterated until the root mean square error (RMSE) is below a threshold established according to the criteria for quality control in the works. The purpose of this study is to check and validate sections of the building based on variations and translations of the built elements. For the first case, the tolerances allowed are ±10 mm and for the second case ±24 mm [26]. To ensure that all the data meet these tolerances, the quality threshold is set to 5 mm.
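The distance filter itself reduces to a simple per-match check. A minimal sketch (the function name and array layout are our own assumptions):

```python
import numpy as np

def filter_correspondences(src_pts, dst_pts, pairs, max_dist):
    """Reject putative descriptor matches whose points lie farther apart
    than max_dist once the clouds are roughly aligned.

    src_pts, dst_pts : (N, 3) arrays of point coordinates.
    pairs            : list of (i, j) index pairs from descriptor matching.
    """
    kept = []
    for i, j in pairs:
        if np.linalg.norm(src_pts[i] - dst_pts[j]) <= max_dist:
            kept.append((i, j))
    return kept
```

In practice this check would be iterated with the registration refinement until the RMSE falls below the 5 mm quality threshold described above.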
Once the Harris 3D detector and PFH descriptor have been computed and a first coarse alignment based on features has been applied to all the point clouds, some of them are correctly aligned, but some others are not. This is due to the similarity between scans, which can cause imperfections in the final alignment. However, the point clouds are approximated to their real positions and the generation of one unique 3D point cloud formed by all the point clouds can be completed with the final adjustment of the registration, which eliminates the errors. The final adjustment is performed using the ICP algorithm [58]. ICP consists of an iterative process in which the distance error between point clouds is minimized until the point clouds are perfectly aligned. Each iteration consists of three steps: (i) finding pairs of corresponding points; (ii) estimation of the transformation that minimizes the distance between corresponding points; (iii) application of the transformation to the point clouds. Since ICP is an iterative algorithm, it is mandatory to have a good initial transformation to make its convergence possible [59]. This initial estimation comes from the coarse alignment, which already makes a good estimation of the point clouds positions.
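The three ICP steps can be sketched with a brute-force nearest-neighbour search and the closed-form (SVD-based) rigid-transform estimate; this is an illustrative minimal version, not the implementation used in the paper:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch/SVD solution used inside each point-to-point ICP iteration)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp_point_to_point(src, dst, iters=20):
    """Minimal point-to-point ICP over two (N, 3) clouds."""
    cur = src.copy()
    for _ in range(iters):
        # (i) pair each source point with its nearest destination point
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = dst[d2.argmin(axis=1)]
        # (ii) estimate the minimizing transform, (iii) apply it
        R, t = best_rigid_transform(cur, nn)
        cur = cur @ R.T + t
    return cur
```

This simple version converges only when the clouds start close together, which is exactly why the coarse feature-based alignment is required beforehand.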
It should be highlighted that the registration method based on the compatibility between normal vectors (Figure 7b) is not applicable in these scenarios, due to the presence of repetitive structures with similar characteristics (squared pillars, vertical walls, doors, among others) that result in similar normal-vector-based descriptions despite their differences. This invalidity of the normal vector method was analysed through its application to the study cases and shown through the lack of convergence of the ICP algorithm. The ICP algorithm can be applied in a point-to-point or in a point-to-plane mode. The first performs the adjustment between point clouds by minimizing the distance between one point in the reference point cloud and its corresponding point in the point cloud to register. The second adjusts the point cloud by minimizing the distance between the point in the reference point cloud and the tangent plane at its corresponding point in the point cloud to register. Studies have shown that point-to-point ICP is more robust to Gaussian noise than point-to-plane ICP [60]. However, the latter requires fewer iterations and is more effective [60]. Since the point clouds are cleaned and filtered from noise before starting the registration process, the point-to-point ICP method was chosen for this work.
The main advantage of the double-step Harris 3D/PFH and point-to-point ICP registration is that no user interaction is needed, and no targets have to be placed in the scene to serve as points of interest, because the points of interest in the method proposed are naturally present in the scene.

Case Study
The validation of the methodology proposed is performed through its application to two different scenarios of a residential building located in Badalona (Spain). The building has an asymmetric U-shaped floor plan (Figure 8a). At the moment of the acquisition, the building was under construction, with a life expectancy of 50 years. The floor plan covers 25,000 m² and is vertically distributed into two basements dedicated to parking, one base floor dedicated to commercial offices and the upper floors dedicated to housing. The building is divided into two blocks, north and south, with 14 and 13 floors, respectively.
The foundations of the building are constructed with a rectangular base, 200 × 45 cm and 260 × 45 cm in size, individually or in groups of two elements with capped piles for load transmission. Retaining walls are made of concrete with temporary anchoring. At the moment of data acquisition, different parts of the building were in different construction phases, allowing the methodology proposed to be tested in different environments and complexities of construction works. Two floors were selected for the data acquisition: (i) Floor 1 in the north block, which covers 810 m² (Figure 8b) and (ii) the ground floor in the south block, with 845 m² (Figure 8c). These floors were selected due to their construction state and geometrical complexity, which can be considered representative of construction works. During acquisition, there were walls, pillars, holes (for the lift and drainpipes), floors and ceilings present in the scenes, as well as construction materials such as scaffolds and boards.

Data Acquisition
Data acquisition was performed with the UGV described in Section 2.1. For the first study case, seven stations were needed, following the Stop&Go strategy, for the complete acquisition (Figure 9b). For the second study case, a total of 11 stations were needed to cover all the spaces (Figure 9d). More stations were necessary in the second study case, since this floor was in a more advanced state of construction and there were more occlusions between elements. The position of the stations was optimized by the path planning programmed in the UGV [43] in order to minimize the trajectories (Figure 9a,c) while acquiring all the information required. Since both floors were in the construction phase (Figure 9b), the trajectory had the requirement of including all the pillars in the data acquisition. The resolution and quality of the data acquired were adjusted in a compromise between obtaining all the information needed and minimizing the acquisition time and the quantity of data generated. Thus, a 60% overlap between stops was established, together with a scanning resolution of 4 mm at 20 m. The overlap percentage was selected to optimize the automatic registration process in terms of precision and reliability. In addition, given the minimum acquisition range of the TLS (0.6 m), all points at a distance below 0.8 m from the TLS were discarded from the acquisition (Table 2).

Table 2. Data acquisition parameters.

Parameter                 Value
Mean laser range (m)      8
Security distance (m)     0.8
Scanning overlap (%)      60
Scanning resolution       4 mm at 20 m

The time required per scan position was approximately 3 min. This value should be added to the time required for the trajectory between stops, for a total acquisition time of 25 and 45 min for the first and second study cases, respectively.
For the first and second study cases, the result of the acquisition was six and eleven point clouds, respectively, with 24 million points each. To obtain a homogeneous and clean point cloud for the next steps, the raw point clouds were filtered as described in Section 2.2. In this application, the voxel size was established at 0.5 cm, resulting in point clouds of about 7 million points each with a homogeneous data distribution. The UGV can operate at night and can go over piles of rubble as well as muddy and wet surfaces thanks to its caterpillar wheels.
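Voxel-based homogenization of this kind can be sketched as a centroid-per-voxel filter. This is a simplified illustration; the actual filtering pipeline of Section 2.2 may differ:

```python
import numpy as np

def voxel_downsample(points, voxel=0.005):
    """Keep one representative point (the centroid) per occupied voxel,
    giving a homogeneous point distribution before registration.

    points : (N, 3) array; voxel : edge length in the same units (0.005 m
    matches the 0.5 cm voxel size used in the text)."""
    keys = np.floor(points / voxel).astype(np.int64)   # voxel index per point
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.reshape(-1)
    counts = np.bincount(inv).astype(float)
    out = np.zeros((inv.max() + 1, 3))
    for dim in range(3):                               # centroid per voxel
        out[:, dim] = np.bincount(inv, weights=points[:, dim]) / counts
    return out
```

With the 0.5 cm voxel from the text, a 24-million-point raw scan collapses to the roughly 7 million homogeneously distributed points reported above.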

Automatic Registration
The automatic registration procedure described in Section 2.2 was applied to the six and eleven TLS point clouds acquired for each study case, after the application of the filtering procedure. That is, (i) one coarse registration based on the Harris 3D Noble detector and the PFH descriptor with which the point clouds are positioned close to their real position (Figure 10b) in the shared coordinate system; (ii) one fine registration procedure based on ICP for the positioning of the point clouds with millimetre precision (Figure 10c). The final 3D point clouds, after removing duplicated points, are formed by 44 million points and 73 million points for the first and second study cases, respectively.
Regarding processing time, the total automatic procedure required 23 and 48 min for the first and second study cases, respectively, on a PC Mountain Studio3D i5 Ivy with an Intel Core i5-3570K processor and 32 GB RAM running Windows 10 64 bits, with no need for interaction by the user.

Quality Control
The BIM models (Figure 8b,c) were used as reference for the quality control. More precisely, the registration of the BIM and the point clouds was done manually using a local datum established by the company, which corresponds to the definition of a local origin and the direction of the X and Y axes. The Z axis was always considered along the vertical. After that, two different approaches were applied to analyse the quality: (i) the models were compared with the designed building information model (BIM) to find visual discrepancies between the as-built and BIM models; (ii) the accuracy of the point clouds was checked against the BIM models. In particular, the accuracy of the point clouds was checked using two different dimensional analyses: (i) dimensional control of the pillars, (ii) distance between construction elements. For the dimensional control, the dimensions obtained from the automatically registered as-built model were compared to the dimensions measured in the reference BIM.
The reference BIM provides a theoretical model very close to the final result expected for the construction, and it is updated over time. This makes quality control possible during the whole process. The BIM reflects the geometry, metrical properties, material properties and construction phases of each element, together with the final layout of the building [61]. This way, the BIM is considered as a theoretical model in this case and provides the reference measurements for the structural elements chosen to validate the methodology in each phase.
However, the point clouds registered with the automatic procedure proposed contain a high amount of information, and their statistical distribution is often non-normal [62]. Thus, the statistical analysis performed was based on nonparametric estimators of the median, m, such as the normalized median absolute deviation (NMAD) (Equation (7)) and the bi-weight mid-variance (BWMV) (Equation (8)). These parameters allow us to perform a comparison and, consequently, a more precise and reliable quality control.
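For illustration, the robust estimators named here (NMAD and BWMV, with MAD as auxiliary quantity) can be computed as follows, using the standard definitions from the robust statistics literature; the function name is ours:

```python
import numpy as np

def robust_stats(x):
    """Median m, NMAD and bi-weight mid-variance (BWMV) of a sample,
    following the standard nonparametric definitions."""
    x = np.asarray(x, float)
    m = np.median(x)
    mad = np.median(np.abs(x - m))          # median absolute deviation
    nmad = 1.4826 * mad                     # NMAD: MAD scaled to a normal sd
    u = (x - m) / (9.0 * mad)               # normalized deviations
    a = (np.abs(u) < 1.0).astype(float)     # weights: discard far outliers
    n = x.size
    num = n * np.sum(a * (x - m) ** 2 * (1 - u ** 2) ** 4)
    den = np.sum(a * (1 - u ** 2) * (1 - 5 * u ** 2)) ** 2
    bwmv = num / den                        # bi-weight mid-variance
    return m, nmad, bwmv
```

Both estimators down-weight or ignore outliers, which is what makes them suitable for the heavy, non-normal discrepancy samples produced when comparing point clouds against the BIM.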
The estimators are defined as:

NMAD = 1.4826 · MAD  (7)

BWMV = n Σ a_i (x_i − m_x)² (1 − U_i²)⁴ / [Σ a_i (1 − U_i²)(1 − 5U_i²)]²  (8)

where:

a_i = 1 if |U_i| < 1; a_i = 0 otherwise  (9)

U_i = (x_i − m_x) / (9 · MAD)  (10)

being the median absolute deviation (MAD), defined by Equation (11), the median (m) of the absolute deviations from the median of the data (m_x), with x_i the measurements performed on the registered point cloud:

MAD = m(|x_i − m_x|)  (11)

Table 3 shows the results of bias and dispersion for the discrepancies of TLS vs. BIM for the two study cases. Considering that the approach developed provides values admissible for quality control tasks, a comparison between measurements on the point cloud registered with the method and the reference BIM measurements was performed. The elements where the errors in registration are most evident are the pillars. Table 4 shows the admissible tolerances for the most relevant structural elements in a building according to [26].

Table 4. Tolerances in the different structural elements according to [26]. "D" represents the transversal dimension of the elements.

Element                                      Tolerances in Construction Elements
Cross Section (D < 30)                       +10 / −8 mm
Cross Section (30 < D < 100)                 +12 / −10 mm
Cross Section (100 < D)                      +24 / −20 mm
Vertical deviation, outer edges of columns   ±6 mm
Wall Thickness < 25 cm                       +12 / −10 mm
Wall Thickness > 25 cm                       +16 / −10 mm

The validation focuses on the dimensions and the cross sections of the pillars. These elements are predefined, making them adequate as control elements. Seven pillars from the first study case and seven pillars from the second study case were subjected to measurements of their width at three different heights: low height at 0.20 m, medium height at 1.20 m and high height at 2.50 m from the floor. Each measurement was performed five times, and robust statistical estimators were calculated from these measurements. Table 5 shows the theoretical measurements (performed on the BIM) for each pillar and the measurements performed on the as-built model resulting from our methodology. Table 6 shows the results of the robust statistical parameters calculated from the dimensional analysis (median and BWMV). Provided that the tolerance for pillar section control is established at 8-10 mm (Table 4), the results obtained with the methodology proposed are applicable in this type of scenario. An additional test of the dimensions of one pillar in each study case was performed by adjusting planes to each face of the pillars (Figure 11). This test serves as an evaluation method of the quality of the results. This additional evaluation is necessary because the point cloud is formed by discrete points, and punctual measurements can give non-robust results, since they depend on the points used for the measurements.
The metrical verification in the planar case consists of computing the size of the pillars by measuring the distance between opposite sides and comparing the distances measured with the real dimensions of the pillars (Table 7). Thus, the width of the pillar is denoted as d1 (distance between planes 3 and 4), and the length of the pillar as d2 (distance between planes 1 and 2), as shown in Figure 11c.
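The plane-based width measurement can be sketched by fitting a least-squares plane to each face and projecting the offset between the faces onto the averaged normal. This is a simplified illustration of the idea, not the paper's exact procedure:

```python
import numpy as np

def fit_plane(pts):
    """Least-squares plane through pts: returns unit normal n and centroid c,
    so the plane is {x : n . (x - c) = 0} (smallest singular direction)."""
    c = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - c)
    return Vt[-1], c

def pillar_width(face_a, face_b):
    """Distance between two near-parallel opposite faces of a pillar,
    measured along the averaged face normal."""
    n1, c1 = fit_plane(face_a)
    n2, c2 = fit_plane(face_b)
    if np.dot(n1, n2) < 0:      # make both normals point the same way
        n2 = -n2
    n = (n1 + n2) / np.linalg.norm(n1 + n2)
    return abs(np.dot(n, c2 - c1))
```

Fitting planes to all points on a face averages out the discreteness of the cloud, which is exactly why this check is more robust than punctual measurements.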

Table 7. Result of the control measurements performed on pillar 3 (study case 1) and on pillar 9 (study case 2). The measurements correspond to distances between opposite sides of the pillar (Figure 11), comparing the BIM model and the as-built model.

Provided that the pillars are the elements most influenced by the results of the registration, and given the results obtained for them, the automatic registration methodology proposed presents adequate quality and precision for its application in quality control of construction works: the methodology allows for the measurement of the pillars with an error below 10 mm and the performance of any additional measurement from the point cloud.

Conclusions
This paper presents a new methodology for the automatic acquisition and registration of point clouds for indoor scenarios in construction works. The main contributions of the presented approach are two-fold: (i) the possibility of automatically capturing point clouds based on the Stop&Go method and using a UGV; (ii) a fully automatic point cloud registration methodology that works with any photogrammetric or laser scanning point clouds.
The methodology proposed presents an automatic data acquisition with a UGV equipped with a TLS and an automatic route planner able to calculate the best route without producing interferences during the construction works. The different point clouds acquired are automatically registered using a detector/descriptor combination. The integral methodology can be completely autonomous and automatic, with no need for artificial elements that disturb the construction scenario and no need for user interaction. The precision obtained in the automatic registration was evaluated through metric verifications on the resulting point cloud. The errors are below 1 cm in all the measurements performed. This precision validates the use of the methodology in quality control scenarios in construction works for the identification and control of construction elements.
The methodology presented for point cloud registration consists of three steps: (i) pre-processing of the data, filtering and homogenizing the point clouds, (ii) a coarse registration and (iii) a fine registration. The data were acquired with an autonomous unmanned ground vehicle with the Faro laser scanner mounted on it, applying the Stop&Go method. The UGV can operate at night and with no user intervention thanks to the path planner. The resultant point cloud has enough quality, density and precision for quality control in building works (less than 1 cm in section dimensions and less than 1.5 cm in wall thickness).
For the coarse registration, a combination of the Harris 3D detector and the PFH descriptor was used, since this combination allows the best detection of feature points on corners, edges and planes. These parametric forms are the most common in building scenarios and provide robust results. The detector/descriptor method can be applied to any set of point clouds, captured with any laser scanning system, whether a UGV is used or not.
In addition, the chosen detector/descriptor-based method uses only geometric information to compute the coarse alignment. This means that it is not necessary to have radiometric information (RGB colour), making the methodology more scalable and suitable for many more study cases. Along this line, the Harris 3D/PFH combination was chosen because it does not need images to perform the calculations, as other detectors and descriptors do, such as SIFT or SURF. Being independent of image capturing makes the methodology scalable and accessible for many more situations.
Despite its independence from the laser scanning system, the equipment used has some limitations. For example, it has not been designed to climb stairs, although it can go over small bumps such as planks. In addition to the TLS for data acquisition, the UGV is equipped with two SICK S300 Expert laser scanners integrated at the front and the back of the UGV that allow it to detect objects and holes. The combination of this system with the path planning makes it totally autonomous. The path planning incorporates an algorithm to recalculate the optimal route in those cases where the target point is inaccessible. This occurs frequently in building construction environments, where materials or debris can appear unexpectedly. In order to minimize the traffic of personnel, the data acquisition can be carried out at night, since the laser system is an active system and does not need light to operate.
Regarding future works, the first research line will improve the coarse registration for those cases with a considerable number of outliers. Furthermore, the recognition of artificial targets could be included to improve those cases where the coarse registration is not good enough. The second research line will focus on the automatic extraction of the semantics of the different construction elements from the point cloud and either the classification of the elements or the conversion of the as-built model to a surface 3D model or a BIM. The use of a BIM as reference was possible for this work, but usually a BIM model is not available, so the natural evolution of the as-built model generated automatically with the methodology proposed is the generation of the BIM.

Funding: The authors would like to thank the Junta de Castilla y León and the Fondo Social Europeo for the financial support given through programs for human resources (EDU/1100/2017).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.