Optimising Construction Site Auditing: A Novel Methodology Integrating Ground Drones and Building Information Modelling (BIM) Analysis

Guerrero-Sevilla, Diego; Rodríguez-Gómez, Rocío; Morcillo-Sanz, Alberto; Gonzalez-Aguilera, Diego

doi:10.3390/drones9040277

by Diego Guerrero-Sevilla ¹, Rocío Rodríguez-Gómez ², Alberto Morcillo-Sanz ¹ and Diego Gonzalez-Aguilera ¹,*

¹ Cartographic and Land Engineering Department, Higher Polytechnic School of Avila, Universidad de Salamanca, 37008 Salamanca, Spain

² Construction and Agronomy Department, Higher Polytechnic School of Zamora, Universidad de Salamanca, 37008 Salamanca, Spain

* Author to whom correspondence should be addressed.

Drones 2025, 9(4), 277; https://doi.org/10.3390/drones9040277

Submission received: 3 February 2025 / Revised: 24 March 2025 / Accepted: 2 April 2025 / Published: 4 April 2025

(This article belongs to the Special Issue Digital Twins and Extended Reality: Opportunities and Challenges of Integrated Applications (2nd Edition))

Abstract

Monitoring and management of construction sites are critical to ensuring project success, efficiency, and safety. Traditional methods often struggle to provide real-time, accurate, and comprehensive data, leading to delays, cost overruns, and errors. This paper presents a novel methodology utilising a ground drone for auditing construction sites to detect changes and deviations from planned Building Information Modelling (BIM). The methodology focuses on developing a novel tool that facilitates Scan-vs-BIM auditing through time. Experimental results are presented, demonstrating the effectiveness and accuracy of the proposed methodology for assessing structural discrepancies. This research contributes to advancing construction auditing practices by integrating state-of-the-art technologies and innovative techniques, ultimately enhancing project monitoring and management processes in the construction industry.

Keywords: ground drone; Scan-vs-BIM; automation in construction; artificial intelligence; software development

1. Introduction

Effective management of construction sites plays a crucial role in urban development, economic growth, and overall quality of life. Sustained and systematic maintenance of buildings is essential for ensuring structural safety, promoting sustainability, and improving long-term economic efficiency. As a result, the documentation and archival of construction and maintenance activities are imperative for accurately assessing structural conditions, identifying necessary repairs, and responding in a timely manner.

However, the prevalent approach for managing information related to building deterioration, maintenance history, and inspections has primarily been paper-based, manual, and inefficient [1]. Such methods pose risks of data corruption and loss during storage or handover, compromising management efficiency and accuracy. Furthermore, as construction projects become more complex and urban areas continue to expand, the volume of information that must be documented is increasing, leading to challenges such as the physical accumulation of paper records. This outdated approach to large-scale information management hinders quick decision-making and precise analysis due to difficulties in retrieving critical data when needed.

The increasing complexity of modern building construction projects has driven the need for more efficient data management systems. Traditional methods of record-keeping, such as paper logs or static spreadsheets, often fail to provide the level of organisation, accessibility, and analytical capability required for large-scale projects. As digital technologies continue to advance, construction professionals are turning to software solutions that enable better storage, tracking, and analysis of critical project data.

Spreadsheets automated with scripts or macros have allowed some tasks to be automated [2]. However, despite their flexibility, spreadsheets have inherent limitations when it comes to handling complex, interconnected data structures. Furthermore, construction projects involve multiple variables, such as structural integrity assessments, material performance data, and maintenance histories, which require a more dynamic and spatially aware approach to data visualisation. In fact, one of the key drawbacks of using spreadsheets in construction management is their inability to represent spatial relationships effectively. For instance, tracking the deterioration of structural components over time, identifying wear patterns in critical areas, or linking inspection reports to specific building elements can become cumbersome in a traditional spreadsheet format.

These challenges highlight the importance of adopting more specialised digital solutions tailored to construction projects, enabling better integration of spatial and analytical data for improved decision-making and project efficiency.

The construction industry is rapidly adopting advanced digital solutions to improve project efficiency and information management. One of the most transformative innovations in this field is Building Information Modelling (BIM), a technology that revolutionises how construction data are structured, visualised, and utilised. Unlike conventional methods that depend on static 2D blueprints and written specifications, BIM creates a dynamic 3D representation of a building, integrating both its physical attributes and functional components. This allows professionals to comprehensively assess various aspects of a structure, including materials, mechanical systems, and architectural design, within a highly interactive model [3]. Beyond its role in the design and construction phases, BIM also enhances long-term facility management by consolidating essential data, such as maintenance records, structural assessments, and lifecycle predictions. The ability to centralise and visualise this information in a detailed digital model improves decision-making, reduces errors, and facilitates more effective coordination among project stakeholders. As a result, BIM is increasingly recognised as an indispensable tool for optimising both the construction and post-construction management of buildings, ensuring greater accuracy, efficiency, and sustainability throughout a structure’s lifespan.

However, the major challenge in applying the BIM methodology lies in its adaptation to real-world conditions. The models generated (as-built) often exhibit inconsistencies and become prone to errors as inevitable changes occur throughout the construction process [4,5]. The key to addressing this issue is a mechanism capable of synchronising BIM with the actual state of the project in real time, ensuring accuracy and reliability in decision-making.

2. Related Works

The Scan-vs-BIM methodology has emerged as a crucial approach in construction sites, helping to minimise discrepancies between the as-built structure and its original design within a BIM framework [6]. By comparing real-world construction data with the idealised digital model, this technique enables early detection of deviations, reducing costly errors and rework [7]. Studies have demonstrated that implementing this methodology provides significant financial benefits for engineering firms by improving accuracy and efficiency in construction quality control. To better understand the demands of the construction sector in this area, extensive research has been conducted, highlighting key challenges associated with this approach [8]. One of the primary difficulties lies in the accurate collection and processing of real-world data [9], which often contain inconsistencies such as sensor noise, misalignment, and georeferencing inaccuracies. Regarding data collection, Scan-vs-BIM relies on advanced geospatial technologies, including the generation of 3D point clouds from digital cameras [10] or terrestrial laser scanners (TLS) [11,12]. However, updating the real data is often costly and sometimes inefficient (e.g., many scan stations or images are required), which propagates errors and thus reduces accuracy.

The need to quantify Scan-vs-BIM discrepancies has become the subject of numerous investigations [13,14,15,16]. As a result, various methods have been proposed to refine the Scan-vs-BIM methodology and enhance its reliability. Some researchers focus on improving data acquisition through image mapping, including the use of images for metric quality control of structural elements [17,18,19,20], while others seek to improve TLS data acquisition, accurately collecting point clouds and comparing them with existing BIMs [21,22,23] during each phase of the building life cycle. In fact, the effectiveness of combining TLS and BIM has been demonstrated for preserving unconventional heritage structures, such as bridges [24,25], as well as for predicting the accuracy of the erection of pre-fabricated buildings [26], underground mining [27], and building maintenance [28]. The combination of TLS and BIM provides many known advantages over other possible methodologies for the inspection of newly constructed buildings, including obtaining object data at a high level of detail with various types of information [29].

Regarding Scan-vs-BIM data processing, other authors have focused on automating the extraction of basic geometries for comparison with BIM. For instance, the authors of [30] developed an automated Scan-vs-BIM methodology for geometry detection and BIM updating of steel framing, significantly improving efficiency in point cloud processing and component alignment. The innovation lies in the use of genetic algorithms for instance segmentation, enabling precise axis extraction and pose transformation updates in IFC-based BIM models. However, a key limitation is that the methodology primarily updates component positioning and does not account for complex bending or twisting deformations in the BIM model, limiting its applicability to scenarios where structural elements undergo significant distortions. Other authors [31] introduced a Scan-vs-BIM methodology to automate the identification and quantification of dismantling elements in a nuclear power plant, significantly reducing errors caused by outdated drawings and human estimations. However, the approach still relies on manual verification and classification during BIM reconstruction, which may introduce human errors and inconsistencies, limiting full automation and scalability for large-scale projects.

Apart from the advances in Scan-vs-BIM data processing remarked above, one of the biggest challenges in Scan-vs-BIM is passing from point clouds to BIM models in an automatic or semi-automatic way. Some recent works report several contributions in this domain. For instance, Yunping et al. [32] introduced an automated Scan-to-BIM framework that reconstructs BIM models from defective point clouds and enables bidirectional integration between BIM and structural simulations of bridges. However, its accuracy remains sensitive to input data quality, and it lacks a detailed representation of complex architectural features and structural defects, which limits its applicability to intricate structures and predictive maintenance. For their part, Mun et al. [33] developed an image-based Scan-to-BIM approach that integrates photogrammetry, instance segmentation, and camera projection to enhance interior building component reconstruction with centimetre-level accuracy using widely available devices. However, its reliance on image-based point clouds makes it sensitive to occlusions, reflections, and missing data, while its assumptions of planarity and rectangularity limit its applicability to irregular or highly complex indoor environments. To deal with complex geometries and structures, Garcia-Gago et al. [34] presented a Historical Building Information Modelling (HBIM) framework designed to enhance the diagnosis and conservation of historical buildings, effectively addressing the challenge of complex geometries in heritage structures. By integrating geometric and radiometric data from point clouds, material analysis, and non-destructive testing, the methodology enables a highly detailed and structured BIM environment capable of representing intricate architectural elements and structural deformations.

Building on the advancements described in the state of the art, this research establishes a concrete Scan-vs-BIM methodology that introduces key improvements in two fundamental areas: (i) in situ data acquisition and (ii) data processing. While previous approaches have relied on manual or semi-automated scan data processing, this study aims to enhance both automation and accuracy by integrating cutting-edge technologies throughout the workflow. A central innovation of this methodology is the use of a ground drone equipped with a TLS to optimise data acquisition. Unlike conventional static scanning methods, which can be limited in complex construction environments, this approach enables systematic, high-precision capture of building conditions. By improving accessibility and scan coverage, it ensures more accurate documentation of intricate geometries.

Beyond acquisition, this study advances the data processing phase of Scan-vs-BIM workflows. It introduces automated filtering techniques for noise reduction, AI-driven semantic classification, and an automatic 3D alignment method, significantly reducing the need for manual intervention. These improvements enhance the efficiency of segmenting, analysing, and processing point cloud data, ultimately streamlining the entire evaluation process. To translate these advancements into actionable insights, the methodology integrates results directly within BIM Collaborate Pro software, enabling automated model comparisons (Scan-vs-BIM). This integration facilitates precise structural discrepancy detection, enhancing quality control, deviation analysis, and automated decision-making in construction monitoring.

3. Materials and Methods

The development of a novel methodology integrating a terrestrial drone with BIM analysis is outlined in Figure 1. The first stage involves designing the terrestrial drone, along with its modular structure and interactions required to acquire the data. Once the data are collected, a second stage of data processing is carried out. In this stage, three algorithms have been developed: anisotropic filtering for noise removal, automatic 3D alignment, and semantic classification.

The processing stage generates the as-built model, commonly referred to as the point cloud model. Subsequently, the final comparison process between the as-built point cloud and the BIM (Scan-vs-BIM) is conducted using proprietary software. This software simplifies the application of the methodology and enables the detection of changes as well as the evaluation of the building works’ progress.

3.1. Ground Drone

To acquire the data and subsequently generate the point cloud, a commercial ground drone platform (Robotnik©), shown in Figure 2, is used. The platform integrates the required equipment: a control PC that runs the software necessary for navigation and mapping, a safety laser scanner, the TLS, and the wiring for their power supply. In addition, the ground drone features a rotation capability that facilitates navigation and improves access to the capture points, which are pre-calculated by the data acquisition assistant [35]. Various mechanical and electrical modifications are made to this commercial platform to house the different elements that must be on board the ground drone, allowing points to be captured safely.

The mobile robotic platform, model Guardian, is manufactured by the company Robotnik. It is equipped with wheels and tracks for movement, providing robustness on uneven terrain. The characteristics of this platform are described in Table 1.

The geometric configuration of the ground drone is particularly important for accurate data collection because it directly affects the field of view of the TLS. The TLS is located approximately 0.975 m above the ground to avoid acquiring points from the robotic platform itself, which would constitute noise in the final point cloud, and an angle of 45° with respect to the vertical axis of the system is ignored during the acquisition process. In addition to the field of view, the TLS used in this research, a Faro Focus 3D x330, allows other acquisition parameters to be configured, such as resolution, quality, and image acquisition. These parameters are closely related to the acquisition time and the volume of data generated, which is of vital importance not only in mission planning but also in data processing and management. A final parameter to be considered for the optimal location of the TLS stations is the measurement range of the laser scanner, which in this case varies between 0.6 m and 330 m. As the acquisition is performed inside buildings, the measurement range is limited to 10 m, thus avoiding the capture of distant points, which may constitute noise in the final point cloud.

The laser scanner technology integrated into the ground drone is designed to ensure the safety of people during the data collection stage and to accurately localise the working environment with respect to the provided reference map. Additionally, two safety lasers are mounted on the ground drone, one at the back and one at the front, in order to cover the entire construction environment. The TLS captures three-dimensional data for each area of the building, resulting in the primary point cloud. For its installation on the ground drone platform, a mechanical support is manufactured to place the TLS 1 m above the ground, vertically, at the centre of the ground drone, giving a field of view of 270° vertically and 360° horizontally. The assembled ground drone with all the sensors listed can be seen in Figure 3.
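
The range limit and the ignored angular sector described above can be illustrated with a small point filter. This is only a sketch under stated assumptions: points are expressed in a scanner-centred frame with z pointing up, the ignored 45° is interpreted as a cone of 45° half-angle around the downward vertical (consistent with a 270° vertical field of view), and the function name `keep_point` and its parameters are hypothetical, not part of the scanner software.

```python
import math

def keep_point(p, min_range=0.6, max_range=10.0, blind_half_angle_deg=45.0):
    """Decide whether a point given in the scanner frame (z up, metres) is kept."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    # Discard points closer than the scanner minimum or beyond the indoor limit.
    if r < min_range or r > max_range:
        return False
    # Angle between the ray and the downward vertical axis (-z), in degrees.
    angle_from_down = math.degrees(math.acos(max(-1.0, min(1.0, -z / r))))
    # Assumption: rays inside the cone below the scanner hit the platform itself.
    return angle_from_down > blind_half_angle_deg

points = [(1.0, 0.0, 0.0), (0.1, 0.0, -0.9), (12.0, 0.0, 1.0), (0.0, 2.0, 3.0)]
kept = [p for p in points if keep_point(p)]
```

In this toy input the second point falls inside the assumed blind cone and the third exceeds the 10 m limit, so only the first and last points are kept.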

The main characteristics of the lasers and cameras selected to be part of the ground drone, which have been considered in the geometric and structural configuration of the drone, are summarised below in Table 2.

3.2. Noise Filtering: Anisotropic Filtering

The initial stages of point cloud preprocessing involve a filtering process designed to reduce noise in the point clouds. One of the most widely used filtering methods is the Gaussian filter [36,37], which is based on the heat equation. However, it has fallen into disuse because it tends to reduce point cloud sharpness, often leading to the loss of details that may be critical in the context of point clouds. To improve this important preprocessing step, the authors of [38] proposed a diffusion model, termed anisotropic, based on a nonlinear variant of the heat equation, from which a filter can be constructed that minimises such loss of detail while still correcting image noise.

A particular adaptation of the anisotropic filtering method is used in this research but in a 3D domain. This anisotropic filtering technique aims to reduce the noise of 3D point clouds without losing relevant information. It is based on an iterative process of local spatial diffusion, where the points are moved considering the density of the point cloud.

3.2.1. Density Scalar Field

For the creation of an adaptive anisotropic filtering algorithm, the points in less-populated areas are moved closer to more populated areas, thereby smoothing the point cloud. To do this, a scalar field of densities $\phi$ is needed. First, the domain of the function, $\Omega \subset \mathbb{R}^{3}$, must be defined. For this purpose, a 3D grid containing the point cloud is created. The voxel size can be estimated from the distances to the K nearest neighbours (K-NN) of the points in the cloud. The scalar field $\phi(x, y, z; t)$, which associates a density with each voxel of the grid, is then found by solving the anisotropic diffusion equation (Equation (1)):

$$\frac{\partial \phi}{\partial t} = \nabla \cdot \left[ c(\|\nabla \phi\|)\, \nabla \phi \right] = \nabla c \cdot \nabla \phi + c(\|\nabla \phi\|)\, \Delta \phi \tag{1}$$

where $\nabla \cdot$, $\nabla$, and $\Delta$ are the divergence, gradient, and Laplacian operators, respectively, and $c$ is the diffusion coefficient, which depends on the norm of the gradient of $\phi$ and controls the rate of diffusion. It is defined as (Equation (2)):

$$c(\|\nabla \phi\|) = e^{-\left( \frac{\|\nabla \phi\|}{k} \right)^{2}} \tag{2}$$

where the parameter $k$ controls the sensitivity to edges or sharpness.

To solve the previous equation, the following boundary and initial conditions are set:

It is assumed that $\phi(i, j, k; t) = 0$ on $\partial\Omega$.
It is assumed that $\phi(i, j, k; 0) = |v_{ijk}|$, where $|v_{ijk}|$ represents the number of points inside the voxel $v_{ijk}$.

The spatial derivatives are approximated with a finite difference scheme. It is also possible to use forward and backward differences at the boundary instead of setting $\phi(i, j, k; t) = 0$.

Since $\frac{\partial \phi}{\partial t} \approx \frac{\phi^{t+1} - \phi^{t}}{\Delta t}$, the anisotropic diffusion equation can be rewritten as follows (Equation (3)):

$$\phi_{ijk}^{t+1} = \phi_{ijk}^{t} + \Delta t \left( \nabla_{h} c \cdot \nabla_{h} \phi + c\, \Delta_{h} \phi \right)_{ijk}^{t} \tag{3}$$

where $\nabla_{h}$ and $\Delta_{h}$ are the finite difference approximations of the gradient and Laplacian, and $\Delta t$ is the time increment for each iteration.
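
As an illustration of the explicit update in Equation (3), the following NumPy sketch builds the initial density grid and performs one diffusion step. It is not the authors' implementation: the function names, the unit voxel spacing, the central differences from `np.gradient`, the 7-point Laplacian, and the one-voxel padding used to hold the zero boundary condition are all assumptions made for the example.

```python
import numpy as np

def density_grid(points, voxel_size):
    """Initial condition: number of points per voxel, padded by one empty voxel."""
    idx = np.floor((points - points.min(axis=0)) / voxel_size).astype(int)
    shape = tuple(idx.max(axis=0) + 3)       # +1 for the count, +2 for padding
    phi = np.zeros(shape)
    np.add.at(phi, tuple((idx + 1).T), 1.0)  # accumulate repeated voxel indices
    return phi

def diffusion_step(phi, k, dt):
    """One explicit update of Equation (3) on a grid with unit voxel spacing."""
    grad = np.gradient(phi)                            # central differences
    grad_norm = np.sqrt(sum(g ** 2 for g in grad))
    c = np.exp(-(grad_norm / k) ** 2)                  # Equation (2)
    grad_c = np.gradient(c)
    # 7-point Laplacian; wrap-around from np.roll only touches boundary voxels,
    # which are reset to zero below.
    lap = -6.0 * phi + sum(np.roll(phi, s, axis=a) for a in range(3) for s in (1, -1))
    new = phi + dt * (sum(gc * g for gc, g in zip(grad_c, grad)) + c * lap)
    # Boundary condition: phi = 0 on the border of the grid.
    new[[0, -1], :, :] = 0.0
    new[:, [0, -1], :] = 0.0
    new[:, :, [0, -1]] = 0.0
    return new
```

Because the scheme is explicit, a small time increment is needed for stability; with unit spacing and a diffusion coefficient of at most 1, values of `dt` below roughly 1/6 are a common rule of thumb for a 3D grid.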

3.2.2. Point Approximation

By moving points from less-populated areas (noisy points) towards the most-populated areas (non-noisy points), on the basis of the diffusion Equation (1), it is possible to smooth the scalar field $\phi$ and remove abrupt changes of density without losing information. This translation is computed using a gradient ascent step. Since the gradient of $\phi$ points towards a local maximum of density, it inherently points towards regions where the data points tend to be less noisy. The update is given by the following expression (Equation (4)):

$$p_{ijk}^{t+1} = p_{ijk}^{t} + \gamma_{ijk}^{t}\, \nabla_{h} \phi_{ijk}^{t} \tag{4}$$

where $p_{ijk}^{t}$ are the points inside the voxel $v_{ijk}$ and $\gamma_{ijk}^{t}$ is the step size, which can be estimated using a Gaussian function of the density mean and standard deviation (Equation (5)). This adaptive step size ensures that noisy points are adjusted precisely, avoiding excessively small or large steps.

$$\gamma_{ijk}^{t} = \frac{1}{\sigma^{t} \sqrt{2\pi}} \exp\left[ -\frac{\left( \max(\phi_{ijk}^{t}, \mu^{t}) - \mu^{t} \right)^{2}}{2 (\sigma^{t})^{2}} \right] \tag{5}$$

where $\mu^{t}$ is the mean and $\sigma^{t}$ the standard deviation of the densities (Equations (6) and (7)):

$$\mu^{t} = \frac{1}{|v_{ijk}|} \sum_{ijk} \phi_{ijk}^{t} \tag{6}$$

$$\sigma^{t} = \sqrt{\frac{1}{|v_{ijk}| - 1} \sum_{ijk} \left( \phi_{ijk}^{t} - \mu^{t} \right)^{2}} \tag{7}$$

In Equations (6) and (7), the normalising term $|v_{ijk}|$ is understood as the number of voxels over which the sum runs.

Using $\max(\phi_{ijk}^{t}, \mu^{t})$ inside the Gaussian ensures that voxels whose density is at or below the mean always receive the maximum step size when their points are moved.

In summary, for each iteration of the algorithm, the smoothed densities are computed, and the points are approximated based on these densities. In this manner, the directions in which the points move are updated with each iteration, resulting in a smoothed point cloud.
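
The point translation of Equations (4) to (7) can be sketched as follows. This is an illustrative sketch rather than the published code: the names `step_sizes` and `move_points` are hypothetical, the mean and standard deviation are taken over all voxels of the grid, the gradient is expressed in voxel units, and each point is assumed to come with the integer index of the voxel that contains it.

```python
import numpy as np

def step_sizes(phi):
    """Equation (5): Gaussian step size per voxel from the density statistics."""
    mu = phi.mean()                        # Equation (6), mean over all voxels
    sigma = phi.std(ddof=1)                # Equation (7), sample standard deviation
    excess = np.maximum(phi, mu) - mu      # zero wherever density <= mean
    return np.exp(-excess ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

def move_points(points, voxel_index, phi):
    """Equation (4): shift each point along the density gradient of its voxel."""
    grad = np.stack(np.gradient(phi), axis=-1)     # shape (nx, ny, nz, 3)
    gamma = step_sizes(phi)
    i, j, k = voxel_index.T
    return points + gamma[i, j, k][:, None] * grad[i, j, k]
```

In a full iteration, the density field would first be smoothed with the diffusion update and the points then moved with `move_points`, after which the voxel indices of the moved points would need to be recomputed.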

3.3. Automatic 3D Alignment

Since the ground drone does not use SLAM and relies on Stop&Go data acquisition, the acquired point clouds, although well levelled (i.e., they preserve the vertical direction), need to be aligned in the same local coordinate system. To this end, the point cloud alignment follows a twofold process: (i) first, a ‘coarse alignment’, or first approximation, is carried out based on a 3D detector and descriptor; and (ii) subsequently, a ‘fine alignment’ refines this transformation with an iterative method until the minimum distance between overlapping areas is found. Both alignments make use of six-parameter rigid-body transformations (i.e., three rotations and three translations).

In particular, the proposed automatic 3D alignment methodology employs the Harris 3D detector [39] alongside the point feature histogram (PFH) descriptor [40]. To refine the alignment, an Iterative Closest Point (ICP) algorithm is applied [41].

The Harris 3D detector yields favourable results in construction scenarios due to its suitability for identifying points on pillars, vertical walls, and ceilings by leveraging corners and edges. It operates efficiently, detecting large sets of interest points with good correspondence. The Harris 3D detector offers several variations, depending on how the trace and determinant (det) of the covariance matrix (Cov) are evaluated. In all variations, the response of the key points r(x,y,z) is computed, but distinct criteria are applied in each case. In our case, the Harris 3D Noble detector is applied (Equation (8)).

$$r(x, y, z) = \frac{\det(\mathrm{Cov}(x, y, z))}{\mathrm{trace}(\mathrm{Cov}(x, y, z))} \tag{8}$$

The Harris 3D detector is chosen for its focus on detecting corners and edges, while the PFH descriptor was selected for its robustness to viewpoint changes and ability to provide reliable information between point clouds. This makes it ideal for construction scenarios with varying viewpoints, lighting changes, and good scan overlap.
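
The Noble response of Equation (8) can be illustrated with a few lines of NumPy. The text does not specify how the matrix Cov is assembled, so this sketch assumes one common choice: the uncentred second-moment matrix of the unit surface normals in the neighbourhood of the candidate point. The function name `noble_response` and the `eps` guard are hypothetical.

```python
import numpy as np

def noble_response(normals, eps=1e-12):
    """Noble response det/trace for the normals around one candidate point."""
    n = np.asarray(normals, dtype=float)
    cov = n.T @ n / len(n)                 # assumed 3x3 second-moment matrix
    trace = np.trace(cov)
    # Guard against a degenerate neighbourhood with (near) zero trace.
    return 0.0 if trace < eps else float(np.linalg.det(cov) / trace)

plane = [[0, 0, 1]] * 6                               # one flat surface
edge = [[1, 0, 0]] * 3 + [[0, 1, 0]] * 3              # two planes meeting
corner = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 2        # three planes meeting
responses = [noble_response(v) for v in (plane, edge, corner)]
```

Under this assumption only the corner, where normals span all three directions, yields a clearly positive response, whereas a single plane or a straight edge gives a response of about zero.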

The 3D PFH descriptor characterises each point based on its neighbours by generalising normal vectors (n) and mean curvature (α, ϕ, θ). These four parameters create a unique invariant signature (i.e., descriptor) for every point. For two points, p and q, a reference system of three unit vectors (u, v, and w) is established. The difference between the normal vectors at p ($n_p$) and q ($n_q$) is then described using the normal vectors and three angular variables (Equations (9)–(11)), with d representing the distance between p and q.

$$\alpha = \arccos(v \cdot n_{q}) \tag{9}$$

$$\phi = \arccos\left( u \cdot \frac{p - q}{d} \right) \tag{10}$$

$$\theta = \arctan(w \cdot n_{p},\; u \cdot n_{p}) \tag{11}$$

with

$$v = u \times \frac{p - q}{d}; \quad w = u \times v; \quad d = \|p - q\|_{2}$$

The 3D PFH descriptor is applied to all feature points identified by the Harris 3D Noble detector, associating each point with a description. These descriptions are used to find correspondences between points from different point clouds. Identifying corresponding points enables the calculation of the transformation needed to align the point clouds within the same local coordinate reference system.
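
To make the pair features concrete, the following pure-Python sketch computes α, ϕ, θ and d for a single pair of oriented points. It follows the widely used Darboux-frame formulation under assumptions that the text does not state: u is taken as the unit normal at p, v is normalised, and θ is evaluated against $n_q$ (with u = $n_p$, evaluating it against $n_p$ would always return zero). The helper names are hypothetical, and a full PFH would additionally bin these values into a histogram over all pairs in a neighbourhood.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    if n == 0.0:
        raise ValueError("degenerate pair: normal parallel to the line p-q")
    return tuple(x / n for x in a)

def pair_features(p, n_p, q, n_q):
    """Angular features (alpha, phi, theta) and distance d for points p and q."""
    diff = tuple(a - b for a, b in zip(p, q))
    d = math.sqrt(dot(diff, diff))
    line = tuple(x / d for x in diff)          # (p - q) / d
    u = unit(n_p)                              # assumption: u is the normal at p
    v = unit(cross(u, line))
    w = cross(u, v)
    clamp = lambda x: max(-1.0, min(1.0, x))   # protect acos from rounding
    alpha = math.acos(clamp(dot(v, n_q)))
    phi = math.acos(clamp(dot(u, line)))
    theta = math.atan2(dot(w, n_q), dot(u, n_q))
    return alpha, phi, theta, d

features = pair_features((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```

For two points on the same flat surface with identical normals, as in this usage example, α and ϕ are right angles and θ is zero, which is the signature expected of a planar patch.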

After computing the Harris 3D detector and PFH descriptor and applying an initial coarse alignment to all the point clouds, some are correctly aligned while others are not. This misalignment is due to similarities between scans, which lead to imperfect correspondences. Despite this, the point clouds are brought close to their real positions, allowing a single unified 3D point cloud to be created through a final alignment process that corrects these errors. The final adjustment is performed using the ICP algorithm, an iterative method that minimises the distance error between point clouds until the alignment converges. Each iteration involves three steps: (i) identifying pairs of corresponding points, (ii) calculating the transformation that minimises the distance between these points, and (iii) applying the transformation to the point clouds. Because ICP is iterative and converges only locally, a good initial approximation is crucial for convergence. This initial estimate is provided by the coarse alignment, which brings the point clouds close to their final positions.

The main advantage of the two-step Harris 3D/PFH and point-to-point ICP registration is its fully automated nature. It eliminates the need for user interaction or the placement of targets in the scene as the method relies on naturally occurring points of interest within the environment.
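
The three steps of each point-to-point ICP iteration can be sketched with NumPy as follows. This is a minimal illustration, not the implementation used in the study: correspondences are found by brute-force nearest neighbour search, the rigid transformation is estimated with the SVD-based (Kabsch) least-squares solution, and the function names, iteration count and tolerance are assumptions. It presumes that a coarse alignment has already brought the clouds close together.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t is close to dst_i (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), excluding reflections.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=50, tol=1e-10):
    """Point-to-point ICP; returns the accumulated R, t and the aligned source."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur, prev_err = src.copy(), np.inf
    for _ in range(iters):
        # (i) pair every source point with its nearest destination point
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        err = d2[np.arange(len(cur)), nn].mean()
        # (ii) transformation minimising the distance between the pairs
        R, t = best_rigid_transform(cur, dst[nn])
        # (iii) apply it and accumulate the total transformation
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, cur
```

The brute-force search builds a full distance matrix, so for real scans a spatial index such as a k-d tree and a subsampled cloud would normally be used instead.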

3.4. Semantic Classification

The process of point cloud semantic classification is essentially an automatic process of segmenting the point cloud based on geometric and radiometric features and statistical rules rather than explicit supervised learning or prior knowledge [42]. This task can be performed using machine learning [43,44] or more specifically, within the subfield of deep learning [45,46], given the volume of data extracted from common point clouds. Current research on point cloud segmentation using deep learning methodologies focuses on weakly supervised semantic segmentation, domain adaptation in semantic segmentation, semantic segmentation based on multimodal data fusion, and real-time semantic segmentation [47].

The methodology employed for the semantic classification of the point clouds obtained in this research is based on deep learning and, in particular, on the combination of geometric, radiometric, and topological features. The process, which has yielded satisfactory results when comparing the classified point cloud structures with their equivalent BIMs, is detailed below.

3.4.1. Neural Network Architecture

The proposed neural network architecture for 3D point cloud classification is designed to take advantage of the radiometric, geometric, and topological features of each point (Figure 4). These features constitute the input to the model and play a crucial role in capturing the underlying structure and context within the point cloud. In the following, the architecture is detailed, with special emphasis on the multi-head attention mechanism, which is fundamental for modelling the complex interactions between the points.

The model input consists of feature vectors associated with each point in the cloud. Each vector includes radiometric (colour, intensity, and/or reflectance), geometric (geometric features extracted from the covariance matrix), and topological (distance to the plane, height above, and height below) information. These multidimensional features are essential to fully describe the nature of the point and its neighbourhood.

At the core of the architecture, there is a multi-head attention layer (Figure 4), which allows the model to learn both local and global relationships between points in the cloud. This mechanism operates by assigning adaptive weights to interactions between points, allowing the network to identify relevant patterns and key relationships in the data. This layer is implemented by dividing the internal feature representations into multiple subspaces (i.e., attention heads), each focusing on different aspects of the interactions. This ensures that the model captures a diversity of spatial and contextual relationships. By combining the outputs of all heads, the model achieves a rich and robust representation of the dependencies between points.

After multi-head attention, a layer normalisation is applied to stabilise the training and improve the convergence of the model. The processed features are flattened to connect them to the subsequent dense layers.
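
The attention and normalisation stages can be sketched in NumPy as below. The exact formulation used in the network is not given in the text, so this example assumes the standard scaled dot-product self-attention over a set of per-point feature vectors; the weight matrices `Wq`, `Wk`, `Wv`, `Wo`, the function names, and the absence of learned scale and shift in the normalisation are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, heads):
    """Self-attention over n points with d features, split into `heads` subspaces."""
    n, d = X.shape
    dh = d // heads                                   # features per head

    def split(M):                                     # (n, d) -> (heads, n, dh)
        return M.reshape(n, heads, dh).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    # Adaptive weights between every pair of points, one matrix per head.
    weights = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(dh))
    # Concatenate the head outputs and mix them with the output projection.
    merged = (weights @ V).transpose(1, 0, 2).reshape(n, d)
    return merged @ Wo, weights

def layer_norm(X, eps=1e-6):
    """Normalise each feature vector to zero mean and unit variance."""
    mean = X.mean(axis=-1, keepdims=True)
    var = X.var(axis=-1, keepdims=True)
    return (X - mean) / np.sqrt(var + eps)
```

Each row of a head's weight matrix sums to one, so every output vector is a weighted average of the value vectors of all points, which is how the layer relates a point to both near and distant neighbours.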

The final part of the model consists of several fully connected dense layers. These layers are designed to perform classification based on the processed features. To mitigate overfitting, regularisation techniques such as L2 penalty and dropout are employed, which improve the generalisation of the model.

All the outputs of each layer are evaluated in the rectified linear unit (ReLU) function (Equation (12)):

R e L U (z) = R (z) = m a x (0, z)

(12)

In simple terms, the softmax function applies the exponential function to each element z_i of the input vector z. It then normalises these values by dividing each exponential by the sum of all exponentials in the vector. This normalisation ensures that the components of the output vector σ(z)_i sum to 1.
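The softmax computation described above can be sketched in a few lines. The subtraction of the maximum is a standard numerical-stability step that does not change the result; the input vector is an arbitrary example.

```python
import numpy as np

def softmax(z):
    # Shifting by max(z) leaves the result unchanged but avoids overflow in exp.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
probs = softmax(z)                 # class probabilities that sum to 1
```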

In point cloud classification problems, the Categorical Cross Entropy loss function is commonly used due to its effectiveness in measuring the difference between the actual class distribution and the model’s predictions. It is rooted in the concepts of entropy and probability, making it well suited for this type of task. Then, the Categorical Cross Entropy function is defined as Equation (13):

L = -\sum_{i} y_{i} \log(\hat{y}_{i})

(13)

where y_{i} represents the actual class label, and \hat{y}_{i} denotes the prediction (probability between 0 and 1).
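As a worked illustration of Equation (13), the sketch below evaluates the loss for a one-hot label and two hypothetical predictions. The clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot label; y_pred: predicted class probabilities.
    # Clipping avoids log(0) when a probability is exactly zero.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 1.0, 0.0])
confident = categorical_cross_entropy(y_true, np.array([0.05, 0.90, 0.05]))
unsure = categorical_cross_entropy(y_true, np.array([0.40, 0.30, 0.30]))
```

The confident, correct prediction yields the smaller loss (here -log 0.9), while the unsure prediction is penalised more heavily.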

The training consists of two phases, forward propagation and back propagation. Forward propagation is based on everything defined above, i.e., the output z is calculated, evaluated in the activation function, and, finally, with the output of the last layer, the loss function L is calculated. Back propagation is the weight adjustment process that minimises the loss function. It is applied after forward propagation. For this purpose, the following Gradient Descent method is used:

The gradient of L with respect to the weights w and biases b is calculated (Equation (14)).

\nabla L = (\frac{\partial L}{\partial w}, \frac{\partial L}{\partial b}) = - (\frac{\partial}{\partial w} \sum_{i} y_{i} \log ({\hat{y}}_{i}), \frac{\partial}{\partial b} \sum_{i} y_{i} \log ({\hat{y}}_{i}))

(14)

The weights and biases are updated using the negative gradient multiplied by the learning rate γ (Equations (15) and (16)):

w_{t + 1} = w_{t} - γ \frac{\partial L}{\partial w}

(15)

b_{t + 1} = b_{t} - γ \frac{\partial L}{\partial b}

(16)
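The update rules in Equations (15) and (16) can be illustrated with a single softmax layer trained by plain gradient descent. This is a toy sketch, not the training code of the study: the synthetic data, the learning rate `gamma = 0.5`, and the number of epochs are assumptions, and the Adam optimiser reported in the results is deliberately replaced here by the basic update for clarity.

```python
import numpy as np

def softmax_rows(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grads(X, Y, w, b):
    # Forward propagation: linear layer, softmax, mean cross-entropy.
    y_hat = softmax_rows(X @ w + b)
    loss = -np.mean(np.sum(Y * np.log(y_hat + 1e-12), axis=1))
    # Back propagation: for softmax + cross-entropy, dL/dz = (y_hat - y) / n.
    dz = (y_hat - Y) / X.shape[0]
    return loss, X.T @ dz, dz.sum(axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))                 # 60 points, 4 features (toy data)
labels = (X[:, 0] > 0).astype(int)           # toy rule defining two classes
Y = np.eye(2)[labels]                        # one-hot labels
w, b = np.zeros((4, 2)), np.zeros(2)
gamma = 0.5                                  # learning rate (assumed value)

losses = []
for epoch in range(100):
    loss, dw, db = loss_and_grads(X, Y, w, b)
    losses.append(loss)
    w = w - gamma * dw                       # Equation (15)
    b = b - gamma * db                       # Equation (16)
```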

The process of forward propagation and back propagation is repeated several times (epochs) until the loss function L is minimised. After completing the training process, the neural network’s neurons and weights are adjusted to minimise the output error (i.e., the loss function is minimised). Once trained, it is sufficient to input the geometric, radiometric, and topological features of the points to be classified into the input layer and perform forward propagation again. The classification is then determined based on the predictions generated by the output layer.

3.4.2. Geometric and Topological Features

The geometric features of a point are given by the eigenvalues and eigenvectors of the covariance matrix Σ_{N} of its neighbourhood N (Equation (17)):

\Sigma_{N} = \begin{pmatrix} S_{xx} & S_{xy} & S_{xz} \\ S_{yx} & S_{yy} & S_{yz} \\ S_{zx} & S_{zy} & S_{zz} \end{pmatrix}, \quad S_{xy} = \frac{1}{n - 1} \sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y}), \quad \bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}

(17)

where x = \{x_{1}, x_{2}, \dots, x_{n}\}, y = \{y_{1}, y_{2}, \dots, y_{n}\}, z = \{z_{1}, z_{2}, \dots, z_{n}\} are the x, y and z coordinates of each point of the neighbourhood N.
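A minimal sketch of Equation (17) follows. It builds the covariance matrix of a synthetic neighbourhood and extracts its eigenvalues and eigenvectors. The planarity ratio shown is a commonly used eigenvalue-based descriptor and is included only as an example; whether it is among the features of Table 3 is not assumed here.

```python
import numpy as np

def covariance_eigen(neighbourhood):
    # neighbourhood: (n, 3) array of x, y, z coordinates.
    centred = neighbourhood - neighbourhood.mean(axis=0)
    cov = centred.T @ centred / (len(neighbourhood) - 1)   # Equation (17)
    eigvals, eigvecs = np.linalg.eigh(cov)                 # ascending order
    return cov, eigvals[::-1], eigvecs[:, ::-1]            # descending order

rng = np.random.default_rng(1)
# Toy neighbourhood: points scattered on a nearly flat horizontal patch.
pts = np.column_stack([rng.uniform(0, 1, 50),
                       rng.uniform(0, 1, 50),
                       rng.normal(0, 0.001, 50)])
cov, (l1, l2, l3), vecs = covariance_eigen(pts)
planarity = (l2 - l3) / l1          # example eigenvalue-based feature
normal = vecs[:, 2]                 # eigenvector of the smallest eigenvalue
```

For a flat patch, the smallest eigenvalue is close to zero and its eigenvector approximates the surface normal.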

The geometric, radiometric, and topological features analysed are described in Table 3. Similar to geometric features, topological features describe the relationship between an observation and a reference. To calculate these features, a reference plane that best fits the surface must first be determined. For this purpose, the RANdom SAmple Consensus (RANSAC) algorithm has been used. RANSAC is an iterative method used to estimate the parameters of a mathematical model (in this case, the equation of a plane) from a dataset that includes outliers.
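The RANSAC plane estimation can be sketched as below. The distance threshold (1 cm), the number of iterations (200), and the synthetic cloud are assumptions for illustration and do not reproduce the parameters used in the study. The resulting signed distance corresponds to the "distance to the plane" topological feature.

```python
import numpy as np

def fit_plane(p0, p1, p2):
    # Plane through three points: unit normal n and offset d with n.p + d = 0.
    n = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(n)
    if norm < 1e-12:
        return None                  # degenerate (collinear) sample
    n = n / norm
    return n, -n @ p0

def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        # Hypothesise a plane from a random minimal sample of three points.
        sample = points[rng.choice(len(points), 3, replace=False)]
        model = fit_plane(*sample)
        if model is None:
            continue
        n, d = model
        # Keep the hypothesis supported by the largest number of inliers.
        inliers = np.abs(points @ n + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

rng = np.random.default_rng(2)
floor = np.column_stack([rng.uniform(0, 5, 300), rng.uniform(0, 5, 300),
                         rng.normal(0, 0.002, 300)])       # plane near z = 0
clutter = rng.uniform(0, 5, size=(60, 3))                  # outliers
cloud = np.vstack([floor, clutter])
(n, d), inliers = ransac_plane(cloud)
signed_dist = cloud @ n + d          # distance of every point to the plane
```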

4. Results and Discussion

As mentioned earlier, the integration of TLS and BIM has been extensively validated for site inspection and maintenance. However, ongoing research is focused on developing a methodology for the automated processing of point cloud data prior to its comparison with the BIM. The following results demonstrate the Scan-vs-BIM methodology developed, enabling the assessment of structural discrepancies in building construction. The outcomes of the proposed methodology, applied to a real case study, are presented below.

The proposed methodology was validated by applying it to two distinct scenarios within a residential building located in Badalona, Spain. The building features an asymmetrical U-shaped floor plan (Figure 5) and was under construction at the time of data acquisition, with a projected lifespan of 50 years. Covering an area of 25,000 m², the structure included two basement levels designated for parking, a ground floor intended for commercial offices, and upper levels allocated for residential housing. The building was divided into two sections: the north block, comprising 14 floors, and the south block, with 13 floors.

During data acquisition, the building was at various stages of construction, providing an opportunity to test the proposed methodology in diverse environments and varying levels of complexity. Three specific floors were chosen for the data collection. These floors were selected for their construction state and geometric complexity, making them representative of typical construction scenarios. The scenes included walls, pillars, floors, and ceilings. Additionally, the first floor included several metal props commonly used for shoring structures.

Data acquisition was conducted using the ground drone described in Section 3.1. Using the Stop&Go strategy, the first floor required 11 stations, the second floor 7 stations, and the third floor 8 stations. The first floor required more stations due to its construction stage (i.e., with several metal props), which resulted in more occlusions.

To balance data quality and acquisition efficiency, a 60% overlap between scans was set, with a scanning resolution of 4 mm at 20 m. This overlap was chosen to optimise the precision and reliability of the automatic alignment process. Additionally, points within 0.8 m of the ground drone (minimum acquisition range: 0.6 m) were excluded to ensure accurate data collection.

Each scan position required approximately 3 min, plus additional time for movement between stops. The total acquisition time was 45 min for the first floor, 25 min for the second floor, and 30 min for the third floor.

To ensure a clean and consistent dataset for the next steps, the raw point clouds were pre-processed with the anisotropic filtering described in Section 3.2. The following input parameters were used: Δt = 400, k = 30, and iterations = 200, resulting in a denoised point cloud (Figure 6).

Figure 6 and the remaining filtered point clouds show that anisotropic filtering significantly improved the quality of the point clouds by reducing noise while preserving critical geometric features such as edges and corners. This enhancement is particularly evident in areas with high geometric complexity, where traditional filtering methods often blur or distort fine details. The resulting point clouds exhibit smoother transitions in homogeneous regions and sharper delineation in regions with abrupt changes, such as walls and structural joints. These improvements facilitate more accurate subsequent processing steps, such as the alignment and semantic classification, underscoring the suitability of anisotropic filtering for construction site data with varying levels of detail and noise.

After the noise removal, point clouds were automatically aligned (Figure 7). The automatic alignment procedure outlined in Section 3.3 was applied separately to the different scans of each floor. The procedure included the following: (i) coarse registration, using the Harris 3D Noble detector and PFH descriptor; (ii) fine registration, using the ICP algorithm for adjusting the point clouds with millimetric precision.
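The fine registration step can be illustrated with a minimal point-to-point ICP. This sketch assumes that a coarse alignment has already been applied (the Harris 3D and PFH stage is not reproduced), uses brute-force nearest-neighbour matching suitable only for small clouds, and estimates the rigid transform with the standard SVD (Kabsch) solution. The synthetic clouds and the 1° residual rotation are assumptions.

```python
import numpy as np

def best_rigid_transform(src, dst):
    # Least-squares rotation R and translation t with R @ src_i + t ~ dst_i
    # (Kabsch / SVD solution for paired points).
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c

def icp(src, dst, iterations=30):
    current = src.copy()
    for _ in range(iterations):
        # Pair every source point with its nearest destination point (brute force).
        d2 = ((current[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, matched)
        current = current @ R.T + t
    return current

rng = np.random.default_rng(3)
dst = rng.uniform(0, 1, size=(200, 3))        # reference scan (toy data)
angle = np.deg2rad(1.0)                       # small residual misalignment
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
shift = np.array([0.005, -0.003, 0.004])
src = dst @ Rz.T + shift                      # misaligned copy of the scan
aligned = icp(src, dst)
rmse_before = np.sqrt(((src - dst) ** 2).sum(axis=1).mean())
rmse_after = np.sqrt(((aligned - dst) ** 2).sum(axis=1).mean())
```

In practice, a spatial index such as a k-d tree would replace the brute-force matching for clouds with millions of points.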

With a single filtered and aligned point cloud per floor, a semantic classification was performed using the deep learning network adapted and trained to construction works and described in Section 3.4. A custom architecture (Figure 8) was selected over existing models to better accommodate the case studies on semantic point cloud classification of buildings under construction. Although predefined models have demonstrated strong performance in various applications, they may not always be optimally suited for capturing the unique geometric and structural patterns present in construction sites. In particular, the flexibility of our library allows layers to be dynamically added or removed, optimising performance based on specific project requirements and data constraints. This adaptability is particularly valuable when dealing with highly dynamic construction environments, where standard models may require additional fine-tuning or modifications to achieve comparable performance. Additionally, integrating this architecture into a dedicated library ensures seamless compatibility with previous developments while also facilitating future enhancements and scalability without external constraints. This approach does not dismiss the potential of existing models but rather provides an alternative pathway for improving adaptability and efficiency in the specific context of Scan-vs-BIM applications.

Figure 9 and Figure 10 show the point cloud with the implementation of semantic classification in which the most relevant elements of the construction such as floors, ceilings, walls, and pillars, as well as metal props used on the first floor, can be distinguished.

The model was trained using the Categorical Cross Entropy loss function, which is well suited for semantic segmentation tasks. The Adam optimisation algorithm was employed due to its efficiency and adaptability in adjusting the model’s weights during training. Additionally, the Early Stopping technique was implemented to halt training if no improvement in model performance was observed after 10 consecutive epochs. In this case, the model retains the configuration from the best-performing epoch, i.e., the one that achieved the highest performance before stagnation. The following statistical metrics (Table 4) and graphics (Figure 11) were obtained.
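The Early Stopping rule can be sketched independently of any deep learning framework. The function below assumes that the monitored quantity is a validation loss (lower is better), which the text does not state explicitly; the loss curve is a made-up example.

```python
def train_with_early_stopping(val_losses, patience=10):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses."""
    best_loss, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best epoch: this is the configuration that would be kept.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                # No improvement for `patience` consecutive epochs: stop.
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

# Toy validation curve: improves for 5 epochs, then stagnates.
curve = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 20
best_epoch, stop_epoch = train_with_early_stopping(curve, patience=10)
```

In this example, training halts at epoch 14 and the weights from epoch 4 are retained.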

This performance underscores the network’s capacity to leverage geometric and contextual features unique to construction projects.

The results of the semantic classification demonstrate the effectiveness of the custom-designed neural network in accurately identifying and categorising structural elements within point clouds of buildings under construction. To ensure a comprehensive evaluation, the dataset used in this study consists of 2.6 million points captured from two distinct construction scenarios, representing varying levels of complexity, occlusions, and environmental conditions. The dataset was split into training and testing subsets, with 75% allocated for training and 25% for testing, ensuring a balanced distribution across all structural classes.
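A 75%/25% split that preserves the class proportions can be sketched as follows. The per-class (stratified) sampling is an assumption about how the balanced distribution was obtained, and the label counts are illustrative rather than taken from the dataset.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.75, seed=0):
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class and cut them at the train fraction.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        cut = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return np.array(train_idx), np.array(test_idx)

# Toy labels for 1000 points with imbalanced classes (illustrative only).
labels = np.repeat([0, 1, 2, 3], [400, 300, 200, 100])
train_idx, test_idx = stratified_split(labels)
```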

The network’s ability to generalise across diverse construction environments highlights its robustness. Key structural components such as walls, pillars, and ceilings were classified with high precision, even in cluttered and partially occluded areas typical of construction sites. The analysis of class distribution within the dataset further confirms that the model maintains consistent performance across both frequent and less-represented categories, mitigating potential class imbalance issues.

Additionally, the network’s efficiency in processing large-scale point clouds ensures its applicability to real-world scenarios, facilitating improved monitoring, quality control, and decision-making throughout the construction process. These results validate the potential of neural networks as a reliable tool for advancing automated construction site analysis, with a structured dataset evaluation that supports the reliability and generalisability of the approach.

Once the point cloud data obtained by the ground drone had been processed, it was compared with the theoretical BIM designed for this construction. The comparison of the two models, as-built vs. BIM, was carried out in Autodesk Revit 2024, which includes modules for direct comparison of the two models. Figure 12 presents a comparison between the model obtained from the ground drone (as-built) and the BIM, focusing on the most relevant structural elements, specifically the concrete columns, so that the structural integrity of the building can be properly assessed. The columns on the three surveyed floors were analysed quantitatively by measuring the discrepancies between the two models in terms of the key geometric parameters of the columns. The comparison was conducted using specified criteria for translations, overhangs, levels, section variations, flatness, relative measurements, and voids. The limits established for these parameters are those that can be tolerated without a loss of compressive strength in the column that could endanger the structural integrity of the building. The admissible limits applied to the quantified parameters are described below:

-: Translations from −24 mm to +12 mm: refer to the horizontal or vertical movement of the axis of the column from its design position.
-: Overhangs from −24 mm to +12 mm: refer to a change in the position of the upper part of the column relative to its base, when the upper part seems to ‘protrude’ or is displaced.
-: Levels from −20 mm to 20 mm: refer to vertical deviations of the column level (column elevation).
-: Section variation from −10 mm to 8 mm: refers to changes in the cross-sectional size of the column. For example, the section at the bottom of the column may differ from the section at the top of the column.
-: Flatness (3 m rule) from −12 mm to +12 mm: refers to how flat the surface of a column is along its entire length or height. The 3 m rule means that the measurement is made over a length of 3 m.
-: Relative measurements from −6 mm to +10 mm: refer to relative deviations that may pertain to the dimensional relationships between two or more columns or between different sections of the same column. For instance, this occurs when two columns are aligned at different levels with an overlap, yet they exhibit an offset, resulting in a misalignment where they do not coincide vertically.
-: Voids: refer to unexpected cavities in the column, which may result from construction or design errors.
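The admissible limits listed above lend themselves to a simple automated check. The sketch below encodes the limits in millimetres and flags each measured deviation as within or outside its range. The dictionary keys, the function name `check_column`, and the example deviations are hypothetical and are not values reported in this study; voids are omitted because no numerical limit is given for them.

```python
# Admissible limits in millimetres, taken from the list above.
LIMITS_MM = {
    "translation": (-24, 12),
    "overhang": (-24, 12),
    "level": (-20, 20),
    "section_variation": (-10, 8),
    "flatness": (-12, 12),
    "relative_measurement": (-6, 10),
}

def check_column(measured_mm):
    """Map each measured deviation to True (within limits) or False."""
    report = {}
    for name, value in measured_mm.items():
        low, high = LIMITS_MM[name]
        report[name] = low <= value <= high
    return report

# Hypothetical deviations for one column (not values from the study).
example = {"translation": -8, "overhang": 15, "level": 3, "section_variation": -2}
report = check_column(example)
```

Because the limits are asymmetric, the sign of a deviation matters: an overhang of +15 mm is rejected, whereas one of −15 mm would be accepted.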

The discrepancies obtained for the selected geometrical parameters are shown in Figure 13 and Figure 14. It can be seen from these figures that there are small variations for most of the parameters, which are within the admissible limits.

The results of the comparison between the as-built point cloud data and the BIM demonstrate the efficacy of the proposed methodology in assessing the structural integrity of critical building components, particularly concrete columns. The discrepancies observed in geometric parameters, such as translations, overhangs, levels, and section variations, were within the established admissible limits, ensuring that the structural integrity of the building was not compromised. For instance, the detected translation discrepancy of 20 mm and the overhang of 20 mm were within the tolerable ranges for structural safety. Similarly, variations in column cross-sections, flatness, and relative measurements were minor, reaffirming the accuracy of the construction relative to the design. While voids were observed in some columns, their presence highlights potential areas for improvement in concreting processes rather than structural concerns. Overall, these findings validate the integration of ground drone inspections with BIM-based analysis as a robust tool for monitoring and ensuring the quality and safety of construction projects. Specifically, integrating georeferenced and semantically enriched point cloud data directly into the BIM workflow proves to be far more efficient than traditional methods that require converting point clouds into a BIM through complex reverse engineering processes. These conventional approaches are not only time-consuming but also prone to errors, making direct point cloud integration a more reliable and accurate solution for real-time construction assessment.

5. Conclusions

This study introduces a comprehensive methodology for processing point cloud data obtained through ground drone inspections and comparing it with BIMs to identify and address discrepancies during the construction process. The proposed approach has proven effective in detecting deviations, particularly in critical structural elements, such as concrete columns, ensuring a higher level of precision and reliability in construction monitoring. Regarding the novel methodology developed for assessing buildings under construction, the following aspects can be highlighted:

(i): The integration of anisotropic filtering, automatic alignment, and semantic classification has proven to be a robust methodology for processing and analysing point cloud data in construction scenarios. Anisotropic filtering effectively reduces noise while preserving critical geometric features, enabling cleaner and more accurate data for subsequent processing. Automatic alignment, achieved through coarse and fine registration steps, ensures the precise positioning of point clouds within a unified coordinate system, minimising manual intervention and errors. The incorporation of semantic classification further enhances the methodology by providing meaningful categorisation of structural elements in construction works, facilitating a detailed analysis and quality control. Together, these processes form a comprehensive pipeline that optimises point cloud processing, ensuring accurate model updates and contributing to the reliability of BIM-based construction audits.
(ii): The Scan-vs-BIM methodology developed successfully identified discrepancies between both models, such as translation, section variation, level differences, and others. These discrepancies, while not compromising the overall structural integrity, highlight areas for improvement to enhance the accuracy of the BIM and better reflect actual site conditions.
(iii): By applying this methodology across different construction phases, the study establishes a foundation for a digital twin approach that dynamically integrates real-time data with BIMs. This adaptability ensures that BIMs remain accurate throughout the construction cycle, addressing evolving site conditions and construction complexities. As a result, the methodology can deliver cost savings and efficiency gains in construction works, reducing the need for extensive manual inspections.

Future research can expand this methodology by incorporating a broader range of structural and non-structural elements, addressing challenges such as highly reflective surfaces, temporary site installations, and occluded components. A key direction will be the integration of multi-modal data sources, combining TLS with photogrammetry and thermal imaging to enhance point cloud classification accuracy and contextual awareness. Another critical area is the automation of corrective feedback within the Scan-vs-BIM workflow. Future developments should focus on real-time deviation analysis and rule-based decision support systems that can autonomously suggest corrections in BIM models. Implementing reinforcement learning mechanisms could allow the system to learn from past corrections, continuously refining classification accuracy and reducing false positives in anomaly detection.

Author Contributions

Conceptualisation, D.G.-S. and D.G.-A.; methodology, D.G.-A. and R.R.-G.; software, D.G.-S., D.G.-A., R.R.-G. and A.M.-S.; validation, D.G.-S., D.G.-A., R.R.-G. and A.M.-S.; investigation, D.G.-S., D.G.-A., R.R.-G. and A.M.-S.; resources, D.G.-A.; writing—original draft preparation, D.G.-A. and R.R.-G.; writing—review and editing, D.G.-S., D.G.-A., R.R.-G. and A.M.-S.; project administration, D.G.-A.; funding acquisition, D.G.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Diagram of the methodology developed for the comparison between as-built and BIM (Scan-vs-BIM).

Figure 2. Commercial ground drone platform, Guardian model (www.robotnik.es, accessed on 4 February 2025).

Figure 3. Final design of the ground drone with the different sensors onboard.

Figure 4. Neural network architecture developed for the semantic point cloud classification of buildings.

Figure 5. Building selected as case study: (a) general view of the building; (b) detailed view of the current stage of the construction.

Figure 6. Comparison of original point cloud (a) and pre-processed point cloud obtained after the application of anisotropic filtering (b).

Figure 7. Automatic registration procedure: (a) initial state of the point clouds after the coarse alignment based on Harris 3D Noble–PFH; (b) final result of the ICP alignment.

Figure 8. Custom neural network architecture for semantic classification of building point clouds. The model starts with an input layer (blue), followed by multiple multi-head attention blocks (orange) and dense layers (green), with dropout layers (grey) to enhance generalisation. The numerical values indicate the tensor dimensions at each stage, representing the number of features processed. The output layer (red) assigns a class label to each point, enabling semantic segmentation.

Figure 9. Semantic classification of the three selected floors, using the filtered and aligned point cloud as input data.

Figure 10. Semantic classification of the whole building, using the filtered and aligned point cloud as input data.

Figure 11. Training and validation performance metrics for the semantic classification of building point clouds. The graphs illustrate the evolution of loss (a), accuracy (b), mean IoU (c), recall (d), and F1-score (e) across training epochs for both training and validation sets. The model demonstrates effective learning, with loss decreasing and accuracy exceeding 98%. Mean IoU improves steadily, indicating better segmentation, while recall and F1-score remain high.

Figure 12. Comparison between the BIM (blue) and as-built model (grey point cloud) obtained.

Figure 13. Column 1 shows the discrepancies in (A) translation, (B) overhang, and (C) section.

Figure 14. Column 5 shows the discrepancies in (A) elevation, (B) flatness, (C) relative measurements, and (D) voids.

Table 1. Ground drone platform characteristics.

Dimensions	127 × 500 × 450 mm (tracks); 1127 × 747 × 554 mm (tracks + wheels)
Weight	120 kg
Load capacity	100 kg
Maximum speed	3 m/s
Controller	Open architecture in ROS (Robot Operating System)
Autonomy	from 3 to 10 h normal operation

Table 2. Characteristics of the primary lasers and cameras that constitute the ground drone.

FARO Focus 3D X330 Terrestrial Laser Scanner (Lake Mary, FL, USA)
Range	0.6–330 m
Measurement speed	up to 976,000 points/second
Range error	±2 mm
Integrated colour camera	up to 70 megapixels
Laser Class	Class 1
Weight	5.2 kg
Multi-Sensor	GPS, Compass, Altimeter, Dual Axis Compensator
Dimensions	240 × 200 × 100 mm
Control	via touch screen and WLAN
SICK S300 Safety Laser Scanner (Waldkirch, Germany)
Scanning range	270°
Angular resolution	0.5°
Response time	80 ms
Protective fields	up to 2 m or 3 m
Warning fields	up to 8 m
Field records	consisting of one protective field and up to two warning fields
Reset lock/parameterizable reset delay	Integrated
Switching of protective fields	Integrated
Cameras
High-resolution monochrome camera	4.92 Mpx, model IDS UI-5480CP-M-GL 1⁄2″ 2590 × 1920-14i/s-Mono-CMOS-GigE
Lower-resolution colour camera	2.12 Mpx, model IDS UI-5860CP-C-HQ 1/2.8″ Rev2 1936 × 1096-54i/s-colour CMOS-GE

Table 3. Geometric features analysed to obtain semantics.

Geometric features
Sum of eigenvalues	$\sum_{i} \lambda_{i}$
Omni-variance	$\left(\prod_{i} \lambda_{i}\right)^{1/3}$
Eigen-entropy	$-\sum_{i} \lambda_{i} \ln(\lambda_{i})$
Anisotropy	$(\lambda_{1} - \lambda_{3}) / \lambda_{1}$
Linearity	$(\lambda_{1} - \lambda_{2}) / \lambda_{1}$
Planarity	$(\lambda_{2} - \lambda_{3}) / \lambda_{1}$
Sphericity	$\lambda_{1} / \lambda_{3}$
PCA1 (main component 1)	$\lambda_{1} \left(\sum_{i} \lambda_{i}\right)^{-1}$
PCA2 (main component 2)	$\lambda_{2} \left(\sum_{i} \lambda_{i}\right)^{-1}$
Surface variation	$\lambda_{3} \left(\sum_{i} \lambda_{i}\right)^{-1}$
Verticality	$1 - \lvert n_{z} \rvert$, where $n_{z} = v_{3} \cdot (0, 0, 1)$
Eigenvalue 1, 2, 3	$\lambda_{1}, \lambda_{2}, \lambda_{3}$
Topological features
Distance to plane	$d = \frac{\lvert A x + B y + C z + D \rvert}{\sqrt{A^{2} + B^{2} + C^{2}}}$
Height below	$h_{\mathrm{below}} = z - H$
Height above	$h_{\mathrm{above}} = H - z$
Radiometric features
Red colour	$\frac{R}{255}$
Green colour	$\frac{G}{255}$
Blue colour	$\frac{B}{255}$
Intensity	$\frac{I}{I_{\max}}$
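
As an illustration of how several of the Table 3 features could be evaluated, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the eigenvalues come from the 3 × 3 covariance matrix of a local point neighbourhood and are ordered $\lambda_{1} \geq \lambda_{2} \geq \lambda_{3}$, with $v_{3}$ the eigenvector of the smallest eigenvalue (the local normal). The function names `eigen_features` and `distance_to_plane`, the neighbourhood definition, and the small clipping constant are assumptions made for this example.

```python
import numpy as np


def eigen_features(neighbourhood):
    """Evaluate a subset of the Table 3 geometric features for one neighbourhood.

    neighbourhood: (N, 3) array of XYZ coordinates with N >= 3.
    Assumption: eigenvalues are sorted so that l1 >= l2 >= l3.
    """
    pts = np.asarray(neighbourhood, dtype=float)
    cov = np.cov(pts.T)                       # 3 x 3 covariance matrix
    vals, vecs = np.linalg.eigh(cov)          # eigh returns ascending order
    order = np.argsort(vals)[::-1]            # reorder to descending
    vals = np.clip(vals[order], 1e-12, None)  # guard against log(0) and 0/0
    vecs = vecs[:, order]
    l1, l2, l3 = vals
    total = vals.sum()
    # n_z is the z component of v3, the eigenvector of the smallest eigenvalue
    n_z = float(vecs[:, 2] @ np.array([0.0, 0.0, 1.0]))
    return {
        "sum_of_eigenvalues": total,
        "omnivariance": float(np.prod(vals) ** (1.0 / 3.0)),
        "eigenentropy": float(-np.sum(vals * np.log(vals))),
        "anisotropy": (l1 - l3) / l1,
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "surface_variation": l3 / total,
        "verticality": 1.0 - abs(n_z),
    }


def distance_to_plane(point, plane):
    """Unsigned distance from point (x, y, z) to the plane Ax + By + Cz + D = 0."""
    a, b, c, d = plane
    x, y, z = point
    return abs(a * x + b * y + c * z + d) / np.sqrt(a * a + b * b + c * c)


# Synthetic check: a horizontal slab-like patch and a vertical wall-like patch.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(3.0))
floor = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
wall = np.column_stack([gx.ravel(), np.zeros(gx.size), gy.ravel()])
floor_features = eigen_features(floor)
wall_features = eigen_features(wall)
```

With this definition of verticality, the horizontal patch yields a value close to 0 and the vertical patch a value close to 1, which is the kind of contrast that helps separate slabs from walls and columns.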

Table 4. Statistical metrics obtained for the semantic classification in the best epoch.

Set	Loss	Accuracy	mIoU	Recall	F1 Score
Training	0.019	0.995	0.672	0.995	0.995
Validation	0.014	0.996	0.751	0.996	0.996
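
To make the relationship between the Table 4 metrics concrete, the sketch below derives accuracy, mean IoU, recall, and F1 score from a confusion matrix. It is a hedged illustration only: the averaging scheme behind the reported values is not stated here, so the example assumes an unweighted mean over classes for IoU and support-weighted averages for recall and F1. The function name `segmentation_metrics` and the two-class toy matrix are hypothetical and are not data from the study.

```python
import numpy as np


def segmentation_metrics(conf):
    """Compute summary metrics from a confusion matrix.

    conf[i, j] is the number of points of true class i predicted as class j.
    Assumptions: mean IoU is an unweighted class average; recall and F1 are
    weighted by class support.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp        # predicted as the class but wrong
    fn = conf.sum(axis=1) - tp        # belonging to the class but missed
    eps = 1e-12                       # avoids division by zero for empty classes
    iou = tp / np.maximum(tp + fp + fn, eps)
    recall = tp / np.maximum(tp + fn, eps)
    precision = tp / np.maximum(tp + fp, eps)
    f1 = 2.0 * precision * recall / np.maximum(precision + recall, eps)
    weights = conf.sum(axis=1) / conf.sum()
    return {
        "accuracy": float(tp.sum() / conf.sum()),
        "mean_iou": float(iou.mean()),
        "weighted_recall": float((weights * recall).sum()),
        "weighted_f1": float((weights * f1).sum()),
    }


# Hypothetical, strongly imbalanced two-class example (not data from the paper).
toy_confusion = np.array([[990, 10],
                          [5, 5]])
toy_metrics = segmentation_metrics(toy_confusion)
```

In this toy case the overall accuracy stays above 0.98 while the mean IoU falls to about 0.62, because the minority class is poorly segmented. Support-weighted recall equals overall accuracy by construction, which would be consistent with the matching Accuracy and Recall columns in Table 4, although that reading is an assumption.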

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Guerrero-Sevilla, D.; Rodríguez-Gómez, R.; Morcillo-Sanz, A.; Gonzalez-Aguilera, D. Optimising Construction Site Auditing: A Novel Methodology Integrating Ground Drones and Building Information Modelling (BIM) Analysis. Drones 2025, 9, 277. https://doi.org/10.3390/drones9040277

