Article

FAIS: Fully Automatic Indoor Surveying Framework of Terrestrial Laser Scanning Point Clouds in Large-Scale Indoor Environments

College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(16), 2863; https://doi.org/10.3390/rs17162863
Submission received: 22 June 2025 / Revised: 4 August 2025 / Accepted: 14 August 2025 / Published: 17 August 2025

Abstract

This article presents a novel fully automatic indoor surveying (FAIS) framework for large-scale indoor environments using a Terrestrial Laser Scanning (TLS) hardware system. Traditional indoor surveying is labor-intensive and time-consuming, as it relies on manually positioning scanners for data capture and placing markers for registration. Moreover, manual scanner positioning can cause uneven coverage or redundant rescanning, particularly in unstructured areas. To ensure full coverage of the scene, we precisely determine the number and locations of scan stations with a method based on the Signed Distance Field (SDF). In addition, we propose an efficient marker-free registration method for large-scale dense point clouds. The proposed framework is suited to environments where the scanner operates on a flat surface, such as office spaces, theater stages, urban areas, and some cultural heritage sites. Experiments demonstrate that the proposed method decreases computation time and obtains a more complete point cloud.

1. Introduction

In the past decade, surveying technology has advanced significantly, with tools evolving from traditional manual instruments to more sophisticated technologies such as LiDAR, RGB-D cameras, and terrestrial laser scanners (TLS). Notably, TLS has found widespread application in various fields, including monitoring building efficiency [1], geodesy for monitoring changes and deformations in man-made structures [2], and vegetation analyses [3]. Although TLS is widely used and highly efficient, many pressing challenges remain in TLS data collection and large-scale point cloud registration. Key challenges include TLS observation network planning [4] and marker-free large-scale point cloud registration [5,6,7]; the former involves designing the TLS scanning network [8,9] and planning the scan station path [10]. These challenges have been extensively studied, and existing methods have proven effective for their specific problems. However, each is tailored to an individual research issue, such as scanning network design, scan station path planning, or large-scale registration, without unifying them into a fully automatic, comprehensive surveying system.
Among currently popular indoor surveying systems [11], most are either stationary or cart-based. Both types require significant human involvement and are prone to redundant scans and missed areas, often failing to achieve complete scene coverage. Fully automatic surveying is therefore essential, yet designing such a system remains challenging, as it must encompass both TLS observation network planning and marker-free alignment of large-scale point clouds.
For fully automatic surveying, TLS observation network planning and large-scale point cloud registration are the core challenges. For the former, a common approach defines a discrete, finite set of candidate points and selects the optimal scan stations from this set. For instance, Soudarissanane et al. [12] employ a greedy approach to select scan stations from candidate points, while Dehbi et al. [9] use Integer Linear Programming for the same purpose. Both methods rely on discrete candidate points and are well suited to structured scenes. However, in unstructured scenes, such as cluttered indoor spaces, relying solely on discrete candidate points may leave certain areas, such as corners, unscanned. For the latter, the challenges manifest in three aspects: (1) Large-scale point clouds cover a larger spatial range, contain far more points, and exhibit a more complex and variable distribution, which makes registration intractable. (2) Large-scale point clouds introduce more outliers, increasing computational complexity, interfering with feature point extraction, and consequently reducing registration accuracy. (3) Owing to the vast number of points, feature extraction becomes difficult, often yielding many redundant or even incorrect features that can lead to significant registration errors.
To address the above challenges, this article proposes a novel fully automatic indoor surveying framework (FAIS). Specifically, we adopt simultaneous localization and mapping (SLAM) [13] to acquire a realistic and effective 2D model for planning scan positions in unknown indoor scenes, whether structured or unstructured. Achieving complete scene coverage with TLS, i.e., covering the entire 2D model, in principle requires traversing all points in the model, which is highly time-consuming. Inspired by the success of the Signed Distance Field (SDF) in mapping [14,15], we adopt the SDF to optimize scan stations: based on the SDF distance values, we identify the point farthest from the boundary as the scan station and update the SDF according to our strategy, repeating this process until no candidate points remain. Here, all points are treated as candidates, and the SDF computation has very low time complexity. For large-scale point cloud registration, we draw inspiration from the success of order-of-magnitude data reduction in such tasks [16]. To this end, we propose a novel plug-and-play keypoint extraction module (P2KE) tailored for large-scale registration. Our method preserves essential points while maintaining the overall structure of large-scale scenes, using the centroid-based useless points removal (CBUPR) method to filter irrelevant points. We then introduce the Fibonacci Voxel Downsampling (FVD) method, which reduces the point cloud to a manageable scale, enabling the application of state-of-the-art registration backbone models. To validate the effectiveness of the FAIS framework, we conduct experiments on both the TLS observation network and large-scale registration tasks. In the TLS observation network experiment, our method achieves complete scene coverage with an optimal number of scan stations in unstructured scenes while maintaining high computational efficiency.
For large-scale registration, we evaluate our method on the 3DMatch dataset [17] and the Whu-TLS dataset [18], leveraging various registration backbone models. Figure 1 illustrates our framework, showcasing FAIS framework in the context of the “Bianliang Yimeng” theater, alongside the implementation process and experimental results.
To sum up, the main contributions and novelties of this article are as follows:
  • We propose a novel FAIS framework for indoor surveying, capable of automating TLS observation network planning and completing TLS scan tasks. To the best of the authors' knowledge, this is the first attempt to incorporate the SDF into a TLS observation network to optimize scan stations.
  • We present a new plug-and-play module (P2KE) for keypoint extraction, leveraging geometric partition, CBUPR, and FVD methods. This module effectively preserves essential points while filtering out most of the irrelevant ones. Moreover, it significantly reduces the point cloud to a manageable scale, facilitating the application of state-of-the-art registration backbone models.
  • Extensive experiments are conducted on the TLS observation network and large-scale point cloud registration. For the TLS observation network, we determine the minimal number of scanning stations required to achieve complete scene coverage. For large-scale point cloud registration, we evaluate three backbone models across multiple diverse datasets. The results demonstrate that the proposed P2KE method exhibits strong compatibility and can achieve significant performance improvement for large-scale registration, while performing favorably against its counterparts.
The remainder of this article is organized as follows. Section 2 provides a review of the most relevant related research. Section 3 outlines our core methodology. In Section 4, we discuss the experimental results. In Section 5, we summarize the findings and present the conclusions of this article.

2. Related Work

2.1. TLS Observation Network Planning

Terrestrial Laser Scanning (TLS) has become a leading technique for acquiring large-scale dense point clouds, renowned for its high integrity, density, and accuracy, as well as its flexible station setup. This technique is widely applied in fields such as civil engineering [4], heritage documentation [19], transportation engineering, and landslide monitoring [20], owing to its ability to rapidly capture dense radiometric and geometric data from local scenes. However, due to the scanner’s limited field of view, multiple scans are often required to achieve full coverage of complex scenes or objects. To achieve a high-quality, large-scale, dense point cloud efficiently from multiple scans, effective TLS observation network planning and robust registration methods are essential. Traditionally, scan positions are determined based on operator experience, often resulting in data redundancy, coverage gaps, and inefficiencies that negatively impact cost, workflow, and data quality. This highlights the need for TLS observation network planning, enabling operators to strategically determine optimal scan positions and efficiently acquire high-quality point clouds during field campaigns.
To date, TLS observation network planning methods can be classified into two main categories: model-based methods [4,21] and non-model-based methods [10,22], with the former being the most widely used. Model-based methods consist of three main steps: first, preparing a model of the observation scene; second, simulating TLS scanning; and finally, determining the TLS observation network using a numerical optimization method that considers various constraints, including cost, data coverage, and measurement precision. Model-based methods are further categorized by dimensionality into 2D and 3D methods. The 2D model-based methods derive the TLS observation network through scanning simulations on a 2D model and have been widely applied to indoor scenes [8,9], existing buildings [23], and forest environments [24]. Dehbi et al. [9] proposed an optimal scan planning approach with enforced network connectivity based on a 2D model for indoor surveying. Li et al. [24] proposed an iterative-mode scan design based on a 2D model for forests. The 3D model-based methods derive the TLS observation network through scanning simulations on a 3D model. Heidari et al. [25] proposed a method that finds the optimal TLS locations, ensuring data completeness while minimizing the number of scan locations. Biswas et al. [26] proposed an approach that generates automatic laser scanning plans from a 3D model for indoor surfaces. Wujanz et al. [26] proposed a combinatorial viewpoint planning algorithm that takes a given 3D model as input and simulates laser scans from predefined viewpoints for sculptures. However, compared with 2D model-based methods, 3D model-based methods require more computational resources and are generally less efficient. For planning scan stations on indoor flat surfaces, 2D model-based methods are sufficient. Therefore, this article proposes a 2D model-based method that ensures complete coverage of the target scene.
While Jia et al. [8] and Dehbi et al. [9] applied discrete sampling strategies, using a hierarchical pipeline algorithm and a Mixed-Integer Linear Programming method, respectively, we adopt a continuous sampling approach based on the SDF algorithm, which achieves coverage with fewer stations. This facilitates the identification of optimal station locations, especially in the presence of complex or irregular model geometries.

2.2. Large-Scale Point Cloud Registration

Large-scale point cloud registration is a difficult task in computer vision and robotics, primarily due to the increased data volume and complexity. It aims to transform two or more partially overlapping scenes into a unified coordinate system to achieve full coverage of the scene, and is widely used in fields such as surveying and mapping [27], augmented reality [28], 3D reconstruction [29,30,31], and SLAM [28,32].
Registration methods fall into two categories: learning-based and hand-crafted. Many deep learning methods have been developed for point cloud registration. PointNetLK [33], RPM-NET [34], RPSRNet [35], REGTR [36], and others perform end-to-end registration by directly predicting the transformation between two or more point clouds. Mathematical function-based models, such as PointGMM [37] and DeepGMR [38], likewise predict transformations between point clouds. Hand-crafted methods mainly include ICP [39], GMM, NDT, and their variants [40]. These methods show great potential in object-level and small-scale indoor registration tasks [31,38,41,42].
However, compared with object-level and small-scale indoor registration, large-scale registration is less studied, although several researchers have achieved notable results. Yang et al. [19] proposed an automatic, robust, and accurate method for registering terrestrial laser scans. Guan et al. [5] presented a marker-free algorithm for accurately registering multi-scan TLS data in forested areas. Liu et al. [16] proposed a fully end-to-end network for large-scale point cloud registration. However, the first two methods target specific scenes, while the third suits LiDAR point clouds of typically around 100,000 points. While these methods handle certain data scales, a substantial gap in both computational cost and accuracy remains for point clouds of 10 million points or more, which are significantly more challenging to process. We propose the P2KE method to extract keypoints that capture the overall structural information of large-scale scenes, and then adopt mainstream point cloud registration frameworks to perform alignment. The experimental results demonstrate superior performance.

3. Methodology

The flowchart of the proposed method is shown in Figure 2. The method includes four main steps: (1) generating a 2D model of the target scene and removing noise; (2) generating a TLS scanning network based on the SDF algorithm and planning the scan station path; (3) extracting keypoints from the TLS data; and (4) registering the TLS data.

3.1. Generating the 2D Model

To obtain a 2D model of the target scene, we utilize a movable chassis equipped with a drive system, a single-line LiDAR, an odometer, and an IMU for SLAM. First, we activate the chassis to ensure normal movement while acquiring real-time pose information. Next, we power on the LiDAR and establish communication with the chassis to prepare for subsequent 2D model generation. We then employ the GMapping algorithm [43], which utilizes laser scans and odometry data to generate a map of the target scene. The chassis is manually controlled to traverse the entire area, enabling the acquisition of an initial 2D model.
Due to the relatively low accuracy of single-line LiDAR, the generated 2D model contains much noise, impacting further planning. Therefore, we apply the non-local means (NLM) algorithm [44] to remove noise. The mathematical expression for denoising the 2D model is shown in Equation (1), where $I(x)$ is the pixel value of the initial 2D model at position $x$; $\hat{I}(x)$ is the pixel value of the denoised 2D model at $x$; $\Omega$ is the pixel set of the entire 2D model; $\omega(x, y)$ is the weight function representing the similarity between pixels $x$ and $y$; and $C(x)$ is the normalization constant ensuring the weights sum to 1.
$$\hat{I}(x) = \frac{1}{C(x)} \sum_{y \in \Omega} I(y) \cdot \omega(x, y),$$
NLM leverages the self-similarity of images for noise reduction: it uses pixel values from similar regions within the image to suppress noise while preserving the details and structure of the image. Figure 3 presents the initial 2D model and the model after NLM denoising; the processed result demonstrates improved quality.
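To make the weighting of Equation (1) concrete, the following is a minimal, unoptimized sketch of patch-based non-local means in NumPy. The patch size, search window, and Gaussian weight kernel `h` are illustrative assumptions, not the parameters used in the paper, and a production implementation would use an optimized library routine instead.

```python
import numpy as np

def nlm_denoise(img, patch=1, search=3, h=10.0):
    """Naive non-local means sketch of Eq. (1): each output pixel is a
    weighted average of pixels in a search window, weighted by patch
    similarity; parameters are illustrative assumptions."""
    pad = patch + search
    padded = np.pad(img.astype(float), pad, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    H, W = img.shape
    for y in range(H):
        for x in range(W):
            cy, cx = y + pad, x + pad
            ref = padded[cy-patch:cy+patch+1, cx-patch:cx+patch+1]
            weights, values = [], []
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ny, nx = cy + dy, cx + dx
                    cand = padded[ny-patch:ny+patch+1, nx-patch:nx+patch+1]
                    d2 = np.mean((ref - cand) ** 2)      # patch dissimilarity
                    weights.append(np.exp(-d2 / (h * h)))  # omega(x, y)
                    values.append(padded[ny, nx])
            w = np.array(weights)
            out[y, x] = np.dot(w, values) / w.sum()       # C(x) normalization
    return out
```

The double loop makes the O(|Ω|·|window|) cost explicit; the C(x) division is the normalization constant from Equation (1).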

3.2. Generating the TLS Observation Network

The TLS observation network consists of the TLS scanning network and scan stations’ path planning. After obtaining the 2D model, we design a new TLS planning network based on the SDF algorithm that ensures full coverage of the target scene while minimizing the number of planning scan positions. Once the optimal number of planned scan positions is determined, we perform path planning using an improved Dijkstra algorithm [45] to calculate the Euclidean distances between stations. We then transform the station planning problem into a traveling salesman problem (TSP) to determine the shortest distance and optimize the path.

3.2.1. Designing TLS Scanning Network

We employ the Signed Distance Field (SDF) algorithm to design a novel TLS scanning network, making this the first study to apply the SDF algorithm to a TLS scanning network. The SDF is a technique that efficiently calculates the distance from a given pixel or voxel to the nearest boundary. In the metric space $\Omega$ defined by the 2D model, the SDF of a given point $x$ can be computed, where the absolute value represents the distance $d(x, \partial\Omega)$ between point $x$ and the boundary $\partial\Omega$, and the sign indicates whether $x$ lies inside or outside the metric space $\Omega$. Here, we traverse all pixels in the 2D model to obtain their corresponding SDF values.
$$\mathrm{SDF}(x) = \begin{cases} d(x, \partial\Omega), & x \in \Omega \\ -d(x, \partial\Omega), & x \notin \Omega \end{cases}$$
Since the metric space calculation uses Manhattan distance, while Euclidean distance is typically preferred in practical applications, a conversion is required between Manhattan and Euclidean distances. The Euclidean distance $d(x, \partial\Omega)$ between a given point $x$ and the boundary $\partial\Omega$ is defined as follows:
$$d(x, \partial\Omega) = \inf_{y \in \partial\Omega} d(x, y),$$
Here, $\inf$ denotes the infimum.
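On a small occupancy grid, the SDF of the two equations above can be computed by brute force, taking the infimum over all boundary cells; this is a sketch for illustration (the grid layout and the treatment of occupied cells as the boundary are assumptions), and real maps would use a fast distance transform instead.

```python
import numpy as np

def signed_distance_field(free):
    """Brute-force Euclidean SDF for a binary occupancy grid.
    `free` is True inside the reachable space; occupied cells are
    treated as the boundary (a small-grid sketch, not an optimized
    distance transform)."""
    H, W = free.shape
    ys, xs = np.nonzero(~free)                 # boundary (occupied) cells
    boundary = np.stack([ys, xs], axis=1)
    sdf = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            # infimum over boundary cells of the Euclidean distance
            d = np.sqrt(((boundary - [y, x]) ** 2).sum(axis=1)).min()
            sdf[y, x] = d if free[y, x] else -d
    return sdf
```

The sign convention matches the piecewise definition: positive inside the free space, non-positive outside.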
After calculating the SDF values for all pixels in the metric space $\Omega$, we need to select an initial scan station, update the scanned area, and iteratively determine the next scan station until the scan area is fully covered. To select suitable scan stations, based on the SDF distance values, we identify the point in $\Omega$ that is farthest from the boundary as the scan station $S$, ensuring that this point satisfies the constraint below, where $r_{ig}$ denotes the minimum blind-angle radius. By adjusting the minimum blind-angle radius, full coverage of the 2D model can be achieved. As shown in Figure 4, the initial scan station $S_0$ is selected after obtaining the SDF values of the initialized 2D model.
$$\mathrm{SDF}(S) \geq r_{ig},$$
After selecting the initial scan station $S_0$, the SDF must be updated each time a new scan station $S_i$ is chosen, to ensure that the next station $S_{i+1}$ is not located near $S_i$. There are two strategies for updating the SDF: (a) take the boundary of the station's scanning area as the new SDF boundary; (b) use the station itself as the new SDF boundary. The advantage of strategy (a) is that it exploits the scanning-area information, so new stations never fall within an existing station's scanning area and stations are not placed too densely. Its disadvantage is that blind spots may remain unscanned: in theory these blind spots could be covered by a suitably placed station, but since that station would lie within the scanning area of another station, it is excluded, leaving the blind spots uncovered. Strategy (b) eliminates blind spots, but results in an overly dense station distribution and redundant scanning.
Our method combines the advantages of the two strategies by introducing a repulsion coefficient $\gamma \in [0, 1]$; adjusting $\gamma$ enables us to achieve the minimum number of scan stations under full coverage. In contrast to the discrete sampling strategies in [8,9], our continuous method facilitates the identification of optimal station positions, especially in complex or irregular models. For the boundary, the ordinary Euclidean distance is used; for a station $S_i$, the modified distance defined by the following formula is used:
$$\bar{d}(S_i, x) = \begin{cases} (1 - \gamma)\, d(S_i, x), & x \in \zeta_i \\ d(S_i, x) - F, & x \notin \zeta_i \end{cases}$$
Here, $\zeta_i$ represents the area that station $S_i$ can scan, and $F$ is a constant that makes $\bar{d}(S_i, x)$ continuous. When $\gamma = 1$, this is a complete repulsion strategy, identical to strategy (a); when $\gamma = 0$, it is a completely non-repulsive strategy, identical to strategy (b); when $\gamma \in (0, 1)$, the effect lies between the two strategies, with a typical value of 0.9.
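The greedy selection with the repulsion update can be sketched as follows. This is a simplified reading of the scheme, with assumptions not stated in the paper: the scanning area $\zeta_i$ is modeled as a disc of radius `r_scan`, and the continuity constant is taken as $F = \gamma \cdot r_{\text{scan}}$ so that $\bar{d}$ matches at the disc boundary.

```python
import numpy as np

def select_stations(sdf, r_ig, r_scan, gamma=0.9):
    """Greedy scan-station selection with the repulsion update:
    repeatedly pick the cell farthest from the boundary, then fold the
    modified distance d-bar into the field. Disc-shaped zeta_i and
    F = gamma * r_scan are illustrative assumptions."""
    field = sdf.copy()
    H, W = field.shape
    F = gamma * r_scan
    stations = []
    yy, xx = np.mgrid[0:H, 0:W]
    while True:
        idx = np.unravel_index(np.argmax(field), field.shape)
        if field[idx] < r_ig:          # no candidate satisfies SDF(S) >= r_ig
            break
        stations.append(idx)
        d = np.sqrt((yy - idx[0]) ** 2 + (xx - idx[1]) ** 2)
        dbar = np.where(d <= r_scan, (1 - gamma) * d, d - F)
        field = np.minimum(field, dbar)  # SDF update after placing S_i
    return stations
```

With $\gamma$ close to 1, the field inside each placed station's scanning area collapses toward zero, so later stations are pushed toward uncovered regions.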
To illustrate the design principles of the TLS planning network more clearly, the algorithm flow is outlined in Algorithm 1.
Algorithm 1 Laser scanning network
Require: Input 2D model Ω
Ensure: Output scan stations S
 1: Initialize: load the map
 2: Clip the map
 3: Calculate the reachable area
 4: Calculate the initial SDF
 5: for traversal of the scanning area do
 6:     Select scan station S_i based on the SDF
 7:     if viable stations exist then
 8:         Update the SDF
 9:         Update the path
10:         Append station S_i to the station list S
11:     else
12:         Solve the station traversal path
13:         Output the station list S
14:         Output the traversal path
15:     end if
16: end for
17: return S
According to Algorithm 1, we traverse the scanning area and determine new scan stations based on each SDF update. Figure 5 illustrates the locations of the new stations after each SDF update, as well as the total positions of all scan stations upon completion of the traversal.

3.2.2. Planning the Scan Stations’ Path

According to Algorithm 1, when updating the SDF, it is necessary to simultaneously update the path from the previous station to the new station. For the single-source shortest path problem, Dijkstra's algorithm is typically used. However, the standard Dijkstra algorithm only computes the shortest Manhattan distance between two points in a 2D map, while Euclidean distance is generally preferred in practice. To address this, we make the following improvements: we create a priority queue $Q$ containing pairs $\langle d, P \rangle$, sorted in ascending order of $d$, where $P$ is a reachable point in the map and $d$ is the length of a path from the given station $S_i$ to $P$. We then relax the adjacent points $N$ of $P$ by replacing the original path $S_i \to P \to N$ with the shorter path $S_i \to N$.
In this section, we introduce a jump point $H$ and its associated set of obstacle orientations $O$. When the relaxed point $N$ is itself an obstacle, as shown in Figure 6, the angular range $[\phi - \delta\phi, \phi + \delta\phi]$ is added to the set $O$. If the orientation $\phi$ of the relaxed point $N$ lies within $O$, indicating that an obstacle blocks the path between the jump point $H$ and $N$, a new jump point is created and added to the priority queue. Conversely, if $\phi$ is not within $O$, meaning that $H$ can reach $N$ directly, we relax $N$ via the jump point $H$ and add it to the priority queue.
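As a minimal baseline for the relaxation just described, the following sketches Dijkstra's algorithm over an 8-connected grid with true Euclidean edge costs. It deliberately omits the jump-point and obstacle-orientation pruning of the full method, so it is a simplified stand-in, not the paper's algorithm; the grid encoding (0 = free, 1 = obstacle) is an assumption.

```python
import heapq, math

def grid_dijkstra(grid, start):
    """Single-source shortest paths over an 8-connected grid with
    Euclidean edge costs (a simplified baseline without jump-point
    pruning). Returns a dict mapping free cells to distances."""
    H, W = len(grid), len(grid[0])
    dist = {start: 0.0}
    pq = [(0.0, start)]                      # priority queue Q of <d, P>
    while pq:
        d, (y, x) = heapq.heappop(pq)
        if d > dist.get((y, x), math.inf):
            continue                          # stale queue entry
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and not grid[ny][nx]:
                    nd = d + math.hypot(dy, dx)   # Euclidean step cost
                    if nd < dist.get((ny, nx), math.inf):
                        dist[(ny, nx)] = nd       # relax neighbor N of P
                        heapq.heappush(pq, (nd, (ny, nx)))
    return dist
```

Diagonal steps cost $\sqrt{2}$ rather than 2, which is the essential difference from a Manhattan-distance search on the same grid.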
For a clearer illustration of the path updating process, the algorithm flowchart is presented in Algorithm 2.
After completing the station planning, we need to traverse all the stations, transforming this traversal into a traveling salesman problem (TSP). Unlike the standard TSP, our stations can be revisited. Before solving the TSP, we need to improve the distance calculation algorithm previously used between stations. Due to factors such as floating-point errors, the distances obtained in earlier steps may violate the triangle inequality. To address this, we employ the Floyd algorithm for preprocessing, ensuring that the distances between stations satisfy metric properties, which helps shorten the total traversal length. Although this approach may result in some stations being revisited, it does not impact practical applications.
The traveling salesman problem [46] is an NP-hard combinatorial optimization problem, and its deterministic optimal solution algorithm is only feasible for small-scale instances. In practice, heuristic algorithms are often used, which, while not guaranteeing optimal solutions, can provide satisfactory results within a limited timeframe. We employ the Ant Colony Optimization (ACO) algorithm [47] to solve the problem.
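The Floyd preprocessing step described above, which restores the triangle inequality before the TSP is solved, can be sketched directly; the list-of-lists matrix layout is an assumption for illustration.

```python
def floyd_metricize(D):
    """Floyd-Warshall closure of the station distance matrix so that
    the triangle inequality holds before solving the TSP. D is a
    square matrix (list of lists) of pairwise path lengths; the input
    is not mutated."""
    n = len(D)
    D = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]  # route via station k (revisit allowed)
    return D
```

Entries shortened by the closure correspond exactly to tours in which a station is revisited, matching the relaxed TSP formulation used here.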
After designing the TLS scanning network and planning the scan station paths, we obtained the TLS observation network, as shown in Figure 7.
Algorithm 2 The path update algorithm
Require: Input the station S_i
Ensure: Update the SDF and path
 1: Create a priority queue Q of pairs ⟨d, P⟩
 2: while Q is not empty do
 3:     Set the current position P ← H + r
 4:     if P has been visited then
 5:         Input the next station
 6:     else
 7:         Mark P as visited
 8:         Update the SDF and path for P
 9:         Relax the feasible connected neighboring points N of P
10:         Create the set 𝒩 of such points N
11:         while traversing the set 𝒩 do
12:             Calculate the relative position r′ = r′∠ϕ between H and N
13:             if N is an obstacle then
14:                 δϕ ← arccsc(2r′)
15:                 O ← O ∪ (ϕ − δϕ, ϕ + δϕ)
16:             else
17:                 if ϕ ∈ O then
18:                     Q ← Q ∪ (d, P, O, ϕ)
19:                 else
20:                     d′ ← f(d, r, r′)
21:                     Q ← Q ∪ (d′, H, r′, O)
22:                 end if
23:             end if
24:         end while
25:     end if
26: end while
27: return SDF and path

3.3. Plug-and-Play Keypoint Extraction Module

Typically, after generating the TLS observation network, we acquire point clouds using the 3D laser scanning hardware system. Then, we propose a P2KE module to extract keypoints from the collected point clouds.

3.3.1. Geometric Partition

Directly extracting features from dense point clouds of large-scale scenes greatly increases algorithmic complexity. Therefore, we employ the geometric partitioning method described in [48] to partition the point cloud into metrically simple yet meaningful shapes, reducing the complexity.
We compute linearity, planarity, and scattering features, and additionally introduce a verticality feature, for each point of the input point cloud. We also take the height of each point as its z-coordinate in the normalized point cloud. The geometrically uniform partition is obtained by minimizing the global energy function of [49] over the 10-nearest-neighbor adjacency graph, and the $\ell_0$-cut pursuit algorithm introduced in [50] quickly finds an approximate solution within a few graph-cut iterations.
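The per-point dimensionality features can be computed from the eigen-decomposition of a neighborhood's covariance matrix. The formulas below follow the commonly used eigenvalue-ratio definitions, and the verticality definition (one minus the vertical component of the estimated normal) is an assumption here, not necessarily the exact variant used in [48].

```python
import numpy as np

def geometric_features(neigh):
    """Linearity, planarity, scattering, and verticality of a point's
    neighborhood `neigh` (N x 3 array), from the covariance
    eigen-decomposition; definitions are common eigenvalue-ratio
    variants, assumed here for illustration."""
    C = np.cov(neigh.T)
    w, v = np.linalg.eigh(C)                    # eigenvalues, ascending
    s3, s2, s1 = np.sqrt(np.maximum(w, 0.0))    # sigma3 <= sigma2 <= sigma1
    linearity = (s1 - s2) / s1
    planarity = (s2 - s3) / s1
    scattering = s3 / s1
    normal = v[:, 0]                            # eigenvector of smallest eigenvalue
    verticality = 1.0 - abs(normal[2])          # 0 for horizontal surfaces
    return linearity, planarity, scattering, verticality
```

A flat horizontal patch yields planarity near 1 and verticality near 0; a thin linear structure yields linearity near 1.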

3.3.2. Centroid-Based Useless Points Removal

By using geometric partition, we can obtain a large-scale point cloud composed of $N$ point cloud segments $p_i$, $i = 1, \ldots, N$. Among these segments, there are important ones, such as walls, floors, and columns, which typically contain a larger number of points. In contrast, disordered structures yield many segments with fewer points after partitioning. Therefore, we can eliminate unnecessary point cloud segments based on their point counts, retaining only the useful ones.
Due to the varying point cloud structures in each scene, the number of point cloud segments resulting from geometric partition differs across scenarios. To effectively eliminate useless points from all large-scale point clouds, we sort the point cloud segments in descending order according to the number of points contained in each segment and then calculate the centroid of the sequence using the following formula:
$$C = \frac{1}{T} \sum_{i=1}^{N} n_i \times i,$$
Here, $C$ represents the centroid of the sequence, $T$ is the total number of points in the whole point cloud, $N$ is the number of point cloud segments, $n_i$ denotes the number of points in the $i$-th segment, and $i$ indicates the rank of the $i$-th segment in the sorted sequence.
After calculating the centroid of the sequence, we round the floating-point centroid value up to an integer. We then discard segments whose point counts fall below that of the segment at the centroid rank and retain the rest. This centroid-based removal eliminates the majority of unnecessary point cloud segments.
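The CBUPR filtering step can be sketched as follows. One reading assumption is made explicit here: "the centroid" used as the cut-off is interpreted as the point count of the segment at the rounded-up centroid rank in the descending order.

```python
import math

def cbupr(segments):
    """Centroid-based useless points removal sketch: sort segments by
    point count (descending), compute the weighted-rank centroid C of
    Eq. (6), round up, and keep segments at least as large as the
    segment at that rank (a reading assumption of the cut-off)."""
    segs = sorted(segments, key=len, reverse=True)
    T = sum(len(s) for s in segs)                       # total point count
    C = sum(len(s) * (i + 1) for i, s in enumerate(segs)) / T
    c = math.ceil(C)                                    # centroid rank, rounded up
    threshold = len(segs[c - 1])                        # count at the centroid rank
    return [s for s in segs if len(s) >= threshold]
```

Because large structural segments (walls, floors) dominate $T$, the centroid rank stays small and the many tiny clutter segments fall below the cut-off.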

3.3.3. Fibonacci Voxel Downsampling

After useless points removal, we obtain point cloud segments that preserve geometric information. However, directly registering these point cloud segments requires significant computational resources. Therefore, we need to downsample the point clouds while ensuring the preservation of their overall structure.
Common downsampling methods include voxel grid, random sampling, uniform sampling, clustering sampling, curvature-based sampling, and PCA sampling. Among these, random sampling can lead to the loss of important features, while uniform sampling is not suitable for complex geometries. Although clustering, curvature-based, and PCA sampling methods can retain significant geometric features, they tend to be computationally expensive and slower. Additionally, clustering sampling is sensitive to parameter variations, resulting in low robustness; curvature-based sampling is not effective for planar features; and PCA sampling struggles with point clouds that exhibit significant structural changes. Therefore, we chose the voxel grid method for downsampling due to its simplicity and ease of implementation. This approach offers fast computation and effectively preserves the overall structure of the point cloud, meeting our requirements.
When performing voxel grid downsampling, it is essential to ensure that different point clouds are reduced to a similar scale. To achieve this, we adjust the voxel size according to the number of points in each cloud, employing the Fibonacci function to adapt the voxel size rapidly so that all point clouds are downsampled to a unified scale. The specific process is illustrated in Algorithm 3.
Algorithm 3 Fibonacci Voxel Downsampling
Require: Input point clouds of different magnitudes: P_s
Ensure: Output point clouds of the same magnitude: P_t
 1: Fibonacci function f(n)
 2: for pointCloud in pointClouds do
 3:     for i in f(20) do
 4:         voxelSize = 0.6 ÷ (i + 1)
 5:         p_t = voxelDownSample(voxelSize)
 6:         if len(p_t.points) > 10000 then
 7:             break
 8:         end if
 9:     end for
10:     P_t.append(p_t)
11: end for
12: return P_t

3.4. Registration

After keypoint extraction, we obtain the overall structural information of the large-scale scene. Next, we adopt PointNetLK revisited [33] as an example for point cloud registration. The registration pipeline is illustrated in Figure 8.

3.4.1. Feature Extraction Based on PointNet

To efficiently extract features from point clouds of varying structures, we employ the PointNet algorithm [51]. As illustrated in Figure 8, the PointNet architecture consists of two fully connected layers and a max pooling layer. After keypoint extraction, we set the point cloud size to approximately 10,000 points. Using the extraction network, we obtain 1024 features from this point cloud. The fully connected layers do not alter the number of points, allowing us to derive a weight matrix with the same dimensions, which greatly facilitates the computation of the Jacobian form in the LK algorithm. This is one of the key reasons for choosing PointNet.
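As a rough illustration of this design, the sketch below builds a toy PointNet-style extractor with randomly initialized weights (the layer widths 3 → 64 → 1024 are our assumption; the real network is trained): two shared fully connected layers applied per point, followed by max pooling, so the 1024-dimensional global feature is invariant to point ordering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (untrained) shared layers, widths 3 -> 64 -> 1024; the text
# describes two fully connected layers followed by a max pooling layer.
W1, b1 = rng.standard_normal((3, 64)) * 0.1, np.zeros(64)
W2, b2 = rng.standard_normal((64, 1024)) * 0.1, np.zeros(1024)

def pointnet_features(points):
    """points: (N, 3) -> (1024,) global feature vector. The shared layers
    act on every point independently, and max pooling over the point
    dimension makes the descriptor invariant to point ordering."""
    h = np.maximum(points @ W1 + b1, 0.0)   # shared FC + ReLU
    h = np.maximum(h @ W2 + b2, 0.0)        # shared FC + ReLU
    return h.max(axis=0)                    # symmetric max pool

pts = rng.standard_normal((1_000, 3))
feat = pointnet_features(pts)
```

Because the per-point feature map keeps shape (N, 1024) before pooling, the layer weights give a point-wise weight matrix of matching dimensions, which is what makes the Jacobian of the LK algorithm convenient to form.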

3.4.2. Transformation Matrix Estimation Based on LK

After obtaining the features from the point clouds, we need to compute the transformation matrix between the two point clouds. As shown in Figure 8, $P_T$ and $P_S$ denote the target and source point clouds, respectively. The transformation matrix $G \in SE(3)$ can be expressed through the exponential map
$$G = \exp\Big(\sum_i \epsilon_i \mathbf{T}_i\Big), \qquad \epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_6)^{T},$$
where $\mathbf{T}_i$ are the generator matrices of the exponential map and $\epsilon \in \mathbb{R}^6$ is the vector of transformation parameters.
According to the inverse composition (IC) formulation of the LK algorithm, we can establish the relationship between $P_T$ and $P_S$ as
$$\Phi(P_T) = \Phi(G \cdot P_S),$$
where $\Phi$ denotes the PointNet function and $\Phi(P)$ is the $K$-dimensional feature representation vector. Linearizing this equation yields
$$\Phi(P_S) = \Phi(P_T) + \frac{\partial \Phi(G^{-1} \cdot P_T)}{\partial \epsilon}\, \epsilon.$$
The Jacobian matrix is
$$J = \frac{\partial \Phi(G^{-1} \cdot P_T)}{\partial \epsilon},$$
where $J \in \mathbb{R}^{K \times 6}$. Computing $J$ requires an analytical expression for the gradient of $\Phi$ with respect to the transformation parameters of $G$.
Based on [33], we can derive the new Jacobian matrix
$$J = \mathrm{Pool}\!\left( \nabla Z_L(P_T)^{T}\, \frac{\partial \big(G^{-1}(\epsilon) \cdot P_T\big)}{\partial \epsilon^{T}} \right),$$
where $\mathrm{vec}(\cdot)$ is the vectorization operator, $Z_L = \mathrm{ReLU}(\mathrm{BN}_L(A_L Z_{L-1} + b_L))$, $A$ is a matrix transformation, $b$ is the bias term, $\mathrm{BN}(\cdot)$ is the batch normalization layer, $\mathrm{ReLU}(\cdot)$ is the element-wise rectified linear unit function, and $L$ indexes the network's $L$-th layer.
In this way, the transformation parameters $\epsilon$ can be solved as
$$\epsilon = J^{+} \big[\Phi(P_S) - \Phi(P_T)\big],$$
where $J^{+}$ is the generalized inverse of $J$. The iterative algorithm in this article uses this equation to compute the optimal transformation parameters $\epsilon$ and then updates the source point cloud $P_S$ as
$$P_S \leftarrow \Delta G \cdot P_S, \qquad \Delta G = \exp\Big(\sum_i \epsilon_i \mathbf{T}_i\Big).$$
The final rigid transformation estimate $G_{est}$ is the composition of all incremental estimates computed in the iterative loop:
$$G_{est} = \Delta G_n \cdots \Delta G_1 \cdot \Delta G_0.$$
The iteration stops when the computed transformation increment $\Delta G$ falls below a preset minimum threshold. In this way, the entire process builds a relationship model between the transformation matrix and the feature vector.
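The update loop above can be sketched as follows. The closed-form $SE(3)$ exponential map and the $(\omega, v)$ twist ordering are standard conventions we adopt here, while `phi` and `jacobian_pinv` stand in for the trained PointNet feature map $\Phi$ and the precomputed pseudoinverse $J^{+}$ (all names are ours, not the paper's):

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix [w]x of w in R^3."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def se3_exp(eps):
    """Closed-form exponential map from a twist eps = (omega, v) in R^6
    to a 4x4 rigid transform in SE(3)."""
    w, v = np.asarray(eps[:3], float), np.asarray(eps[3:], float)
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-10:
        R, V = np.eye(3) + W, np.eye(3)   # small-angle limit
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1.0 - np.cos(theta)) / theta**2 * W @ W)
        V = (np.eye(3) + (1.0 - np.cos(theta)) / theta**2 * W
             + (theta - np.sin(theta)) / theta**3 * W @ W)
    G = np.eye(4)
    G[:3, :3], G[:3, 3] = R, V @ v
    return G

def ic_lk_register(phi, jacobian_pinv, P_T, P_S, max_iter=20, tol=1e-7):
    """Skeleton of the IC-LK loop: eps = J+ [Phi(P_S) - Phi(P_T)],
    P_S <- exp(eps) . P_S, accumulating G_est = dG_n ... dG_1 dG_0."""
    G_est = np.eye(4)
    for _ in range(max_iter):
        eps = jacobian_pinv @ (phi(P_S) - phi(P_T))
        dG = se3_exp(eps)
        P_S = (dG[:3, :3] @ P_S.T).T + dG[:3, 3]
        G_est = dG @ G_est
        if np.linalg.norm(eps) < tol:   # increment below threshold
            break
    return G_est
```

For a toy translation-only feature (e.g., the cloud centroid) with the IC Jacobian pseudoinverse $J^{+} = [0;\, -I]$, a single iteration of this loop recovers the offset between the two clouds.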

4. Experiments

In this section, our method is applied primarily to indoor environments with flat floors, and a quality assessment is carried out to demonstrate the effectiveness of the solution and the quality of the registered point clouds. Our approach is validated using the 3D laser scanning hardware system shown in Figure 9. From top to bottom, the hardware components of this system include a terrestrial laser scanner, an adjustable lift, and a movable chassis equipped with a drive system, a single-line LiDAR, an odometer, and an IMU. The main function of the movable chassis is to generate a 2D model of the indoor environment and autonomously transport the scanner to the scan stations, thereby completing the planned navigation. Any robotic platform equipped with mapping and navigation capabilities can accomplish this task.

4.1. Datasets, Metrics, and System Experiment Results

4.1.1. Datasets

The 3DMatch dataset [17] contains 62 scenes, of which 46 are used for training, 8 for validation, and 8 for testing. We use the data preprocessed by [52], and the plug-and-play keypoint extraction module is employed to generate a sparse point cloud dataset that preserves global structural features and planar features for training and testing.
The Whu-TLS dataset [18] contains 11 scenes, of which we use 7 for training, 2 for validation, and 2 for testing. Following the same preprocessing steps as for 3DMatch, the plug-and-play keypoint extraction module produces a sparse point cloud that preserves global geometric structures and planar features for training and testing.

4.1.2. Metrics

We primarily evaluate the performance of the TLS observation network and point cloud registration. The performance of the TLS observation network is assessed using three metrics: (1) Planning Stations Number (PSN), which represents the number of planned scan stations in the 2D model—the fewer stations required for the same model, the better the performance; (2) Planning Stations Time (PST), which is the time required for station planning in the 2D model—a shorter planning time for the same model indicates better performance; and (3) Planning Paths Time (PPT), which is the time needed for path planning—a shorter path planning time for the same model indicates superior performance. The performance of point cloud registration is evaluated using three metrics: (1) Relative Rotation Error (RRE), the geodesic distance between estimated and ground-truth rotation matrices, (2) Relative Translation Error (RTE), the Euclidean distance between estimated and ground-truth translation vectors, and (3) Registration Recall (RR), the fraction of point cloud pairs whose RRE and RTE are both below predefined thresholds.
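The three registration metrics can be computed as in the straightforward sketch below; the threshold values in `registration_recall` are placeholders of our choosing, since the text only says the thresholds are predefined.

```python
import numpy as np

def rre_deg(R_est, R_gt):
    """Relative Rotation Error: geodesic distance on SO(3), in degrees."""
    c = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

def rte(t_est, t_gt):
    """Relative Translation Error: Euclidean distance between translations."""
    return np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt))

def registration_recall(pairs, rre_thresh=15.0, rte_thresh=0.3):
    """Fraction of (R_est, t_est, R_gt, t_gt) pairs with both errors
    below the thresholds (threshold values here are illustrative)."""
    ok = [rre_deg(Re, Rg) < rre_thresh and rte(te, tg) < rte_thresh
          for Re, te, Rg, tg in pairs]
    return sum(ok) / len(ok)
```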

4.1.3. System Experiment Results

Using the designed automatic scanning and marker-free registration system, we successfully implemented the TLS observation network and multi-station point cloud registration for the “Bianliang Yimeng” theater. The theater is located within the Qingming Shanghe Garden in Hengdian, Zhejiang, and features a stage with dimensions of 30 m × 26 m. As shown in Figure 10, this is the registered multi-station point cloud of the “Bianliang Yimeng” theater. Red stars indicate the station locations, and blue lines represent the planned station paths.

4.2. Experiment on the Denoised 2D Model and the TLS Observation Network

The TLS observation network consists of two components: the TLS scanning network and scan station path planning. Our methodology is implemented primarily in C++17 and Python 3.8 and operates within the Robot Operating System (ROS) framework. The experiments were conducted on a system equipped with a processor clocked at 2.5 GHz, 8 GB of RAM, and 128 GB of storage.

4.2.1. Impact of the Denoised 2D Model

Using a movable chassis equipped with a drive system, a single-line LiDAR, an odometer, and an IMU, we perform indoor mapping and generate a 2D model. Nevertheless, the limited precision of the single-line LiDAR introduces substantial noise into the resulting 2D model. To address this issue, we employed a non-local means denoising method, effectively reducing noise while preserving the details and structural information of the model. As shown in Figure 11, the number of planned scan stations changed under the same scanning parameters: there were 18 stations before denoising, which decreased to 15 after denoising. This outcome indicates a significant improvement in denoising effectiveness.

4.2.2. Impact of the Repulsion Coefficient

The repulsion coefficient γ serves as a weight parameter for the two strategies used to update the SDF values. As described in Section 3.2.1, a larger repulsion coefficient increases the weight of the scanning area at the stations, ensuring that the spacing between stations does not become excessively dense. However, this can lead to blind spots that may be missed during scanning; in such cases, reducing the minimum blind spot radius can mitigate this issue. Conversely, when the repulsion coefficient is smaller, although blind spots are less likely to occur, the stations may become too densely packed, resulting in significant overlap in the scanned areas. We utilize the denoised 2D model of the “Bianliang Yimeng” theater, setting the scanning range with a maximum distance of 10 m and a minimum distance of 0.5 m.
At the same time, we set the minimum blind spot radius to 1. Figure 12 presents the number of stations and the corresponding station planning times for repulsion coefficients of 1, 0.9, 0.7, and 0.5. From Figure 12, it is evident that the optimal number of scan stations occurs when the repulsion coefficient is set to 0.9. However, when updating stations, new locations may fall within the scanning range of existing stations, leading to potential blind spots that remain unscanned. To address this issue, we adjust the minimum blind spot radius to plan new stations and eliminate these blind spots.

4.2.3. Impact of the Minimum Blind Spot Radius

When the repulsion coefficient is high, it helps maintain sufficient spacing between scan stations; however, it may also lead to overlooked blind spots. For instance, the brown areas in Figure 12a,b indicate these blind spots. To address this issue, we introduced a minimum blind spot radius as a criterion for determining whether the SDF update is complete. Figure 13 illustrates the scan positions of stations when the minimum blind spot radius is either greater than or less than the maximum SDF value of the blind spot region. Figure 13 demonstrates that setting the minimum blind spot radius to a value smaller than the maximum SDF value of the blind spot area effectively resolves the issue of blind spots. This configuration ensures effective coverage of blind spots while maintaining sufficient overlap between scan stations, optimizing the scanning process, and facilitating subsequent registration tasks.

4.2.4. Scanning Station Path Planning Experimental Results

After planning the scan stations, our movable chassis also requires path planning to complete the subsequent scanning tasks. We model the path planning problem as a traveling salesman problem (TSP) and integrate the Floyd algorithm for preprocessing, allowing for repeated visits to stations. Figure 14 illustrates the scan station path calculated using the ant colony algorithm, with arrows indicating the order of station traversal. This path planning enables us to efficiently complete the scanning process.
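A minimal sketch of the path-planning preprocessing follows: Floyd–Warshall converts the station graph into an all-pairs shortest-path metric, which is what makes repeated visits to stations permissible in the tour. A nearest-neighbor tour is used here purely as a simple stand-in for the ant colony optimizer; function names and the greedy heuristic are our assumptions.

```python
import numpy as np

def floyd_warshall(adj):
    """All-pairs shortest-path distances. adj[i, j] is the direct edge
    length between stations i and j (np.inf if unconnected). Planning a
    tour on these distances implicitly allows repeated station visits."""
    d = adj.astype(float).copy()
    for k in range(len(d)):
        # Relax every pair (i, j) through intermediate station k.
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    return d

def greedy_tour(dist, start=0):
    """Nearest-neighbor tour over the shortest-path metric - a simple
    stand-in for the ant colony optimizer used in the paper."""
    unvisited = set(range(len(dist))) - {start}
    tour, cur = [start], start
    while unvisited:
        cur = min(unvisited, key=lambda j: dist[cur, j])
        tour.append(cur)
        unvisited.remove(cur)
    return tour
```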
We also compared our approach with examples from [8,9]. Figure 15a illustrates the floor plan of the MacEwan Student Centre at the University of Calgary, which spans approximately 120 m by 90 m. Restricted areas were addressed by prohibiting the instantiation of candidate station points within them, following the approach described in [9]; these areas do not represent obstacles, so the laser beam can pass through them. They correspond to the gray shaded areas in Figure 15a. For station planning, the prior works adopted either a hierarchical pipeline [8] or a Mixed-Integer Linear Programming method [9], whereas we adopted the SDF-based method, which yields fewer scan stations. Figure 15 illustrates the comparison results for the MacEwan Student Centre, where our approach requires the smallest number of stations.

4.2.5. Performance Analysis of TLS Observation Network Runtime

In addition to ensuring coverage of scan stations and efficient path planning, the TLS observation network must also complete the simulation planning in a timely manner. As illustrated in Table 1, the results show that we can effectively plan scan stations and the paths between them within a short period. This demonstrates the efficiency of our approach in achieving operational speed. Here, RC refers to the repulsion coefficient, PST stands for planning station time, and PPT represents planning path time.
As illustrated in Table 2, computation time was compared with the approach in [9] using the model of the MacEwan Student Center. Their method required 150 s, whereas ours completed in 34.8 s. While their system featured a 4.0 GHz CPU and 128 GB RAM, ours had only a 2.5 GHz CPU, 8 GB RAM, and 128 GB storage. Despite these hardware limitations, our approach demonstrated superior efficiency.

4.3. The Results of the Keypoints Extraction Experiment

As is well known, large-scale dense point clouds can reach tens of millions or even hundreds of millions of points. Achieving precise registration directly with traditional algorithms, such as ICP and its variants, can be challenging due to the sheer volume of data. Therefore, it is essential to extract keypoints from the vast number of points in order to facilitate subsequent registration processes. Below are the experimental results of our keypoint feature extraction.

4.3.1. The Results of the Geometric Segmentation Experiment

Geometric segmentation enables more accurate identification and separation of complex-shaped point clouds by analyzing geometric features, which greatly facilitates subsequent keypoint extraction. Therefore, we applied geometric segmentation to divide the large-scale point cloud. Figure 16 shows the segmentation results for a single scan station of the “Bianliang Yimeng” theater.

4.3.2. The Results of Centroid-Based Useless Points Removal Experiment

Typically, noise in point clouds consists of scattered, low-density points. To address this, we calculated the centroids of the segmented point cloud clusters and removed all points that fell below the centroid density threshold. This process effectively eliminated a significant portion of the irrelevant point cloud data. Figure 17 shows the results of our outlier removal experiment.
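One plausible reading of this criterion can be sketched as follows; the exact density measure and threshold are not specified in the text, so both the points-per-spherical-volume measure around the centroid and the threshold value below are our assumptions.

```python
import numpy as np

def remove_sparse_clusters(clusters, density_thresh=100.0):
    """Keep only clusters whose density around the cluster centroid
    exceeds a threshold. Density = points per volume of the sphere whose
    radius is the mean point-to-centroid distance (an assumed measure;
    scattered low-density noise clusters fall below the threshold)."""
    kept = []
    for pts in clusters:
        centroid = pts.mean(axis=0)
        r = max(np.linalg.norm(pts - centroid, axis=1).mean(), 1e-6)
        density = len(pts) / (4.0 / 3.0 * np.pi * r**3)
        if density >= density_thresh:
            kept.append(pts)
    return kept
```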

4.3.3. The Results of Fibonacci Voxel Downsampling Experiment

The Fibonacci voxel downsampling experiment demonstrated an effective reduction in point cloud density while preserving the overall geometric structure. By dynamically adjusting the voxel size according to the Fibonacci sequence, we achieved consistent downsampling across varying scales of point clouds. In our experiments, we downsampled the point clouds to a magnitude of 10,000 points. The results of the downsampling process are shown in Figure 18.

4.4. The Results of Registration Experiment

We adopt ICP, REGTR, and PointNetLK revisited as backbones for point cloud registration. We conduct quantitative comparison experiments across different datasets. For a more comprehensive comparison, we chose the small-scale indoor dataset 3DMatch and the large-scale TLS dataset Whu-TLS as testing datasets. At the same time, we adopt the random downsampling method to compare with our P2KE method. In the 3DMatch dataset, we extract keypoints from the point cloud at the thousand-point level for registration, while in the Whu-TLS dataset, we extract keypoints from the point cloud at the ten-thousand-point level for registration. The specific experimental results are as follows.
As a small-scale indoor dataset, 3DMatch contains 62 scenes, of which 46 are used for training, 8 for validation, and 8 for testing. We use the testing data preprocessed by [52] and evaluate it using the 3DMatch [52] protocols. The comparison results of the 3DMatch dataset, shown in Table 3, reveal that our method outperforms the others, and the REGTR method demonstrates superior accuracy and robustness.
The comparison results of time, shown in Table 4, reveal that our method outperforms the others.
As a large-scale TLS dataset, Whu-TLS comprises 115 point clouds and over 1.74 billion 3D points collected from 11 different environments (i.e., subway station, high-speed railway platform, mountain, forest, park, campus, residence, riverbank, heritage building, underground excavation, and tunnel environments) with variations in point density, clutter, and occlusion. We use the dataset for testing. The comparison results of the Whu-TLS dataset, shown in Table 5, reveal that our method outperforms the others, and the PointNetLK revisited method demonstrates superior accuracy and robustness.
Additionally, as shown in Figure 19, the point clouds before and after registration demonstrate significant improvement. The left side shows the point cloud before registration, while the right side displays the point cloud after registration. The purple represents the original point cloud, the orange indicates the target point cloud, and the green represents the target point cloud after registration.

5. Limitations

In the FAIS framework, the TLS observation network employs the SDF algorithm for station planning. While this approach ensures 100% coverage of the scanned space, the structural information pertaining to the interior roof is disregarded in this process. Future work will explore applying SDF-based station planning to complete 3D models. For point cloud registration, consistent point cloud counts are required due to the varying scales of the training datasets. However, large-scale datasets can cause huge memory usage and computational cost during training. To address this, we apply different downsampling ratios for different datasets to achieve a balance between performance and efficiency.
In addition, the framework is built on a movable chassis architecture. However, due to hardware constraints, it is challenging to operate in non-planar environments, such as those with staircases or uneven flooring. To address this limitation, we plan to employ a climbing-capable chassis or an unmanned aerial vehicle (UAV) as the carrier to enable the implementation of large-scale surveying algorithms across a wider range of scenarios.

6. Conclusions

This article presents a novel, fully automatic indoor surveying framework for large-scale indoor environments. This is the first attempt to incorporate SDF into a TLS observation network to optimize scan stations for large-scale environments. We also present a new plug-and-play module for keypoint extraction. Utilizing the developed 3D scanning system, the entire process of station planning, scanning, and large-scale point cloud registration can be fully automated. This approach effectively addresses issues such as unscanned areas and excessive overlap between scan stations, resulting in a complete, large-scale point cloud representation. It significantly reduces the time required for station scanning and enhances registration robustness, ultimately saving both labor and time costs.
For future work, we will continue to optimize the station planning algorithm by extending it to 3D models. This will involve utilizing three-dimensional spatial information to refine the positioning of stations. Additionally, we plan to collect more point cloud scenes to provide a larger dataset for large-scale point cloud registration. By updating the keypoint extraction algorithm, we aim to achieve even better registration results.

Author Contributions

Conceptualization, W.L. and T.J.; methodology, W.L. and S.G.; validation, S.G.; formal analysis, W.L. and Y.Z.; investigation, T.J.; resources, H.W.; writing—original draft preparation, W.L.; writing—review and editing, S.G. and H.W.; visualization, S.G.; supervision, Y.L.; project administration, W.L. and S.G.; funding acquisition, T.J. and S.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key Research and Development Project of China under Grant 2022YFF0902401, in part by the National Natural Science Foundation of China under Grant U22A2063 and 62173083, in part by the Major Program of National Natural Science Foundation of China under Grant 71790614, in part by the 111 Project under Grant B16009, and in part by the Liaoning Provincial “Selecting the Best Candidates by Opening Competition Mechanism” Science and Technology Program under Grant 2023JH1/10400045; the Fundamental Research Funds for the Central Universities N2404024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The small-scale indoor dataset used in this study is 3DMatch, freely accessible at https://3dmatch.cs.princeton.edu/, accessed on 17 May 2024. The large-scale TLS dataset is Whu-TLS, which requires an application at https://docs.google.com/forms/d/e/1FAIpQLSfyK4ulGLzWC0igA1L4NXeR-quSqOSfEH5I-DBYUXoDPC6Aqg/viewform, accessed on 24 August 2024.

Acknowledgments

The authors thank the providers of point cloud, including 3DMatch and Whu-TLS, for supporting the experiments conducted in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rashidi, M.; Mohammadi, M.; Sadeghlou Kivi, S.; Abdolvand, M.M.; Truong-Hong, L.; Samali, B. A decade of modern bridge monitoring using terrestrial laser scanning: Review and future directions. Remote Sens. 2020, 12, 3796. [Google Scholar] [CrossRef]
  2. Jia, D.; Zhang, W.; Liu, Y. Systematic approach for tunnel deformation monitoring with terrestrial laser scanning. Remote Sens. 2021, 13, 3519. [Google Scholar] [CrossRef]
  3. Muumbe, T.P.; Baade, J.; Singh, J.; Schmullius, C.; Thau, C. Terrestrial laser scanning for vegetation analyses with a special focus on savannas. Remote Sens. 2021, 13, 507. [Google Scholar] [CrossRef]
  4. Chen, Z.; Zhang, W.; Huang, R.; Dong, Z.; Chen, C.; Jiang, L.; Wang, H. 3D model-based terrestrial laser scanning (TLS) observation network planning for large-scale building facades. Autom. Constr. 2022, 144, 104594. [Google Scholar] [CrossRef]
  5. Guan, H.; Su, Y.; Sun, X.; Xu, G.; Li, W.; Ma, Q.; Wu, X.; Wu, J.; Liu, L.; Guo, Q. A marker-free method for registering multi-scan terrestrial laser scanning data in forest environments. ISPRS J. Photogramm. Remote Sens. 2020, 166, 82–94. [Google Scholar] [CrossRef]
  6. Wang, H.; Liu, Y.; Dong, Z.; Guo, Y.; Liu, Y.S.; Wang, W.; Yang, B. Robust multiview point cloud registration with reliable pose graph initialization and history reweighting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 9506–9515. [Google Scholar]
  7. Li, S.; Zhu, J.; Xie, Y.; Hu, N.; Wang, D. Matching distance and geometric distribution aided learning multiview point cloud registration. IEEE Robot. Autom. Lett. 2024, 9, 9319–9326. [Google Scholar] [CrossRef]
  8. Jia, F.; Lichti, D.D. A Model-Based Design System for Terrestrial Laser Scanning Networks in Complex Sites. Remote Sens. 2019, 11, 1749. [Google Scholar] [CrossRef]
  9. Dehbi, Y.; Leonhardt, J.; Oehrlein, J.; Haunert, J.H. Optimal scan planning with enforced network connectivity for the acquisition of three-dimensional indoor models. ISPRS J. Photogramm. Remote Sens. 2021, 180, 103–116. [Google Scholar] [CrossRef]
  10. Achakir, F.; El Fkihi, S.; Mouaddib, E.M. Non-Model-Based approach for complete digitization by TLS or mobile scanner. ISPRS J. Photogramm. Remote Sens. 2021, 178, 314–327. [Google Scholar] [CrossRef]
  11. Otero, R.; Lagüela, S.; Garrido, I.; Arias, P. Mobile indoor mapping technologies: A review. Autom. Constr. 2020, 120, 103399. [Google Scholar] [CrossRef]
  12. Soudarissanane, S.; Lindenbergh, R. Optimizing terrestrial laser scanning measurement set-up. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, 38, 127–132. [Google Scholar] [CrossRef]
  13. Yin, J.; Luo, D.; Yan, F.; Zhuang, Y. A novel LiDAR-assisted monocular visual SLAM framework for mobile robots in outdoor environments. IEEE Trans. Instrum. Meas. 2022, 71, 8503911. [Google Scholar] [CrossRef]
  14. Oleynikova, H.; Millane, A.; Taylor, Z.; Galceran, E.; Nieto, J.; Siegwart, R. Signed distance fields: A natural representation for both mapping and planning. In Proceedings of the RSS 2016 Workshop: Geometry and Beyond-Representations, Physics, and Scene Understanding for Robotics, Ann Arbor, MI, USA, 19 June 2016. [Google Scholar]
  15. Reijgwart, V.; Millane, A.; Oleynikova, H.; Siegwart, R.; Cadena, C.; Nieto, J. Voxgraph: Globally consistent, volumetric mapping using signed distance function submaps. IEEE Robot. Autom. Lett. 2019, 5, 227–234. [Google Scholar] [CrossRef]
  16. Liu, J.; Wang, G.; Liu, Z.; Jiang, C.; Pollefeys, M.; Wang, H. Regformer: An efficient projection-aware transformer network for large-scale point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 8451–8460. [Google Scholar]
  17. Zeng, A.; Song, S.; Nießner, M.; Fisher, M.; Xiao, J.; Funkhouser, T. 3dmatch: Learning local geometric descriptors from rgb-d reconstructions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1802–1811. [Google Scholar]
  18. Dong, Z.; Liang, F.; Yang, B.; Xu, Y.; Zang, Y.; Li, J.; Wang, Y.; Dai, W.; Fan, H.; Hyyppä, J.; et al. Registration of large-scale terrestrial laser scanner point clouds: A review and benchmark. ISPRS J. Photogramm. Remote Sens. 2020, 163, 327–342. [Google Scholar] [CrossRef]
  19. Yang, B.; Zang, Y. Automated registration of dense terrestrial laser-scanning point clouds using curves. ISPRS J. Photogramm. Remote Sens. 2014, 95, 109–121. [Google Scholar] [CrossRef]
  20. Zhang, W.; Chen, Z.; Huang, R.; Dong, Z.; Jiang, L.; Xia, Y.; Chen, B.; Wang, H. Multiobjective optimization-based terrestrial laser scanning layout planning for landslide monitoring. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5702115. [Google Scholar] [CrossRef]
  21. Aryan, A.; Bosché, F.; Tang, P. Planning for terrestrial laser scanning in construction: A review. Autom. Constr. 2021, 125, 103551. [Google Scholar] [CrossRef]
  22. Monica, R.; Aleotti, J. Surfel-based next best view planning. IEEE Robot. Autom. Lett. 2018, 3, 3324–3331. [Google Scholar] [CrossRef]
  23. Chen, M.; Koc, E.; Shi, Z.; Soibelman, L. Proactive 2D model-based scan planning for existing buildings. Autom. Constr. 2018, 93, 165–177. [Google Scholar] [CrossRef]
  24. Li, L.; Mu, X.; Soma, M.; Wan, P.; Qi, J.; Hu, R.; Zhang, W.; Tong, Y.; Yan, G. An iterative-mode scan design of terrestrial laser scanning in forests for minimizing occlusion effects. IEEE Trans. Geosci. Remote Sens. 2020, 59, 3547–3566. [Google Scholar] [CrossRef]
  25. Heidari Mozaffar, M.; Varshosaz, M. Optimal placement of a terrestrial laser scanner with an emphasis on reducing occlusions. Photogramm. Rec. 2016, 31, 374–393. [Google Scholar] [CrossRef]
  26. Biswas, H.K.; Bosché, F.; Sun, M. Planning for scanning using building information models: A novel approach with occlusion handling. In Proceedings of the 32nd International Symposium on Automation and Robotics in Construction and Mining. International Association for Automation and Robotics in Construction, Oulu, Finland, 15–18 June 2015; pp. 1–8. [Google Scholar]
  27. Tabib, W.; Goel, K.; Yao, J.; Boirum, C.; Michael, N. Autonomous cave surveying with an aerial robot. IEEE Trans. Robot. 2021, 38, 1016–1032. [Google Scholar] [CrossRef]
  28. Huang, X.; Qu, W.; Zuo, Y.; Fang, Y.; Zhao, X. IMFNet: Interpretable multimodal fusion for point cloud registration. IEEE Robot. Autom. Lett. 2022, 7, 12323–12330. [Google Scholar] [CrossRef]
  29. Lin, Z.H.; Huang, S.Y.; Wang, Y.C.F. Convolution in the cloud: Learning deformable kernels in 3d graph convolution networks for point cloud analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1800–1809. [Google Scholar]
  30. Sun, L. RANSIC: Fast and highly robust estimation for rotation search and point cloud registration using invariant compatibility. IEEE Robot. Autom. Lett. 2021, 7, 143–150. [Google Scholar] [CrossRef]
  31. Zhao, H.; Zhuang, H.; Wang, C.; Yang, M. G3DOA: Generalizable 3D descriptor with overlap attention for point cloud registration. IEEE Robot. Autom. Lett. 2022, 7, 2541–2548. [Google Scholar] [CrossRef]
  32. Liu, J.; Liu, Y.; Meng, Z. Point cloud registration leveraging structural regularity in Manhattan world. IEEE Robot. Autom. Lett. 2022, 7, 7888–7895. [Google Scholar] [CrossRef]
  33. Li, X.; Pontes, J.K.; Lucey, S. Pointnetlk revisited. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 12763–12772. [Google Scholar]
  34. Yew, Z.J.; Lee, G.H. Rpm-net: Robust point matching using learned features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11824–11833. [Google Scholar]
  35. Ali, S.A.; Kahraman, K.; Reis, G.; Stricker, D. Rpsrnet: End-to-end trainable rigid point set registration network using barnes-hut 2d-tree representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13100–13110. [Google Scholar]
  36. Yew, Z.J.; Lee, G.H. Regtr: End-to-end point cloud correspondences with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 6677–6686. [Google Scholar]
  37. Hertz, A.; Hanocka, R.; Giryes, R.; Cohen-Or, D. Pointgmm: A neural gmm network for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12054–12063. [Google Scholar]
  38. Yuan, W.; Eckart, B.; Kim, K.; Jampani, V.; Fox, D.; Kautz, J. Deepgmr: Learning latent gaussian mixture models for registration. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part V 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 733–750. [Google Scholar]
  39. Besl, P.J.; McKay, N.D. Method for registration of 3-D shapes. In Proceedings of the Sensor Fusion IV: Control Paradigms and Data Structures, Boston, MA, USA, 12–15 November 1991; SPIE: Bellingham, WA, USA, 1992; Volume 1611, pp. 586–606.
  40. Li, L.; Mei, S.; Ma, W.; Liu, X.; Li, J.; Wen, G. An Adaptive Point Cloud Registration Algorithm Based on Cross Optimization of Local Feature Point Normal and Global Surface. IEEE Trans. Autom. Sci. Eng. 2023, 21, 6434–6447.
  41. Lin, Z.H.; Zhang, C.Y.; Lin, X.M.; Lin, H.; Zeng, G.H.; Chen, C.P. Low-Overlap Point Cloud Registration via Correspondence Augmentation. IEEE Trans. Autom. Sci. Eng. 2024, 22, 9363–9375.
  42. Guo, S.; Tang, F.; Liu, B.; Fu, Y.; Wu, Y. An Accurate Outlier Rejection Network with Higher Generalization Ability for Point Cloud Registration. IEEE Robot. Autom. Lett. 2023, 8, 4649–4656.
  43. Grisetti, G.; Stachniss, C.; Burgard, W. Improved techniques for grid mapping with Rao-Blackwellized particle filters. IEEE Trans. Robot. 2007, 23, 34–46.
  44. Buades, A.; Coll, B.; Morel, J.M. A non-local algorithm for image denoising. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 2, pp. 60–65.
  45. Dijkstra, E.W. A note on two problems in connexion with graphs. In Edsger Wybe Dijkstra: His Life, Work, and Legacy; Association for Computing Machinery: New York, NY, USA, 2022; pp. 287–290.
  46. Little, J.D.; Murty, K.G.; Sweeney, D.W.; Karel, C. An algorithm for the traveling salesman problem. Oper. Res. 1963, 11, 972–989.
  47. Dorigo, M.; Birattari, M.; Stützle, T. Ant colony optimization. IEEE Comput. Intell. Mag. 2006, 1, 28–39.
  48. Landrieu, L.; Simonovsky, M. Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4558–4567.
  49. Guinard, S.; Landrieu, L. Weakly supervised segmentation-aided classification of urban scenes from 3D LiDAR point clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 151–157.
  50. Landrieu, L.; Obozinski, G. Cut pursuit: Fast algorithms to learn piecewise constant functions on general weighted graphs. SIAM J. Imaging Sci. 2017, 10, 1724–1766.
  51. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
  52. Huang, S.; Gojcic, Z.; Usvyatsov, M.; Wieser, A.; Schindler, K. PREDATOR: Registration of 3D point clouds with low overlap. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 4267–4276.
Figure 1. Overview of the proposed fully automatic indoor surveying framework. The movable chassis collects a 2D model of the theater, which is then denoised; the TLS observation network determines the scan locations and plans the path between them in the unstructured scene. Keypoints are extracted from the point clouds, which are then aligned using registration algorithms. Finally, a complete point cloud model of the “Bianliang Yimeng” theater is created.
Figure 2. The flowchart of our method includes four parts: 2D model generation and denoising, TLS observation network planning, keypoint extraction, and registration.
Figure 3. Comparison of the denoised and original 2D models of the “Bianliang Yimeng” theater.
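The 2D-map denoising in Figure 3 can be illustrated with a toy majority filter on a binary occupancy grid. This is only a simplified stand-in (the paper builds on non-local denoising [44]); the neighbourhood size `k` and the 0.5 threshold are illustrative choices, not the paper's parameters:

```python
import numpy as np

def majority_filter(grid, k=1):
    """Suppress isolated noise cells in a binary occupancy map: a cell
    stays occupied only if most of its (2k+1)x(2k+1) neighbourhood is
    occupied. A simple stand-in for the paper's denoising step."""
    h, w = grid.shape
    out = np.zeros_like(grid)
    for i in range(h):
        for j in range(w):
            patch = grid[max(0, i - k):i + k + 1, max(0, j - k):j + k + 1]
            out[i, j] = 1 if patch.mean() > 0.5 else 0
    return out

noisy = np.zeros((5, 5), dtype=int)
noisy[2, 2] = 1                      # a single speckle cell
print(majority_filter(noisy).sum())  # 0: the speckle is removed
```

Isolated speckles disappear because they never dominate their own neighbourhood, while solid walls survive the vote.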
Figure 4. Initial scan station determined based on the SDF values.
Figure 5. The locations of the stations obtained after each SDF update, as well as the positions of all stations upon completion of the traversal. The symbol “*” denotes the current station’s position, while the symbol “x” denotes all stations’ positions.
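The SDF-guided choice of the initial station (Figure 4) can be sketched on a toy occupancy grid: compute each free cell's distance to the nearest obstacle and place the scanner at the maximum, i.e., the point deepest inside free space. This is a minimal illustration of the idea, not the paper's full iterative SDF update over successive stations (Figure 5):

```python
import numpy as np

def distance_to_obstacles(grid):
    """Distance (in cells) from each free cell to the nearest obstacle;
    obstacle cells get 0. Inside free space this equals the magnitude of
    the signed distance function. Brute force, fine for small grids."""
    obstacles = np.argwhere(grid == 1)
    dist = np.zeros(grid.shape)
    for idx in np.argwhere(grid == 0):
        dist[tuple(idx)] = np.min(np.linalg.norm(obstacles - idx, axis=1))
    return dist

# Toy 2D occupancy map: 1 = wall, 0 = free space.
grid = np.ones((7, 7), dtype=int)
grid[1:6, 1:6] = 0

d = distance_to_obstacles(grid)
station = tuple(int(i) for i in np.unravel_index(np.argmax(d), d.shape))
print(station)  # (3, 3): the free cell farthest from every wall
```

In the full pipeline the covered region around each placed station would be marked and the field recomputed before picking the next station.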
Figure 6. The relationship between the relaxed point and the obstacle orientation.
Figure 7. Planning the scan stations’ path. The symbol “x” denotes all stations’ positions.
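Once the station locations are fixed, visiting them along a short route is a travelling-salesman-style problem. A greedy nearest-neighbour ordering is a common baseline and is sketched below; the paper's planner, which draws on shortest-path and TSP techniques [45,46,47], may differ:

```python
import numpy as np

def nearest_neighbour_tour(stations, start=0):
    """Greedy TSP-style visiting order over scan stations: always move
    to the closest unvisited station. A baseline sketch only."""
    pts = np.asarray(stations, dtype=float)
    unvisited = set(range(len(pts)))
    tour = [start]
    unvisited.remove(start)
    while unvisited:
        last = pts[tour[-1]]
        nxt = min(unvisited, key=lambda i: np.linalg.norm(pts[i] - last))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

stations = [(0, 0), (5, 0), (1, 1), (6, 1)]
print(nearest_neighbour_tour(stations))  # [0, 2, 1, 3]
```

A real indoor planner would also replace the straight-line distances with obstacle-aware path lengths (e.g., Dijkstra on the occupancy grid) before ordering the stations.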
Figure 8. The pipeline of large-scale point cloud registration.
Figure 9. The hardware system physical diagram, viewed from top to bottom, consists of a terrestrial laser scanner, an adjustable lift, and a movable chassis.
Figure 10. The multi-station point cloud registration results for the “Bianliang Yimeng” theater were obtained through the automatic scanning and marker-free registration system. Blue lines represent the planned station paths and red stars indicate the station locations.
Figure 11. Comparison of denoising effects on the 2D model. The station count was reduced from 19 to 15, demonstrating the effectiveness of our noise suppression approach. The symbol “x” denotes all stations’ positions.
Figure 12. The number of scan stations and the scanning time under different repulsion coefficients: panels (a–d) correspond to repulsion coefficients of 1, 0.9, 0.7, and 0.5, respectively. In the figures, Planning Station Number (PSN) denotes the number of stations and Planning Station Time (PST) denotes the time. The pink regions denote the scanned areas, while the orange regions denote the unscanned areas.
Figure 13. When the repulsion coefficient is set to 0.9 and the minimum blind-angle radius is 0.8, the scanning process is able to cover the blind spots effectively. The pink regions denote the scanned areas, while the orange regions denote the unscanned areas.
Figure 14. The path planning results for scan stations with a repulsion coefficient of 0.9 and a minimum blind-angle radius of 0.8 are shown.
Figure 15. Comparison of coverage results for the MacEwan Student Center, University of Calgary: (a) the heuristic method of [8] needs 24 stations; (b) the method of [9] needs 23 stations; (c) our method needs only 16 stations.
Figure 16. The results of the geometric segmentation. (a) is the original point cloud. The colors in (b) are chosen randomly for each element of the partition.
Figure 17. The results of the outlier removal. Useless partitions are removed with the centroid-based useless-point removal method. The colors in (a,b) are chosen randomly for each element of the partition.
Figure 18. The results of the Fibonacci voxel downsampling. The overall structural information of the large-scale scene is preserved. The colors in (a,b) are chosen randomly for each element of the partition.
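The downsampling in Figure 18 can be illustrated with a plain voxel grid: keep one representative point per occupied voxel. Note this is a generic sketch only; the paper's Fibonacci variant presumably modifies how voxels or representatives are chosen, which is not reproduced here, and the centroid representative and 0.5 m voxel size are illustrative assumptions:

```python
import numpy as np

def voxel_downsample(points, voxel=0.5):
    """Plain voxel-grid downsampling: bucket points by voxel index and
    keep the centroid of each bucket. A generic sketch, not the paper's
    Fibonacci variant."""
    keys = np.floor(points / voxel).astype(int)
    buckets = {}
    for key, p in zip(map(tuple, keys), points):
        buckets.setdefault(key, []).append(p)
    return np.array([np.mean(v, axis=0) for v in buckets.values()])

pts = np.array([[0.1, 0.1, 0.0],
                [0.2, 0.2, 0.0],   # same voxel as the first point
                [1.4, 0.1, 0.0]])  # a different voxel
print(len(voxel_downsample(pts, 0.5)))  # 2: one point per occupied voxel
```

The coarse structure of the scene survives while the point count drops roughly in proportion to the voxel volume.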
Figure 19. Example visualizations of registration. Orange represents the source point cloud, purple the target (reference) cloud, and green the registered point cloud.
Table 1. In the 2D model of the “Bianliang Yimeng” theater, planning station time (PST) and planning path time (PPT) under different repulsion coefficients (RC) are shown.

RC     1.0       0.9       0.7       0.5
PST    10.60 s   12.87 s   59.14 s   153.24 s
PPT    11.09 s   13.50 s   63.72 s   165.10 s
Table 2. In the 2D model of the MacEwan Student Center, computation time compared with [9].

Method    [9] Method    Our Method
Time      150 s         34.8 s
Table 3. Accuracy comparison with different methods on the 3DMatch dataset. Numbers in bold indicate superior performance.

Method                                   RTE (cm)   RRE (°)   RR (%)
randomDownsample-ICP                     1.078      37.877    17.2
P2KE-ICP (Ours)                          1.063      37.265    18.4
randomDownsample-PointNetLK revisited    1.287      47.107    2.7
P2KE-PointNetLK revisited (Ours)         1.183      45.862    2.8
randomDownsample-REGTR                   0.070      2.010     78.2
P2KE-REGTR (Ours)                        0.060      2.256     81.1
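The RTE and RRE columns in Tables 3 and 5 follow the standard pairwise-registration error definitions; assuming the usual conventions (which the paper is presumed to share), they can be computed from an estimated and a ground-truth rigid transform as:

```python
import numpy as np

def registration_errors(R_est, t_est, R_gt, t_gt):
    """Standard pairwise registration metrics:
    RTE = ||t_est - t_gt||  (translation error),
    RRE = rotation angle of R_gt^T @ R_est, in degrees."""
    rte = np.linalg.norm(t_est - t_gt)
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    rre = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return rte, rre

# Identity ground truth vs a 90-degree rotation about z plus 3 cm on x.
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
rte, rre = registration_errors(Rz, np.array([0.03, 0.0, 0.0]),
                               np.eye(3), np.zeros(3))
print(round(rte, 2), round(rre, 1))  # 0.03 90.0
```

The registration recall (RR) column then counts the fraction of pairs whose RTE and RRE both fall under chosen thresholds.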
Table 4. Registration time results with different methods on the 3DMatch dataset. Numbers in bold indicate superior performance.

Method                                   T (s)
ICP                                      35.164
randomDownsample-ICP                     2.2316
P2KE-ICP (Ours)                          1.6303
REGTR                                    0.141
randomDownsample-REGTR                   0.0949
P2KE-REGTR (Ours)                        0.0918
PointNetLK revisited                     0.1749
randomDownsample-PointNetLK revisited    0.1626
P2KE-PointNetLK revisited (Ours)         0.1663
Table 5. Accuracy comparison with different methods on the Whu-TLS dataset. Numbers in bold indicate superior performance.

Method                                   RTE (cm)   RRE (°)   RR (%)
sift-REGTR                               0.234      64.905    20.6
randomDownsample-REGTR                   0.263      38.601    5.9
P2KE-REGTR (Ours)                        0.250      31.695    35.3
sift-pointNetLK revisited                0.135      3.22      23.5
randomDownsample-pointNetLK revisited    0.217      0.62      2.1
P2KE-pointNetLK revisited (Ours)         0.113      8.94      31.3
Li, W.; Jia, T.; Guo, S.; Zhou, Y.; Liu, Y.; Wang, H. FAIS: Fully Automatic Indoor Surveying Framework of Terrestrial Laser Scanning Point Clouds in Large-Scale Indoor Environments. Remote Sens. 2025, 17, 2863. https://doi.org/10.3390/rs17162863