Robust Extraction of 3D Line Segment Features from Unorganized Building Point Clouds

Abstract: As one of the most common features, 3D line segments provide visual information about scene surfaces and play an important role in many applications. However, due to the huge, unstructured, and non-uniform characteristics of building point clouds, 3D line segment extraction is a complicated task. This paper presents a novel method for the extraction of 3D line segment features from an unorganized building point cloud. Given the input point cloud, three steps are performed to extract 3D line segment features. Firstly, we perform data pre-processing, including subsampling, filtering, and projection. Secondly, a projection-based method is proposed to divide the input point cloud into vertical and horizontal planes. Finally, for each 3D plane, all points belonging to it are projected onto the fitting plane, and the α-shape algorithm is exploited to extract the boundary points of each plane. The 3D line segment structures are extracted from the boundary points, followed by a 3D line segment merging procedure. Corresponding experiments demonstrate that the proposed method works well on both high-quality TLS and low-quality RGB-D point clouds. Moreover, its robustness in the presence of a high degree of noise is also demonstrated. A comparison with state-of-the-art techniques shows that our method is considerably faster and scales significantly better than previous ones. To further verify the effectiveness of the line segments extracted by the proposed method, we also present a line-based registration framework, which employs the extracted 2D-projected line segments for coarse registration of building point clouds.


Introduction
In recent years, benefiting from advances in sensor technology, light detection and ranging (LiDAR) has developed rapidly. The increasing prevalence of 3D laser scanning equipment has resulted in a dramatic explosion in the availability of 3D point cloud data, and point clouds containing tens of millions of samples are now commonplace. However, due to the unstructured, irregular, and non-uniform characteristics of raw point cloud data, there is an ever-growing demand for concise and meaningful abstractions of point cloud data. As one of the most important features for human perception, 3D line segment structures are widely used and play an important role in many areas, such as building outline extraction [1], building model reconstruction [2,3], road extraction [4], registration [5], localization [6], calibration [7], and more. Over the past few decades, detecting line segments in 2D images has been well studied [8][9][10][11][12], while work on 3D line segment extraction is still insufficient. Moreover, unlike 2D images, whose pixels are closely related to their neighboring pixels, unorganized point clouds lack connectivity information. It is difficult to detect 3D line segments in unstructured, huge, and inhomogeneous point clouds, and many previous approaches fail at fast and accurate 3D line segment extraction from large-scale building point clouds.
A primary reason is that the definition of "line segment" is difficult to formalize with various rules because scenes in the real world are very diverse, and a perceptible line segment is actually relevant to complicated factors, such as sudden changes in curvature or color, sharp edge, plane intersection, etc. As pointed out in [13], most 3D line segment extraction methods consist of two main steps: (1) convert the input point cloud into images first, and then apply the line segment detector to extract 2D line segments on the images; (2) backproject these 2D line segments to the original coordinate system to obtain the final 3D line segments. However, these methods have high requirements on image quality and are very sensitive to noise. In this paper, we neither perform line segment extraction directly on the original point cloud nor consider strategies based on projection and back-projection. Instead, considering that line segments are closely related to planes, we propose a novel method for extraction of 3D line segment features from building point clouds based on planar features. The key idea of our method consists of three phases.
First, given the input point cloud, data pre-processing including subsampling, filtering, and projection is performed. Second, a projection-based method is proposed to divide the input point cloud into vertical and horizontal planes, which mainly includes three steps: collinear point detection, clustering, and interior point extraction. Finally, for each 3D plane, a robust plane fitting method is adopted to estimate its corresponding equation. Subsequently, the improved α-shape algorithm is exploited to extract boundary points of each plane. The 3D line segment structures are extracted from the boundary points.

Related Work
Three-dimensional line segment extraction is a long-standing yet active topic. In the past decades, numerous methods have been developed by researchers from different fields, the most important of which can be broadly classified into four categories: point feature-based, multi-view image-based, plane feature-based, and deep learning-based methods.
Point feature-based methods: Point feature-based methods usually detect boundary or contour points first, and then use fitting methods such as least squares to extract line segments. Various kinds of features, such as the Gauss map [14], curvature variation [15], the normal difference [16], and features based on eigenvalue decomposition [17], have been proposed for boundary or contour point recognition. In [14], given an unorganized point cloud, Gaussian graph clustering was first computed to eliminate points that are unlikely to belong to edges; in the second stage, a more precise iterative selection of the remaining feature points was performed to improve reliability and robustness in the presence of obtuse and acute angles. In [15], sharp edge features were detected by analyzing the eigenvalues of the covariance matrix defined by each point's k-nearest neighbors, and qualitative and quantitative evaluations were then performed on several dihedral angles and well-known examples of unorganized point clouds. In [16], a novel multi-scale operator for unorganized 3D point clouds, the difference of normals (DoN), was introduced. The normal of each point was calculated at different scales, and only those with a higher normal difference were selected as boundary points. Hackel et al. [17] predicted a contour score for each individual point with a binary classifier, using a set of features extracted from the point's neighborhood. The contour scores serve as a basis to construct an overcomplete graph of candidate contours. Jarvis [18] improved the convex hull formation algorithm to allow some concavities in the extracted shape. Zhang et al. [19] proposed the 3D guided multi-conditional GAN (3D-GMcGAN) to extract contour points of outdoor scene data consistent with visual perception. In their solution, first, a parametric space-based framework is proposed via a novel similarity measurement of two parametric models.
Second, a guided learning framework is designed to assist in finding the target contour distribution via an initial guidance. Chen and Yu [20] presented a novel method for the generation and regularization of feature lines, which consists of two main steps: extraction of the outline points according to the property of vector distribution and clustering, and regularization of the feature lines.
Deep learning-based methods: PointNet [30] opened a deep learning framework that operates directly on unstructured point clouds. PointNet++ [31] extended PointNet to local region computation by applying PointNet hierarchically in local regions, and many state-of-the-art methods have been developed since then [32]. Inspired by recent point cloud learning networks, Yu et al. [33] first proposed a contour point extraction method based on deep learning theory. The proposed framework was specially designed to manipulate points grouped in local patch sets, and trained to learn consolidated points. Subject to limited samples, deep learning-based methods for linear structure extraction can only extract the contour lines consistent with the labeled samples. However, real-world scenes are very complex, and limited manually labeled samples are not enough to identify diverse contour or edge points.
As discussed above, the drawback of point feature-based methods is that the extracted feature points are sensitive to noise, which easily leads to omission or misidentification. Methods based on multi-view images are easily affected by image quality and are better suited to point cloud data of simple scenes collected from a single view angle. The disadvantage of plane-based methods is that they are better suited to large planes and less effective for small ones. Deep learning-based methods are highly dependent on labeled samples. Since this paper mainly targets 3D line segment extraction from man-made buildings, which contain many large facades and horizontal planes, it is reasonable to adopt a plane-based method. However, plane segmentation is not a simple task, as the point clouds are unstructured and often massive. Numerous previous techniques are computationally expensive and do not scale well with the size of the datasets [34]. To overcome some defects of traditional plane segmentation algorithms, such as low efficiency, over-segmentation, under-segmentation, and uncertainty, a new plane segmentation framework for building point clouds is proposed. Subsequently, 3D line segments are extracted based on these planes, followed by a 3D line segment merging procedure.

Contribution
The main contributions of our work are summarized as follows: (1) A 3D line segment extraction method for building point clouds is proposed, which is conceptually simple and easy to implement. The proposed method only uses the original point cloud, and does not require the images or reconstructed models; (2) The multiple constraints in the algorithm filter out a lot of irrelevant debris, which means that the proposed method can restore detailed information more accurately, and greatly reduces the misidentification of 3D line segments; (3) The proposed method transforms the problem of structure line detection in 3D space into 2D space, and most operations are mainly carried out in 2D space, hence the algorithm proposed in this paper is very efficient; (4) The 3D line segment extraction method can be flexibly combined with other plane segmentation methods. The method can simultaneously detect 2D and 3D line segments without changing the model structure; (5) The proposed method works well in both high-quality TLS and low-quality RGB-D point clouds, and is robust to noise.
The rest of this paper is organized as follows: detailed algorithms of the proposed method are presented in Section 2. In Section 3, the experiments are described, experimental results are demonstrated, performance comparison is performed, and discussions are provided. Finally, conclusions and recommendations are provided in Section 4.

Materials and Methods
To overcome some of the above-listed deficits, a new 3D line segment extraction method is proposed, which is conceptually simple and easy to implement. The proposed method consists of three phases: data pre-processing, plane segmentation, and 3D line segment extraction (see Figure 1). Considering that man-made buildings contain a large number of facades and horizontal planes, the projections of these planes in specific directions are densely and linearly distributed. Therefore, once we obtain the exact positions of the projected line segments, the spatial geometric equations of the corresponding planes are also determined, and the plane interior points can be easily extracted using these equations. However, the large-scale point clouds acquired by the current variety of LiDAR systems also pose a great challenge for data processing. To improve operational efficiency, appropriate data cropping and subsampling are necessary. In the first phase, given the input point cloud, we perform data pre-processing, including subsampling, filtering, and projection. In the second phase, a projection-based method is proposed to divide the input point cloud into vertical and horizontal planes, which mainly includes three steps: collinear point detection, clustering, and interior point extraction. Finally, for each 3D plane, a robust plane fitting method based on Principal Component Analysis (PCA) is adopted to estimate its corresponding equation. Subsequently, the improved α-shape algorithm is exploited to extract the boundary points of each plane. The 3D line segment structures are extracted from the boundary points, followed by a 3D line segment merging procedure.
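The PCA-based plane fitting mentioned above can be sketched as follows. This is a minimal illustration rather than the paper's implementation, and the function name and interface are our own: the fitted plane passes through the centroid, with its normal taken as the eigenvector of the point covariance matrix associated with the smallest eigenvalue.

```python
import numpy as np

def fit_plane_pca(points):
    """Fit a plane to an (N, 3) point array via PCA.

    Returns (centroid, unit normal). The normal is the eigenvector of the
    covariance matrix with the smallest eigenvalue, i.e., the direction of
    least variance of the point set.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    # 3 x 3 symmetric covariance matrix of the centered points.
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]                  # smallest-eigenvalue direction
    return centroid, normal
```

The plane equation then follows directly: for centroid c and normal n, the plane is n · (p − c) = 0.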


Data Pre-Processing
Considering that the building point clouds contain a large number of plane structures, especially the vertical and horizontal planes, the paper proposes a projection-based plane segmentation method. Based on the fact that the projection of the facade on the X-Y plane and the projection of the horizontal plane on the X-Z plane are linearly distributed, we first performed the following data pre-processing steps before 3D structure line extraction (see Figure 2).


Figure 2. A demonstration of the raw data pre-processing flow. Point cloud is colored in the Z-axis direction.
Subsampling: Building point clouds are usually very dense and massive, and operating directly on the raw data is not only time-consuming but may also lead to computational failures. Proper subsampling is therefore necessary: it does not affect the linear distribution of the plane projections in the building, and the reduction in the amount of data greatly improves computational efficiency. Porig is the input point cloud. A subsampling threshold ε1 is set as the minimum distance between two points, and Porig is uniformly subsampled to ensure that the distance between any two points is not less than ε1. The subsampled point cloud is denoted Psub. As shown in Figure 3, although the number of points was drastically reduced from 27,484,853 to 44,541, the part of the facade projected onto the ground is still densely and linearly distributed, with an average point distance of 0.018 m.

Filtering: As shown in Figure 3, the horizontal planes projected onto the X-Y plane present a scattered distribution, which is not conducive to the subsequent line segment extraction. Therefore, the horizontal planes must be removed before facade extraction. Here, we propose a statistics-based plane removal method. As shown in Figure 4, taking a real indoor scene as an example, the sampled indoor point cloud contains four main horizontal planes. With a certain step size (e.g., 0.05 m), the distribution histogram of Psub along the Z direction is calculated. The histogram contains four main peaks, corresponding to the floor, bed, ceiling, and secondary ceiling in the Z direction (see Figure 5). To take advantage of this remarkable feature, the main horizontal planes can be roughly removed by eliminating points within a certain range of the peaks. The filtered point cloud is denoted Pfilt.

Projection: Next, Pfilt is projected onto the plane ax + by + cz + d = 0 via the standard point-to-plane projection

p' = p − ((a·xp + b·yp + c·zp + d) / (a² + b² + c²)) · (a, b, c)ᵀ,

where a = b = d = 0 and c = 1; that is, each point's z coordinate is simply set to zero. The output point cloud after pre-processing is denoted Pproj.
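The three pre-processing steps can be sketched as below. This is a simplified illustration under our own assumptions, not the paper's code: the voxel-grid pass only approximates uniform minimum-distance subsampling, and `peak_frac` and `band` are hypothetical parameters standing in for the paper's "certain range of the peaks".

```python
import numpy as np

def preprocess(P_orig, eps1=0.02, bin_size=0.05, peak_frac=0.1, band=0.05):
    """Subsample, filter horizontal planes via a Z histogram, and project.

    P_orig: (N, 3) array. Returns (P_sub, P_filt, P_proj).
    """
    # Subsampling: voxel grid of size eps1 approximates "min distance >= eps1".
    keys = np.floor(P_orig / eps1).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    P_sub = P_orig[np.sort(idx)]

    # Filtering: histogram of z with the given step; drop points within a
    # band around dominant peaks (assumed to be horizontal planes).
    z = P_sub[:, 2]
    edges = np.arange(z.min(), z.max() + bin_size, bin_size)
    hist, edges = np.histogram(z, bins=edges)
    peaks = edges[:-1][hist > peak_frac * hist.max()] + bin_size / 2
    keep = np.ones(len(P_sub), bool)
    for pz in peaks:
        keep &= np.abs(z - pz) > band
    P_filt = P_sub[keep]

    # Projection: with a = b = d = 0 and c = 1, the projection onto the
    # X-Y plane reduces to setting z to zero.
    P_proj = P_filt.copy()
    P_proj[:, 2] = 0.0
    return P_sub, P_filt, P_proj
```

On a synthetic scene with one vertical wall and a dense floor, the floor peak is detected and removed, while the wall survives and projects to a dense line on the X-Y plane.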

Plane Segmentation
Collinear point detection: The shadows of walls and other vertical planes in Pproj are distributed in dense linear patterns, so line segments are next extracted from Pproj. Although Pproj is 2D data (the z coordinate of each point is 0), it is still treated as a 3D point cloud during line segment extraction. Once the line segments are accurately detected, the corresponding vertical plane positions are also determined. In practice, the Random Sample Consensus (RANSAC) algorithm [35] is particularly effective for line detection in two respects: (1) it is easily extensible and straightforward to implement; (2) it can robustly deal with point clouds containing a lot of noise. As shown in Figure 6, RANSAC performs line detection iteratively by randomly choosing two points A (xa, ya, za) and B (xb, yb, zb), since at least two points are required to determine a line. The line L defined by A and B can be expressed as:

(x − xa)/(xb − xa) = (y − ya)/(yb − ya) = (z − za)/(zb − za) = t.

Suppose that a plane ψc passes through point P (xp, yp, zp) and takes the direction AB as its normal vector. C (xc, yc, zc) is the intersection of ψc and L. Each coordinate of C can be calculated as:

xc = xa + t(xb − xa), yc = ya + t(yb − ya), zc = za + t(zb − za).

Based on the fact that CP ⊥ AB, t can be expressed as:

t = [(xp − xa)(xb − xa) + (yp − ya)(yb − ya) + (zp − za)(zb − za)] / [(xb − xa)² + (yb − ya)² + (zb − za)²].

The distance dpc between P and C is calculated as:

dpc = √((xp − xc)² + (yp − yc)² + (zp − zc)²).

A distance threshold ε2 is set as the maximum offset of a line when performing the RANSAC algorithm, and ε2 is usually associated with the average point distance daver of Pproj. For each point qi in Pproj, a neighborhood search is performed to find its k nearest neighbors (denoted qi1, qi2, …, qik). daver is calculated as:

daver = (1 / (N·k)) Σi Σj ‖qi − qij‖,

where N is the number of points in Pproj. For efficiency, k is usually set to 2 based on previous experience. Then the number of points lying on the line defined by A and B is counted. The maximum number of iterations for detecting a line is set to ε3. After a given number of trials, a set of collinear points Pcol with a maximal score is extracted. However, as shown in Figure 7a, the RANSAC algorithm cannot be used directly in our application. First, some spurious lines are generated due to the random nature of RANSAC. Second, RANSAC can only distinguish collinear points: when facing points that are collinear but not continuous, it becomes powerless. Third, it is difficult to ascertain appropriate convergence conditions.
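A minimal sketch of RANSAC line detection on the projected points, using the perpendicular point-to-line distance derived above (here in its 2D form, since z = 0). The interface, iteration count, and seeding are illustrative; `eps2` plays the role of the threshold ε2.

```python
import numpy as np

def ransac_line(P, eps2, iters=500, seed=0):
    """RANSAC line detection on projected 2D points P (N x 2 array).

    Returns (inlier_mask, (A, B)) for the best-supported line, where support
    is the number of points within perpendicular distance eps2 of the line.
    """
    rng = np.random.default_rng(seed)
    best_mask, best_pair, best_score = None, None, -1
    for _ in range(iters):
        i, j = rng.choice(len(P), size=2, replace=False)
        A, B = P[i], P[j]
        d = B - A
        norm = np.linalg.norm(d)
        if norm < 1e-12:  # degenerate sample, skip
            continue
        # Perpendicular distance of every point to the line through A and B
        # (z component of the 2D cross product, divided by |AB|).
        dist = np.abs(d[0] * (P[:, 1] - A[1]) - d[1] * (P[:, 0] - A[0])) / norm
        mask = dist <= eps2
        score = int(mask.sum())
        if score > best_score:
            best_score, best_mask, best_pair = score, mask, (A, B)
    return best_mask, best_pair
```

Given 200 points near the line y = x mixed with 50 random outliers, the detector recovers nearly all of the line's points as inliers.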
Clustering: To overcome the deficits listed above, an adjusted Euclidean clustering method is adopted to split Pcol into separate segments (if necessary). A distance threshold ε4 is defined as the farthest distance at which two adjacent points are still considered to lie on the same line segment. Suppose that there are two subsets PI and PJ in Pcol, and PI ∩ PJ = Ø. The two subsets are considered to belong to different line segments if

min {‖p − q‖ : p ∈ PI, q ∈ PJ} > ε4.    (6)

The detailed clustering process is as follows: (1) A kd-tree structure is used to manage Pcol. An empty cluster list E and a pending queue Q are established, and each point in Pcol is added to Q; (2) For each point Qi in Q, a neighborhood search is performed, and the found points are stored in Qik. For each point in Qik, the distance dik between it and Qi is calculated; if dik ≤ ε4, the corresponding point is stored in E along with Qi; (3) The distances between all clusters in E are calculated according to Equation (6), and clusters less than ε4 apart are merged. The merging process iterates until all inter-cluster distances are greater than ε4; (4) If the clustering results are not empty, they are stored in Pseg. For each subset Pseg_i in Pseg, its length Lseg_i and point number Nseg_i are calculated. A threshold ε5 is set as the minimum number of points per meter, and a threshold ε6 as the shortest length of a qualified line segment. If Nseg_i/Lseg_i ≥ ε5 and Lseg_i ≥ ε6, Pseg_i is stored as a qualified line segment. All qualified line segments in Pseg are then removed from Pproj; (5) The remaining points are used as input data, and the next cycle continues. The iteration ceases when the qualified clustering results are empty for five consecutive times.
On the other hand, to make the program converge as quickly as possible, when the remaining points are less than a specified percentage (e.g., 5%) of the input data Pproj, the iteration is also forced to end. The extracted qualified line segments are stored in Pqual (see Figure 7b).
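The clustering and qualification tests can be sketched as follows. This is our own formulation under stated assumptions: SciPy's `cKDTree` stands in for the kd-tree, a union-find pass over proximity pairs is used as an equivalent implementation of the transitive merging in steps (1)-(3), and segment length is approximated by the bounding-box diagonal.

```python
import numpy as np
from scipy.spatial import cKDTree

def split_segments(P_col, eps4):
    """Split collinear points into contiguous segments: points within eps4
    of each other (transitively) end up in the same segment."""
    tree = cKDTree(P_col)
    parent = list(range(len(P_col)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Union-find over all point pairs closer than eps4.
    for a, b in tree.query_pairs(r=eps4):
        parent[find(a)] = find(b)
    labels = np.array([find(i) for i in range(len(P_col))])
    return [P_col[labels == lab] for lab in np.unique(labels)]

def is_qualified(segment, eps5, eps6):
    """Density/length test: at least eps5 points per meter and length >= eps6.
    Length is approximated by the segment's bounding-box diagonal."""
    length = np.linalg.norm(segment.max(axis=0) - segment.min(axis=0))
    return length >= eps6 and len(segment) / max(length, 1e-9) >= eps5
```

Two collinear runs of points separated by a gap larger than ε4 are split into two segments, each of which can then be tested for density and length.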

Interior point extraction:
For each qualified line segment Pqual i in Pqual, the endpoint Clustering: To overcome some of the deficits listed above, an adjusted Euclidean clustering method is adopted to split P col into separated segments (if necessary). A distance threshold ε 4 is defined as the farthest distance to determine whether two adjacent points lie on the same line segments. Suppose that there are two subsets P I and P J in P col , and P I ∩ P J = Ø. The two subsets are considered to belong to different line segments via the following condition: The detailed clustering process is as follows: (1) The kd-tree structure is used to manage P col . An empty cluster list E and a pending queue Q are established. Then each point in P col is added into Q; (2) For each point Q i in Q, a neighborhood search is performed on it and the searched points are stored in Q i k . For each point in Q i k , the distance d ik between it and Q i is calculated. If d ik ≤ ε 4 , the corresponding point will be stored in E along with Q i ; (3) The distances between all the clusters in E are calculated according to Equation (6), and the clusters that are less than ε 4 apart from each other are merged. The merging process iterates until all the distances between clusters are greater than ε 4 ; (4) If the clustering results are not empty, the results are stored in P seg . For each subset P seg i in P seg , its length L seg i and point number N seg i are calculated. A threshold ε 5 is set as the minimum number of points per meter. A threshold ε 6 is set as the shortest distance of a qualified line segment. If N seg i /L seg i ≥ ε 5 and L seg i ≥ ε 6 , P seg i is stored as a qualified line segment. Then all qualified line segments in P seg are removed from P proj ; (5) The remaining points are invoked as input data, and the next cycle is continued. The iteration ceases when the qualified clustering results are null for five consecutive times.
On the other hand, to ensure rapid convergence, the iteration is also forced to end when the remaining points number less than a specified percentage (e.g., 5%) of the input data P proj. The extracted qualified line segments are stored in P qual (see Figure 7b).

Interior point extraction:
For each qualified line segment P qual i in P qual, the endpoint coordinates are obtained as E (x s, y s, 0) and F (x e, y e, 0). The spatial geometric equation of the corresponding façade ψ v can be expressed as: The distance d iv between a spatial point and the façade ψ v defined above is calculated as: For each point in P orig, the distance d iv from it to ψ v is calculated according to Equation (8). If d iv ≤ ε 2, the corresponding point is stored as a coplanar point. To obtain more precise plane segmentation results, another two planes, ψ s and ψ e, are designed, both of which use the direction → EF as their normal vector and pass through the endpoints E (x s, y s, 0) and F (x e, y e, 0), respectively. The equations of ψ s and ψ e are defined as: For each point in the coplanar points P cop, the distances d is and d ie to ψ s and ψ e are calculated, respectively. If d is + d ie > d se, the corresponding point is removed from the coplanar points. Note that d se denotes the distance between ψ s and ψ e. The coplanar points after preliminary filtering are denoted as P filt. To avoid accidentally extracting points on the ground and ceiling, the minimum value Z min and the maximum value Z max of P orig in the Z-axis direction are calculated, and points whose z coordinates are greater than Z max − ε 2 or less than Z min + ε 2 are excluded from P filt. The final filtered coplanar points are denoted P vert and output as a vertical plane. All qualified line segments are used to extract their corresponding vertical planes (see Figure 8a). For efficiency, all points in the vertical planes are removed from P orig; P rem denotes the remaining points (see Figure 8b). As shown in Figure 8b, the remaining points are mainly composed of horizontal planes such as the floor, ceiling, etc. Similar to the steps above, after pre-processing, P rem is used to extract the horizontal planes P hor (see Figure 8c).
Note that P rem is projected onto the X-Z plane. Finally, the extracted planes are merged (see Figure 8d).

Three-Dimensional Line Segment Extraction
After plane segmentation, the α-shape algorithm is exploited to extract the boundary points of each plane. In general, it is a method of abstracting the edges of discrete point sets. Suppose there is a finite set of discrete points S; the principle can be imagined as a circle with a radius of ε 7 rolling around the outside of the point set S. If the circle passes through any two points in S and no other points lie inside it, the two points are considered to be boundary points [36]. The boundary points are extracted by iteratively detecting such circles. However, as shown in Figure 9a, the surface of the input data contains a lot of noise, such as a wall with a certain thickness. No matter how the parameters are adjusted, the α-shape algorithm cannot extract the boundary points of such point cloud data. Our purpose is to reliably extract 3D line segments from indoor point clouds even under adverse conditions such as heavy noise; therefore, before boundary point extraction, it is necessary to process the data to make it suitable for the α-shape algorithm. A robust plane fitting method based on Principal Component Analysis (PCA) is adopted.

Plane position determination:
P plane is one of the planes extracted above. The spatial plane equation of points in P plane is: where A, B, and C are the components of the unit normal vector of the plane and satisfy A 2 + B 2 + C 2 = 1. The distance d i from a point p i on P plane to the plane defined by Equation (11) can be expressed as: where x i, y i, and z i are the coordinate values of point p i. To obtain the best fitting plane, the objective function F 0 should meet the following conditions: Using the Lagrange multiplier method to solve for the minimum of the objective function F 0, the following function is first composed: where λ is the undetermined multiplier.
Taking the partial derivative of Equation (14) with respect to D and setting the result to 0, the following relationship can be obtained: Then the following relationship can be further obtained: Equation (12) can be rewritten as follows: Using Equation (14) to calculate the partial derivatives with respect to A, B, and C, respectively, the following relationship can be obtained: where The covariance matrix M 3×3 is constructed as follows: The covariance matrix M 3×3 is always symmetric positive semi-definite [37]. Based on this characteristic, → e 1, → e 2, and → e 3 are the three eigenvectors of M 3×3, and λ 1 ≥ λ 2 ≥ λ 3 are the three eigenvalues of M 3×3. M 3×3 can be decomposed as follows: In practice, → e 3 = (α, β, γ) is often used as the normal of P plane. However, the eigenvalue method directly uses all points in P plane for plane fitting, in which case the fitting parameters obtained are not optimal. To obtain more accurate fitting parameters, first, for each point in P plane, the distance d jp to the plane defined by → e 3 and the centroid p̄ is calculated, and the standard deviation σ is computed as: If d jp > 2σ, the corresponding points are removed from P plane, and the remaining points are used to recalculate → e 3 and p̄. For efficiency, the number of iterations is set to ε 8 empirically. After ε 8 iterations, the exact values of the plane normal components A, B, and C are obtained, and then the value of D is obtained using Equation (16).
As the accurate fitting parameters for each plane are obtained, the projection equation is provided by:

Contour point extraction: As shown in Figure 9b, after projection, a plane point cloud with random noise becomes very flat. After projecting each plane onto its fitted plane, the α-shape algorithm is used to extract the boundary points P edge of each plane (see Figure 10b). Since the average point distance of each plane is not always the same and may even differ greatly, ε 7 is associated with the average point distance of S to achieve a good edge extraction effect. A distance threshold ε 9 is set as the maximum offset of a line when performing RANSAC. The parameters remain unchanged except that ε 2 is replaced by ε 9.
Line segment extraction: The line segment detection method described above is then used to extract 3D line segments in P edge (see Figure 10c). As shown in Figure 11a, the extracted 3D line segments may contain line segments that are close to each other. To make the extraction results more concise, a 3D line segment merging procedure is proposed. First, the extracted 3D line segments are sorted in descending order according to their lengths. For each sorted segment, the angle θ ij between it and another line segment is calculated. → V i and → V j are the normal vectors of the two 3D line segments to be compared (the normal vector of each line segment is estimated by RANSAC in advance), and θ ij is calculated as:

θ ij = arccos( ( → V i · → V j ) / ( | → V i | | → V j | ) ) (23)

If θ ij < 5° or θ ij > 175°, the two segments are considered to be approximately parallel. Note that 5° is an empirical value for judging parallel lines. Once the parallelism of two 3D line segments has been checked, the corresponding pair is marked and not compared repeatedly.
Second, for each parallel pair, the orthographic distances from the two endpoints of the longer line segment to the other one are calculated. If both distance values are less than 0.05 m, the two segments are considered to be approximately collinear. Note that 0.05 m is an empirical value for judging collinear lines.
Finally, the endpoints of the two 3D line segments are checked to ascertain whether there is an overlapping part. Collinear segments that overlap are merged: if one line segment is shorter than the other, only the longer one is retained. The rest are output as the final 3D line segment extraction result (see Figure 11b).

Experiments and Analysis
All the algorithms in this paper are implemented using C++ and Point Cloud Library 1.12.1 and executed on a PC with an Intel Core i5-8300H 2.3 GHz CPU and 8 GB RAM running Windows 10.

Parameters Setting
As we can see from the sections above, there are generally two categories of parameters in our algorithm: number-related and distance-related. Specifically, nine parameters are involved in this paper. However, most of them are highly correlated or do not require major adjustment. Below, we provide guidelines on how to select these thresholds.
The subsampling threshold ε 1 represents the minimum distance between two points; that is, after a point cloud undergoes subsampling, the distance between any two points within it is not less than ε 1. Generally, ε 1 cannot be fixed to a single value; rather, it depends on the scale of the original point cloud data and the noise level. For point clouds with a relatively large height span (e.g., greater than 5 m), such as scene 1, ε 1 is set to 0.1 m; for point clouds with a relatively small height span (e.g., less than 5 m), such as scenes 2, 3, and 4, ε 1 is set to 0.05 m. ε 2 denotes the distance threshold used to determine whether a point lies on a straight line and is usually associated with the average point distance d aver of P proj. Here, we estimate ε 2 through statistical analysis, i.e., ε 2 is set to five times d aver; this value works well on a broad spectrum of datasets. ε 3 is the number of iterations when performing RANSAC, which is usually set to 100. ε 4 is defined as the farthest distance at which two adjacent points are considered to lie on the same line segment; for convenience, we simply set it to four times ε 2. ε 5 is set as the minimum number of points per meter, and ε 5 = 30 works for most scenes. ε 6 is set as the shortest length of a qualified line segment; ε 6 = 0.2 m is an appropriate value based on numerical experimentation. ε 7 denotes the radius used when performing edge point extraction and is set to 0.5 times ε 4. As for the remaining parameters, ε 8 is set to 5, and ε 9 is set to 10 times the average point spacing of P edge. It is worth mentioning that fine-tuning these parameters for different types of point clouds can lead to better results.

Three-Dimensional Line Segment Extraction Effect Evaluation
Considering that the plane segmentation involved in this paper mainly concerns horizontal planes and façades, in order to evaluate the performance of the proposed method, we first test it on the building point clouds of four indoor scenes. Outdoor scenes and other buildings with sloped faces will be discussed in subsequent subsections. As shown in Figure 12, scene 1 is a stair entrance and scene 2 is a corridor, both of which were obtained at the School of Surveying and Mapping, Wuhan University, using the terrestrial laser scanner Faro Focus 150. The two scenes are relatively representative, covering most common indoor structures, such as floors, walls, doors, windows, ceilings, stairs, handrails, and corners, as well as some sundries such as flower pots and trash cans. Scene 3 was collected in the laboratory on the first floor of Wuhan Haida Digital Cloud Co., Ltd., using the HS500i terrestrial 3D laser scanner developed by the company. It should be noted that the scanning accuracy of this scanner is poor: the surface of the collected point cloud has a lot of random noise, and the quality is not as good as that of the first two scenes. This scene can be used to verify the performance of the proposed algorithm in 3D line segment extraction under noise interference. Scene 4 is selected from a recreation room in the Stanford Large 3D Indoor Spatial Dataset (S3DIS), which was created by scanning 271 rooms in six areas with a Matterport camera to generate reconstructed 3D textured meshes and RGB-D images. Figure 13 shows the 3D line segment structures extracted by the proposed method. In order to display more detailed information, each scene is displayed from two different angles. It can be seen that both large frames (such as floors, ceilings, and walls) and details (such as doors, windows, and stairs) are recovered well, and the connections between line segments are also well-matched.
The multiple constraints in the algorithm filter out a lot of irrelevant debris, so that a clear and concise extraction result can be obtained. The extracted 3D line segments are consistent with the actual contours of the experimental point clouds. As shown in Figure 13g,h, since the original input data are relatively sparse, the distances between points in the extracted line segments are relatively large; if necessary, they can be interpolated based on cubic B-spline curves. In order to preserve the authenticity of the original data as much as possible, this paper does not fit or connect the extracted line segments, which results in the appearance of some "curve segments". Table 1 shows the calculation results of the proposed method on each dataset. The results show that the method successfully extracts the main 3D line segments within an acceptable time, and the data volumes of the four original point clouds are compressed to 0.00753, 0.00304, 0.00518, and 0.01027 times the original, respectively. Although the original point clouds are greatly compressed, the basic frame and detailed information of the scenes are well-preserved, which is undoubtedly beneficial for subsequent applications such as registration, localization, and object recognition. To further verify the accuracy of the 3D line segment extraction, the accuracy of some representative walls, doors, windows, and quadrangular prism structures in the four scenes is evaluated. The accuracy evaluation index adopted here is the extracted area divided by the measured area. As shown in Table 2, after comparing the results obtained by the proposed algorithm with the measured data, it can be found that the extraction error for the side lengths of structures such as doors, windows, and large walls is basically controlled within 5 cm.
The overall accuracy of the results is basically above 96%, which proves that the proposed algorithm is reliable in precision control. To further verify the performance of the proposed method, it is compared with the recent work of Lu et al. [25]. The comparison results are shown in Table 1 and Figures 13 and 14. As can be seen in Figures 13 and 14, our detection results are at least as good as those of Lu et al. [25]; both methods are able to extract the main frame of a building. Since the first three scenes are all collected by terrestrial 3D laser scanners, the point cloud density distribution is not uniform. The method of Lu et al. [25] uses region growing and region merging to segment the point clouds into 3D planes; these planes are then projected into corresponding 2D images, and line segments are extracted from the images and back-projected onto the planes. Such image-based methods are sensitive to noise and cannot adapt well to an uneven distribution of point cloud density. As can be seen from Figure 14, the method of Lu et al. [25] extracts some pseudo-line segments that do not exist, and the extracted 3D line segments are relatively cluttered, with large gaps between adjacent line segments. In contrast, the method proposed in this paper better restores 3D detail information and avoids many misidentifications. As shown in Table 1, the proposed method is also more efficient than that of Lu et al. [25]. This is because the projection-based plane segmentation strategy mainly processes the point cloud in 2D space, which can quickly and accurately obtain the positions of the façades and the horizontal planes, so as to realize rapid plane segmentation. In contrast, the method of Lu et al. [25] requires time-consuming operations such as normal vector estimation, region growing, and region merging, so its efficiency is not as high as that of the algorithm in this paper.
Furthermore, the method by Lu et al. [25] extracts more planes and lines than our method because it uses an over-segmentation strategy to over-segment the point cloud into many facets.
To demonstrate the effect of contour point extraction, the proposed method is compared with Bazazian et al. [15], Ioannou et al. [16], and methods based on Statistical Outlier Removal (SOR) filtering in PCL [38], respectively. As shown in Figure 15a,e,i,m, the contour points extracted by the proposed method are clear and complete, consistent with the basic frame of the building. As shown in Figure 15b,f,j,n, Bazazian et al. [15] detected sharp edge features by analyzing the eigenvalues of the covariance matrix defined by the k-nearest neighbors of each point. This method has a better detection effect for areas with large curvature changes such as plane intersections, but it cannot detect the edges of individual planes (see Figure 15b,f). This type of eigenvalue-based method is also particularly sensitive to noise: as mentioned above, the point cloud quality of scene 3 is not as good as that of the other three, and its surface is filled with a lot of random noise, so many points on the plane are detected by mistake (see Figure 15g). As shown in Figure 15c,g,k,o, Ioannou et al. [16] calculated the normal vector of each point at different scales, and those points whose normal vector differences are greater than the set threshold are regarded as contour points. Similar to the eigenvalue-based method, this method is highly recognizable for sharp edge points but also cannot detect the edge points of a single plane (see Figure 15c,g). However, the robustness of the method proposed by Ioannou et al. [16] is better than that of the eigenvalue-based one. As shown in Figure 15k, although the surface of the point cloud contains a lot of random noise and the density distribution is uneven, the main contour points are still accurately extracted.
In short, the method based on normal differences can extract the rough outline of the target object, but the extraction is not as fine as that of the proposed method. As shown in Figure 15d,h,l,p, the method based on SOR filtering has severe limitations: it requires high-quality point cloud data, and its threshold parameters are difficult to estimate accurately. Table 3 shows the results of the four methods on the four scenes. The proposed method achieves high efficiency and a high data compression ratio, and is suitable for fast and fine contour point extraction in large-scale, high-density building scenes.
Figure 14. Three-dimensional line segments extracted by the method of Lu et al. [25]. Each scene is shown from two angles.
To test the performance of the proposed method in outdoor scenes, we selected Birdfountain and Bildstein from the large-scale point cloud dataset Semantic3D for experiments. Before the test, the original point clouds were appropriately cropped to filter out irrelevant content. It should be noted that the Semantic3D dataset uses terrestrial laser scanners to scan scenes such as churches, streets, railway tracks, squares, villages, football fields, and castles, with over 4 billion semantically labeled points. As shown in Figure 16a,b, the scene Birdfountain is taken from a street and includes a street facade, sloping roofs, ground, trees, and a large number of outliers. The scene Bildstein is a church building with numerous closely spaced facades, sloping roofs, and some arcuate structures.
The plane segmentation algorithm presented in this paper is only suitable for elevation and horizontal planes; its advantages are high efficiency, good robustness, and fast processing of large-scale, high-density scene point clouds. Therefore, the proposed method is first used to extract the elevation and horizontal planes, and the improved RANSAC algorithm proposed by Chum et al. [35] is then used to detect the inclined plane and some smaller planes within the remaining point cloud (see Figure 16c,d). The contour points are then identified (see Figure 16e,f) and the 3D line segments are extracted (see Figure 16g,h). Table 4 shows the results on the two outdoor scenes.
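The residual-plane detection step can be illustrated with a plain RANSAC plane fit. This is a simplified sketch, not the locally optimized RANSAC of Chum et al. [35]: sample three points, hypothesize a plane, count the points within a distance threshold, and keep the best hypothesis.

```python
import numpy as np

def ransac_plane(points, dist_thresh=0.02, iters=500, rng=None):
    """Plain RANSAC plane fit (a simplified stand-in for [35]).
    Returns ((n, d), inlier_mask) with the plane n.x + d = 0,
    n a unit normal, and inliers within dist_thresh of the plane."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points, dtype=float)
    best_inl, best_model = None, None
    for _ in range(iters):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)      # normal of the sampled triangle
        norm = np.linalg.norm(n)
        if norm < 1e-12:                    # skip degenerate (collinear) samples
            continue
        n = n / norm
        d = -n @ p0
        inl = np.abs(pts @ n + d) < dist_thresh
        if best_inl is None or inl.sum() > best_inl.sum():
            best_inl, best_model = inl, (n, d)
    return best_model, best_inl
```

In practice the fit would be rerun on the remaining points after removing each detected plane's inliers; `dist_thresh` and `iters` are illustrative values, not parameters from the paper.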
It can be seen that the proposed method efficiently performs plane segmentation, contour point extraction, and 3D line segment extraction, greatly compressing the data while retaining the main structure of the building. To test the performance of the proposed method under different noise levels, a model named 101-prism from the public Digital Shape Repository dataset was selected for testing, as shown in Figure 17. The size of the model is 20.00 m × 19.02 m × 15.00 m; 475,250 points were uniformly sampled from the model, and Gaussian noise of 0.01 m, 0.03 m, and 0.05 m was added, respectively. The extraction results show that the proposed method is robust and can accurately extract the 3D line segment structure under heavy noise.
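The noise perturbation used in this experiment can be reproduced schematically as follows (a sketch only; sampling of the 101-prism mesh itself is not shown):

```python
import numpy as np

def add_gaussian_noise(points, sigma, rng=None):
    """Perturb each coordinate with zero-mean Gaussian noise of standard
    deviation sigma (same units as the cloud, metres here), mirroring the
    0.01 / 0.03 / 0.05 m noise levels used in the experiment."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points, dtype=float)
    return pts + rng.normal(0.0, sigma, size=pts.shape)
```

Each noise level is applied to a fresh copy of the clean samples, so the three test clouds differ only in their noise amplitude.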


Application
The 3D line segment extraction method proposed in this paper can be applied to the coarse registration of point clouds. Although terrestrial laser scanners can obtain fine 3D information on object surfaces, the point cloud collected from a single viewpoint is not enough to cover the entire scene, so point clouds collected from multiple viewpoints must be registered into the same coordinate system. The process includes two steps: coarse registration and fine registration. Fine registration usually adopts the iterative closest point (ICP) algorithm [39] or its variants [40,41]. At present, a large number of coarse registration methods exist [42-44]; they compute the transformation parameters by extracting different geometric features (points, lines, planes, and specific objects). In general, point-based approaches are the most general because they are applicable to a variety of scenes, while line- or plane-based methods are more robust to noise and variations in point density. For scenes containing man-made objects, which offer abundant line and plane features, extracting line/plane features for registration is a good choice. However, most existing line- or plane-based methods use 3D lines or 3D planes, which means they process the point cloud in 3D space. Since extracting 3D lines and planes is time-consuming, 2D-projected line segments are used here for coarse registration.
Let P_s be the point cloud to be registered and P_t the target point cloud; the purpose of registration is to transform P_s into the coordinate system of P_t. Since a leveling operation is carried out when P_s and P_t are collected by the terrestrial laser scanner, the coordinate transformation mainly involves a rotation and translation in the X-Y plane plus a translation along the Z axis. Plane intersections are common in building point clouds, and the projections of these planes onto the X-Y plane form intersecting line segments. The extraction of 2D-projected line segments has been fully described earlier in this paper and is not repeated here.
After the projected line segments of P_s and P_t on the X-Y plane are extracted, the corresponding line segment pairs are identified (there may be more than one pair). Although the paired segments consist of different points and may not actually intersect (due to occlusion and parameter settings), their relative relationships remain unchanged; moreover, because the parameter settings at different stations are consistent, there is no scaling between the point clouds. Based on these characteristics, the 2D-projected line segments can be used for registration in the X-Y plane.
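Finding the intersection points of a corresponding segment pair reduces to intersecting their supporting lines. A minimal sketch in slope-intercept form, matching the line parameters a_1, b_1 used in the next step (the helper and its tolerance are illustrative):

```python
def line_intersection(a1, b1, a2, b2, eps=1e-12):
    """Intersection of the lines y = a1*x + b1 and y = a2*x + b2.
    Returns (x, y), or None if the lines are (near-)parallel."""
    if abs(a1 - a2) < eps:
        return None                      # parallel: no unique intersection
    x = (b2 - b1) / (a1 - a2)            # equate the two line equations
    return x, a1 * x + b1
```

Vertical lines would need the general form ax + by + c = 0; the slope-intercept form is kept here for consistency with the derivation that follows.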
Suppose the intersection point of two intersecting lines is p_inters(c_1, c_2), and p(x, y) is a virtual point on one of the lines at distance r from the intersection. The following relationship can be obtained:

y = a_1 x + b_1,    (x − c_1)² + (y − c_2)² = r²,    (24)

where a_1 and b_1 are the line parameters in the rectangular coordinate system. Substituting the first term of Equation (24) into the second term, the value of x can be obtained as:

x = c_1 ± r / √(1 + a_1²),    (25)

which yields two candidate virtual points. The distances d_1 and d_2 between these two virtual points and the segment centroid p_cent are calculated, and the virtual point with the smaller distance is the desired one.
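The two candidates of Equation (25) and the centroid-based disambiguation can be sketched as follows (a hypothetical numpy helper, with argument names matching the symbols above):

```python
import numpy as np

def virtual_point(a1, b1, c1, c2, r, p_cent):
    """Two candidate points on the line y = a1*x + b1 at distance r from
    the intersection (c1, c2), per x = c1 +/- r / sqrt(1 + a1^2); the
    candidate nearer the segment centroid p_cent is kept, as in the text."""
    dx = r / np.hypot(1.0, a1)                       # r / sqrt(1 + a1^2)
    cands = np.array([[c1 + dx, a1 * (c1 + dx) + b1],
                      [c1 - dx, a1 * (c1 - dx) + b1]])
    d = np.linalg.norm(cands - np.asarray(p_cent, float), axis=1)
    return cands[np.argmin(d)]                       # the closer virtual point
```

Since the intersection lies on the line (c_2 = a_1 c_1 + b_1), both candidates are exactly at distance r from p_inters.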
As shown in Figure 18, each intersection point yields two additional virtual points, so the rotation and translation parameters of the X-Y plane can be established from these three pairs of homonymous points. There may be multiple line segment pairs, i.e., more than three pairs of homonymous points. To solve the coordinate transformation in the presence of multiple pairs of homonymous points, let p_t^i and p_s^i denote the homonymous point sets of P_t and P_s, respectively, R the rotation matrix, and T the translation vector. The following relationship can be obtained:

p_t^i = R p_s^i + T.    (26)

Considering the influence of errors, the transformation parameters are solved by minimizing an error function. The total error of the coordinate transformation can be expressed as:

Δ = Σ_{i=1..n} ‖ p_t^i − (R p_s^i + T) ‖²,    (27)

where n is the number of homonymous point pairs. Let p_cent^t and p_cent^s be the centroids of p_t^i and p_s^i:

p_cent^t = (1/n) Σ_{i=1..n} p_t^i,    p_cent^s = (1/n) Σ_{i=1..n} p_s^i.    (28)

With the centralized coordinates q_t^i = p_t^i − p_cent^t and q_s^i = p_s^i − p_cent^s, the total error after centralization can be expressed as:

Δ = Σ_{i=1..n} ‖ q_t^i − R q_s^i ‖².    (29)

When Δ is minimal, the desired rotation matrix R and translation vector T are obtained. Expanding Δ and discarding the terms that do not depend on R shows that minimizing Δ is equivalent to maximizing

Δ' = Σ_{i=1..n} (q_t^i)^T R q_s^i = trace(R H),    (30)

where

H = Σ_{i=1..n} q_s^i (q_t^i)^T.    (31)

By performing singular value decomposition on H,

H = U Λ V^T,    (32)

the rotation matrix R can be expressed as:

R = V U^T.    (33)

After R is calculated, the translation vector T can be obtained from Equation (26):

T = p_cent^t − R p_cent^s.    (34)

The transformation matrix A between the two coordinate systems is then

A = [ R  T ; 0  1 ].    (35)

Finally, the translation ΔZ of P_s relative to P_t along the Z axis is calculated. A simple way is to compute the average elevations h_1 and h_2 of the flat ground of P_s and P_t; the translation can then be taken as the difference between h_1 and h_2.
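The closed-form steps above (centroid removal, the matrix H, its SVD, R = V U^T, and T from Equation (26)) admit a compact numpy sketch; the reflection guard is a standard safeguard added here, not part of the paper's derivation:

```python
import numpy as np

def rigid_transform_2d(src, dst):
    """Least-squares rotation R and translation T with dst_i ~ R src_i + T
    (cf. Equation (26)), via the SVD of H = sum q_s q_t^T. src, dst: (n, 2)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    cs, ct = src.mean(0), dst.mean(0)          # centroids p_cent_s, p_cent_t
    H = (src - cs).T @ (dst - ct)              # H = sum_i q_s^i (q_t^i)^T
    U, S, Vt = np.linalg.svd(H)                # H = U Lambda V^T
    R = Vt.T @ U.T                             # R = V U^T
    if np.linalg.det(R) < 0:                   # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    T = ct - R @ cs                            # T = p_cent_t - R p_cent_s
    return R, T
```

With exact correspondences the recovered R and T reproduce the simulated transform to machine precision; with noisy homonymous points the result is the least-squares optimum of the centralized error.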
As shown in Figure 19, a Faro Focus S 150 laser scanner was used to collect point clouds from two different viewpoints in the library square of the Faculty of Information Science, Wuhan University. The scanner was set up on a flat cement floor in front of the library, and the two stations contain 15,906,154 and 15,782,158 points, respectively. As shown in Figure 20, Area1 (red) and Area2 (blue) are selected from the two stations; the purpose of registration is to transform Area2 into the coordinate system of Area1. Area1 and Area2 are subsampled (subsampling parameter 0.03 m) and projected onto the X-Y plane, the 2D-projected line segments are extracted (Figure 21), and five pairs of intersection points are found. Two pairs of homonymous points are added for each pair of intersection points (with r set to 1 m), giving 15 pairs of homonymous points in total. The coordinate transformation matrix calculated from these 15 pairs is given in Equation (36), which transforms the coordinates of P_s so that P_s is aligned with P_t on the X-Y plane. As shown in Figures 22 and 23a, after the coordinate transformation, the point clouds are well aligned on the X-Y plane.
However, as shown in Figure 23b, the two registered point clouds still have an elevation deviation in the Z direction. The average ground elevation near the scanner is −175.084 m for P_t and −180.901 m for P_s, so the final registration result is obtained by shifting P_s upward by 5.817 m (see Figure 24). Table 5 shows the coordinate deviations of the 15 homonymous points after registration: the maximum deviation in the X direction is less than 3 mm, the maximum deviation in the Y direction is less than 2.5 mm, and the maximum displacement deviation is less than 3.5 mm.

Conclusions
In this paper, we presented an efficient 3D line segment extraction method for building point clouds. The raw point clouds are first segmented into vertical and horizontal planes via a projection-based plane segmentation method. Since most operations are performed in 2D space and no normal estimation is required, the proposed plane segmentation method is very efficient compared with traditional segmentation algorithms such as region growing and RANSAC. The extracted planes are then projected onto their corresponding precisely fitted planes, and the boundary points of each flat plane are extracted by an adaptive α-shape algorithm. Finally, the proposed 3D line segment detection method, based on RANSAC and Euclidean clustering, is employed to extract 3D line segments. Compared with other line detection techniques such as the Hough transform and CannyLines, the proposed method can accurately extract both 2D and 3D line segments without model adjustment. The experiments were performed on raw point clouds of complex real-world scenes acquired by terrestrial laser scanners and structured-light sensors. An image-based 3D line segment detection algorithm was compared with the proposed method; the comparison shows that the proposed method restores details better and recovers structural line segments more concisely. The comparative experiments on boundary point extraction show that the method is superior to point-based methods, and its performance on synthetic point clouds with different levels of Gaussian noise shows that it is robust to noise. Moreover, the proposed framework produces minimal outliers and pseudo-lines, as suggested by comparisons with state-of-the-art approaches. In addition, the proposed method can be further optimized by parallel processing. We believe that our 3D line segment extraction method can be applied to many fields.
As an example, we employed the extracted 2D-projected line segments to solve the building point cloud registration problem. The proposed line-based registration method is reliable and efficient, and is especially suitable for man-made structures, which contain a large number of line and plane features.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.