Automatic Extraction of Indoor Structural Information from Point Clouds

Abstract: We propose a novel method to automatically extract building interior structure information, including ceilings, floors, and walls. Our approach outperforms previous methods in the following respects. First, we propose an approach based on principal component analysis (PCA) to find the ground plane, which is taken as the new Cartesian reference plane. Second, to reduce the complexity of data processing, the data are projected into two dimensions and transformed into a binary image via an improved radius outlier removal (ROR) filter. Third, a traditional thinning algorithm is adopted to extract the image skeleton. We then propose a method for calculating the slope at each point through its nearest neighbor point, and lines are represented by these slopes to obtain information about the interior planes. Finally, the extracted outline is restored to a three-dimensional structure. The proposed method is evaluated in multiple scenarios, and the results show that it is accurate in indoor environments (the maximum error across three scenarios was 0.03 m).


Introduction
In recent years, LiDAR (light detection and ranging) technology has developed rapidly due to its high accuracy, low cost, portability, and wide range of applications, such as autonomous driving [1][2][3][4][5], military fields [6][7][8][9], aerospace [10,11], and three-dimensional (3D) reconstruction [12][13][14]. In terms of 3D modeling, high-precision and high-density point cloud data provided by LiDAR can accurately restore the surface model of an object, which could be a tree trunk [15], a geological landform [16], or a building [17,18]. With the maturation of indoor navigation [19] technology, it is important to obtain precise building interior structures from the point cloud for accurate navigation. However, it is difficult and time consuming to extract indoor structures from disordered and high-density [20] point clouds, and the complexity of data processing is greatly increased by the noise introduced by the algorithm and the scanning environment. Therefore, completing interior reconstruction is challenging under the influence of these negative factors.
To reconstruct the interior structure accurately, researchers have proposed advanced ideas [21,22] and frameworks. First of all, the raw point cloud data mainly come from a laser scanner. Valero et al. [23] used a laser scanner to obtain clean point clouds and applied different algorithms to identify different objects. The point cloud from a laser scanner is precise, and the points that make up a plane project vertically onto a straight line, which greatly simplifies subsequent processing. However, a laser scanner is expensive and takes longer to measure. We therefore use LiDAR, because data from cheap and lightweight LiDAR are easily available and convenient. In large-scale scenes, the data measured by LiDAR are integrated by a scan registration algorithm. In our method, we project the wall data onto the ground found in the previous step and convert them into images rather than slicing the point cloud. To remove sparse noise and improve filtering speed, we optimize radius filtering without changing its effect. Next, the coordinates are enlarged and meshed to convert them into pixels. After conversion to an image, each line is several pixels wide, which is not conducive to extracting the structure, so we extract the image skeleton using a thinning algorithm. Saeed et al. [44] proposed an image-thinning algorithm that extracts image skeletons effectively; however, it is not sensitive to corner points. Zhang and Suen [45] proposed a method that quickly and accurately extracts an image skeleton and preserves corner information by setting several constraint conditions on the relationship between a target pixel and its eight-neighborhood pixels. Therefore, the latter method [45] is more suitable for our data. Due to errors in scan registration, the intersection of two lines produces an arc instead of a right angle.
Corner detection in our data is difficult with traditional corner detection operators such as the Harris operator [46] and the SUSAN operator [47]. Chen et al. [48] proposed using the curvature of each point as a constraint to screen corner points, but this does not locate corners very accurately. By observation, we find that the intersections of adjacent lines can be used as the exact corner points. As such, we propose a slope-based judgment method for straight lines and select corners from the intersection points of those lines. Finally, we restore the three-dimensional model according to the corner points and corresponding lines. The entire flowchart is shown in Figure 1.
The rest of this paper is organized as follows. Section 2.1 presents the principle of PCA and the framework that separates the walls based on PCA. In Section 2.2, we introduce how we transform the data into images, including point cloud projection, improved radius filtering, and data regularization. Section 2.3 presents the principle of the slope-based algorithm and some detailed processing and optimization. In Section 2.4, we organize the expression of straight lines and restore the three-dimensional structure of the interior environment. In Section 3.1, we introduce our equipment and the data extraction process. In Section 3.2, we use three different scenarios to verify our method and give the relevant evaluation parameters of the three experimental results. Section 4 presents our discussion of the experimental results, including the advantages, disadvantages, and possible subsequent optimization of the experimental method. Finally, we summarize the whole paper and look ahead to future work in Section 5.

Methods
In this section, we detail the principle of our approach in four steps: (1) A wall extraction method based on PCA; the data are divided into three sections: floor, ceiling, and other data. (2) Conversion from 3D to 2D data, including projection, filtering, and voxelization. (3) Extraction of line segments based on the image; the image is skeletonized and we propose a line segment-extraction algorithm based on the slope of adjacent points. (4) Reconstruction of the indoor structure; we reconstruct the interior structure using linear information.

Extraction of Wall Based on PCA
PCA can be used to extract the principal components of data, reduce the dimensionality of data, or fit planes and lines. In this section, we use PCA to find the floor and ceiling in the data and to solve the problem that the floor may not coincide with the x-y plane of the x-y-z frame. The eigenvector v = (v_x, v_y, v_z) associated with the smallest eigenvalue is taken as the normal vector to the fitted plane. Finally, we project the raw 3D data into 2D data by a matrix operation. Our first step is finding the plane based on PCA; the main flow of this method is shown in Figure 2 and Algorithm 2.
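The PCA plane fit and 2D projection described above can be sketched as follows. This is an illustrative implementation under standard PCA assumptions, not the authors' code; the function and variable names are our own.

```python
import numpy as np

def fit_plane_pca(points):
    """Fit a plane to an (N, 3) point array with PCA.

    The plane passes through the centroid, and its normal is the
    eigenvector of the covariance matrix with the smallest eigenvalue
    (the direction of least variance).
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]                  # smallest-variance direction
    return centroid, normal

def project_to_plane_2d(points, centroid, normal):
    """Project 3D points into 2D coordinates on the fitted plane,
    using the two largest-variance eigenvectors as in-plane axes."""
    centered = points - centroid
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, 1:]                  # two in-plane directions
    return centered @ basis                 # (N, 2) plane coordinates
```

The same eigen-decomposition yields both the plane normal (for segmentation) and the in-plane basis (for the 2D projection used later).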
The raw data S_r are used as the input to Algorithm 2. Figure 2. Wall extraction method based on PCA: PCA is used to fit a plane to the raw data, and the data are divided into two parts by the plane: ceiling data and ground data. The two parts are projected into two-dimensional data by PCA, and the grid threshold method is used to find the feature points of the ground and ceiling. PCA is used to fit planes to the feature points to obtain the levels of the ground and ceiling. Finally, thresholds are set for the two planes and the point clouds of the floor and ceiling are removed.
Some edge problems are caused by rounding the grid indices of the points. Therefore, we take the following measures: after computing the grid index of each point, we replace 0 with 1 and replace indices larger than n_grid with n_grid. The points in cells whose count passes the grid threshold are taken as the plane points. However, due to the interference of noise, we use the SOR filter to obtain the exact plane points. These points are fitted into planes by PCA, with P_1 and P_2 denoting the ceiling and ground planes. Since the ceiling and floor are parallel, we take the average of the normal vectors of the two planes as our new normal vector.

Finally, we set thresholds threshold_1 and threshold_2 on the distance from each point to the two planes to determine the ceiling S_c and the floor S_g; the wall S_w is the remaining data, i.e., S_r excluding S_c and S_g. The entire process is shown in Table 1. Table 1. The process of extracting the wall: project the initial data onto a plane; screen the ground data by the grid threshold method; restore the ground to three-dimensional form and treat it with the SOR filter; fit the ground and ceiling into planes; then extract the wall data by setting thresholds.
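The threshold-based split into ceiling, floor, and wall can be sketched as follows. This is a minimal illustration assuming each plane is given as a unit normal n and offset d (plane equation n·x + d = 0); the function name and threshold values are hypothetical.

```python
import numpy as np

def split_ceiling_floor_wall(points, plane_c, plane_g, t1=0.1, t2=0.1):
    """Classify (N, 3) points by distance to the ceiling plane plane_c
    and ground plane plane_g, each given as (unit_normal, offset).
    Points within t1 of the ceiling form S_c, points within t2 of the
    ground form S_g, and everything else is wall data S_w."""
    n_c, d_c = plane_c
    n_g, d_g = plane_g
    dist_c = np.abs(points @ n_c + d_c)   # point-to-ceiling distance
    dist_g = np.abs(points @ n_g + d_g)   # point-to-ground distance
    is_ceiling = dist_c < t1
    is_ground = dist_g < t2
    S_c = points[is_ceiling]
    S_g = points[is_ground & ~is_ceiling]
    S_w = points[~is_ceiling & ~is_ground]
    return S_c, S_g, S_w
```

Because the normals are unit vectors, n·x + d is the signed point-to-plane distance, so the classification is a single vectorized comparison.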

Ceiling / Ground: raw data → 2D projection → grid threshold method → 3D recovery → SOR filter → plane fitting → extraction of the wall.

Data Format Conversion
The point cloud data are so dense and disordered that it is difficult to extract information from them. However, point cloud slicing leads to the loss of the contour, which increases the difficulty of subsequent processing. Therefore, to improve the efficiency of the algorithm, we projected the wall onto the ground and meshed it. The entire process is shown in Figure 5 and Algorithm 3.
Pipeline: wall data → projection of the 2D wall onto the ground plane → grid radius outlier removal filter → coordinate expansion → data normalization → binary image. Figure 5. The process of data conversion. The wall data are projected onto the ground plane fitted in the previous section. The two-dimensional data are processed by a grid radius outlier removal (GROR) filter, and the coordinate values are enlarged by an appropriate multiple. Finally, the data are normalized and transformed into a binary image.
The wall data S_w are used as the input to Algorithm 3 and are projected into S_p1 on the ground plane. Due to the influence of noise, the points of a line have a certain width, and we must extract line features from these points. First, we need to remove the noise from the two-dimensional points, as shown in Figure 6. Figure 6. The noise in the points: we need to extract the line l and remove the noise.
The radius outlier removal (ROR) [49] filter is an effective tool for noise removal, but without optimization it is very inefficient. Therefore, we use the GROR (grid radius outlier removal) filter to optimize the ROR filter: we mesh the data and then search for the nearest neighbor points in a specific way. The results show that the efficiency is greatly improved without affecting the accuracy of the ROR filter. At the same time, because the projected points of a vertical wall are dense while other clutter points are sparse, some clutter points are also removed by the GROR filter. The principle of this method is shown in Figure 7.
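One plausible reading of the GROR idea is to bin points into grid cells of the search radius, so that neighbor counting only has to examine the 3×3 surrounding cells instead of all points. The sketch below follows that reading; the function name and parameters are our own, not from the paper.

```python
import numpy as np
from collections import defaultdict

def gror_filter(points, radius, min_neighbors):
    """Grid-accelerated radius outlier removal for (N, 2) points.

    Points are binned into square cells of side `radius`; a point's
    neighbors within `radius` can then only lie in its own cell or the
    eight adjacent cells. Points with fewer than `min_neighbors`
    neighbors are removed."""
    cells = defaultdict(list)
    idx = np.floor(points / radius).astype(int)
    for i, (cx, cy) in enumerate(idx):
        cells[(cx, cy)].append(i)
    r2 = radius * radius
    keep = np.zeros(len(points), dtype=bool)
    for i, (cx, cy) in enumerate(idx):
        count = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j == i:
                        continue
                    d = points[j] - points[i]
                    if d @ d <= r2:
                        count += 1
        keep[i] = count >= min_neighbors
    return points[keep]
```

The result is identical to a brute-force ROR filter, but each query touches only nearby cells, which is what makes the grid variant fast on dense wall projections.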
Since the mesh size n_grid × n_grid is fixed, edge suppression is added at the boundary cells. Finally, points whose number of neighbors is less than threshold_3 are removed. After removing the noise, we need to expand the coordinates of the points S_p2. We choose a large multiple n_a to make the details of the data more obvious: for each point (x, y), we subtract the minimum values (min_x, min_y) to make sure the coordinates are positive, and then multiply them by n_a. Since the coordinates are rounded after expansion and restored after processing, a larger multiple preserves more detail and reduces the rounding error. The error before and after this treatment was less than 0.01.
Then, we use the grid subsampling method to round the coordinates and eliminate repeated coordinates at the same position. The benefit of this method is that the data become regular and the amount of data is greatly reduced; the principle is shown in Figure 8. For each point (x, y) of S, we replace 0 with 1. Since the coordinates of the data S_p are now all positive integers, the data can be converted into an image to prepare for the next step; this process is shown in Figure 9. The pixel at the position corresponding to each point of S_p in the image S_image is set to a value that contrasts with the background; for example, the foreground value is 0 when the background is 1.
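The expansion, rounding, deduplication, and rasterization steps above can be sketched together. This is an illustrative composition of those steps under our own naming; the scale of 70 matches the value reported later in the section.

```python
import numpy as np

def points_to_binary_image(points, scale=70):
    """Convert (N, 2) float points into a binary image.

    Points are shifted to be non-negative, scaled, rounded to integer
    pixel coordinates (0 replaced by 1, as in the text), and
    deduplicated (grid subsampling). The image uses background = 1
    and foreground = 0."""
    shifted = points - points.min(axis=0)          # make coordinates positive
    pix = np.rint(shifted * scale).astype(int)
    pix[pix == 0] = 1                              # replace 0 with 1
    pix = np.unique(pix, axis=0)                   # grid subsampling: drop duplicates
    h, w = pix[:, 1].max() + 1, pix[:, 0].max() + 1
    image = np.ones((h, w), dtype=np.uint8)        # background = 1
    image[pix[:, 1], pix[:, 0]] = 0                # foreground = 0
    return image, pix
```

Keeping both outputs mirrors the text: the deduplicated integer points retain the grid structure, while the image feeds the thinning step that follows.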
Therefore, we obtain two sets of data: two-dimensional points with a grid structure and the corresponding image of those points. Together, they retain the shape of the raw data while regularizing the scattered points. The experimental procedure in this section is shown in Table 2. Table 2. Projection and transformation of data. First, the raw data are projected onto the ground plane. Then, the noise is removed by the GROR filter. We enlarge the coordinates 70 times, and the grid subsampling method is used to make the data regular. The regular data can then be converted into an image, which is prepared for the next section.

Result Details
raw data projected onto the ground → GROR filter → grid subsampling method (the coordinates are magnified 70 times) → conversion to the image

Extraction of Line Segments
For the image from the previous section, we need to extract line segment information. Since the straight lines are composed of points with a certain thickness, it is difficult to extract multiple lines directly from the data. Therefore, we first use the thinning algorithm to extract the skeleton of the image. Then, the slope at each point is calculated from its nearest neighbor point. Finally, we propose a slope-based search method to extract the line segments. The flow of the algorithm is shown in Figure 10 and Algorithm 4. The image S_image is the input to Algorithm 4. We assume that the background is 0 and the points are 1 in the image, and we convolve S_image with a kernel to remove burrs, which interfere with skeleton extraction. The image S_image is then updated from the convolution values S_con: each pixel value is updated by comparing its convolution value with the number 4, as shown in Figure 11. Figure 11. A simple method to remove burrs: we convolve the image and update the pixels based on the convolution values.
To better reflect the contour features of the wall, we refine the image. If each line is thinned to a single pixel, its features become more obvious in the skeleton. Therefore, we selected a suitable thinning algorithm through experiments to extract the two-dimensional skeleton information of the wall. The results are shown in Table 3. Table 3. Image skeleton extraction. The contour of the image is extracted by the skeleton extraction algorithm and lines are refined into single pixels. Similarly, the skeleton made up of two-dimensional points is recovered from the image skeleton.
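The Zhang-Suen thinning method [45] chosen in the text is a standard algorithm; a textbook implementation (not the authors' code) looks like this:

```python
import numpy as np

def zhang_suen_thin(img):
    """Zhang-Suen thinning: iteratively peel boundary pixels from a
    2D 0/1 array (foreground = 1) until the skeleton is one pixel wide.

    Neighbors are ordered P2..P9 clockwise starting from north; b is the
    neighbor count and a the number of 0->1 transitions around the pixel."""
    img = img.copy().astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):                      # the two sub-iterations
            to_delete = []
            for y in range(1, img.shape[0] - 1):
                for x in range(1, img.shape[1] - 1):
                    if img[y, x] != 1:
                        continue
                    p = [img[y-1, x], img[y-1, x+1], img[y, x+1], img[y+1, x+1],
                         img[y+1, x], img[y+1, x-1], img[y, x-1], img[y-1, x-1]]
                    b = sum(p)
                    a = sum((p[i] == 0) and (p[(i + 1) % 8] == 1) for i in range(8))
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0:
                        if p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0:
                            to_delete.append((y, x))
                    else:
                        if p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0:
                            to_delete.append((y, x))
            for (y, x) in to_delete:             # delete after the full scan
                img[y, x] = 0
                changed = True
    return img
```

Deleting pixels only after each full scan is what preserves connectivity and corner information, which is why the text prefers this method over less corner-sensitive thinning algorithms.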

Data Details
raw image → image skeleton → points skeleton
The image skeleton S_s is a single-pixel structure. We find that the characteristics of the lines are obvious, but the corners of the data affected by error have a certain radian. It is therefore very difficult to extract exact corners directly from the data, so we instead extract the lines and determine the corners from the lines. We propose a line extraction method based on the nearest-neighbor slope, introduced in detail below.
First, a random point (x_i, y_i) is taken as the starting point; its nearest neighbor (x_{i+1}, y_{i+1}) is found and the slope of the two points is calculated. In this section, neighbor points are always searched within the 8-neighborhood. If there are multiple nearest neighbor points, one is selected at random. The slope k of the two points is calculated as k = (y_{i+1} − y_i) / (x_{i+1} − x_i). Since pixel positions in the image are all integers, there are only four cases for the slope of two adjacent points: 0, 1, −1, and the undefined (vertical) slope, which is replaced by 2, as shown in Figure 12. Figure 13. The process of calculating the slope. First, seed 1 is randomly selected in the image. Seed 1 has multiple nearest neighbor points and picks one at random to calculate the slope value 0; the values of seed 1 and neighbor 2 are set to 0. Then, neighbor 2 becomes the next seed. Points whose slope has already been calculated are not searched again, and growth stops when the current point has no unvisited neighbors. Then, seed n is randomly selected and the same growth process is carried out until the slopes of all the pixels of the image are calculated.
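The four slope cases above can be written as a small helper; the function name is our own, but the encoding (0, 1, −1, and 2 for a vertical pair) follows the text.

```python
def pixel_slope(p, q):
    """Slope between two 8-adjacent pixels p = (x, y) and q, coded as in
    the text: 0 (horizontal), 1 or -1 (diagonal), or 2 for a vertical
    pair, where the true slope is undefined."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    if dx == 0:
        return 2          # vertical neighbor: slope undefined, coded as 2
    return dy // dx       # 8-neighbors always give dy/dx in {0, 1, -1}
```

Because the two pixels are 8-adjacent, dx and dy are each in {−1, 0, 1}, so integer division is exact and the result is always one of the four codes.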
After calculating the slope at all points, we need to eliminate faults in the lines, as shown in Figure 14. For each point, if it has exactly two neighbors and the two neighbors share the same slope k, the slope of that point is updated to k. Once the slope of every point is known, we look for lines based on the slope. The idea is to start at one endpoint of each line as a seed point and grow until the line is completely extracted, so we first need to find the endpoints as seed points; three conditions are used to determine a seed point. When all the lines are categorized, thresholds are set to eliminate non-line points: because our conditions act on individual points and short segments, lines containing too few points are removed. Since the same line can be divided into sections by noise, we propose a method to merge them, as shown in Figure 16. For these merged lines, because their constituent points do not lie exactly on one line, we need to recompute their slopes and endpoints to update the lines; the principle of updating the endpoints is shown in Figure 17. A line l_i is given by two endpoints (x_1, y_1) and (x_2, y_2) and its midpoint (x_m, y_m). If the slope is 0, the new endpoints are (x_1, y_m) and (x_2, y_m); if the slope is 2, the new endpoints are (x_m, y_1) and (x_m, y_2). Thus, each line is determined by its midpoint and slope, and its extent is determined by the new endpoints. Finally, the process of finding the straight lines and drawing them all is shown in Table 4. In short, if the slope is 0, the abscissas of the new endpoints are those of the raw endpoints and the ordinate is that of the midpoint; if the slope is 2, the abscissa is that of the midpoint and the ordinates are those of the raw endpoints.
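The endpoint-update rule stated above (raw abscissas with the midpoint ordinate for slope 0, and the converse for slope 2) can be sketched directly; the helper name is our own.

```python
def update_endpoints(e1, e2, slope):
    """Snap a noisy segment's endpoints onto the ideal axis-aligned line
    through its midpoint. Slope 0 keeps the raw abscissas and takes the
    midpoint ordinate; slope 2 (vertical) keeps the raw ordinates and
    takes the midpoint abscissa."""
    xm = (e1[0] + e2[0]) / 2
    ym = (e1[1] + e2[1]) / 2
    if slope == 0:                       # horizontal line
        return (e1[0], ym), (e2[0], ym)
    if slope == 2:                       # vertical line
        return (xm, e1[1]), (xm, e2[1])
    return e1, e2                        # diagonal segments left unchanged here
```

In this sketch diagonal segments pass through untouched, since the text only spells out the horizontal and vertical cases.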

Structure Reconstruction
In this part, we optimize the lines and restore them to three-dimensional planes. In the last section, due to scan registration error, we removed the smooth arcs at the intersections of adjacent segments. Therefore, we propose a method to connect the lines. The basic steps of this method are as follows: 1. Given an endpoint p_1, the nearest endpoint p_2 among the other lines is found, and the distance d between the two endpoints is calculated. 2. A distance threshold is set to avoid connecting two lines that are far apart, and d is compared with this threshold. 3. If the distance is less than the threshold and the two lines are perpendicular, the lines corresponding to the two endpoints are connected and the endpoints are updated accordingly. The results are shown in Figure 18. To better demonstrate the accuracy of the experimental results, we overlaid the lines on the raw data, as shown in Figure 19. Figure 19. Comparison of experimental results. The yellow lines are our result and the black points are the raw data. We obtain the basic outline of the data; some parts are not recognized as lines because the points are discontinuous or the slope changes are complicated.
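For the axis-aligned lines produced earlier, one way to realize step 3 is to extend both perpendicular segments to their shared corner when their nearest endpoints are close enough. The sketch below assumes one horizontal and one vertical segment; the function name and representation are our own.

```python
import math

def connect_perpendicular(h_line, v_line, d_threshold):
    """Connect a horizontal segment ((x1, y), (x2, y)) and a vertical
    segment ((x, y1), (x, y2)) at their implied corner (x, y), but only
    if the nearest pair of endpoints is within d_threshold."""
    corner = (v_line[0][0], h_line[0][1])
    # index of the endpoint of each segment closest to the corner
    def nearest(line):
        return min(range(2), key=lambda i: math.dist(line[i], corner))
    hi, vi = nearest(h_line), nearest(v_line)
    if math.dist(h_line[hi], v_line[vi]) > d_threshold:
        return h_line, v_line            # too far apart: leave unchanged
    h_new = list(h_line)
    h_new[hi] = corner
    v_new = list(v_line)
    v_new[vi] = corner
    return tuple(h_new), tuple(v_new)
```

Moving both endpoints onto the corner restores the right angle that the scan registration error had rounded off.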
Finally, we reconstruct the three-dimensional planes from the resulting lines. We now need the height of each line. In the previous section, we calculated the ground plane and the height of each point above the ground. First, the endpoints are restored by Equation (10). We then find the nearest neighbors of the midpoint of each line segment and take the height of the line as the maximum distance between these neighbors and the ground. If this height is close to the distance between the ceiling and the floor, it is replaced by that distance. The result is shown in Figure 20. At this point, we have introduced the principle of the whole algorithm in detail; in the next section, we evaluate its performance with multiple sets of data.

Preparation of Experimental Data
The equipment used to collect data is shown in Figure 21. The experimental equipment is a backpack system integrating LiDAR, an inertial measurement unit (IMU), a wireless module, and an upper computer. The surroundings are sensed by a 16-line mechanical LiDAR at an interval of 0.5 s, which scans the surface information of surrounding objects. The IMU measures the attitude of the LiDAR, and we fuse it with each point cloud frame via the robotic operating system (ROS) on the upper computer. The processed data are transmitted to a computer for display through the wireless module. Figure 21. The experimental equipment: a backpack system integrating LiDAR, an inertial measurement unit (IMU), a wireless module, and an upper computer. We adopted LeGO-LOAM to implement the scan registration of the data. In this algorithm, the pose components (t_z, θ_roll, θ_pitch) are obtained from the ground features, and the components (t_x, t_y, θ_yaw) are calculated from the remaining point cloud features. Therefore, one matching error is caused by too few ground features, and another by the Euler angles changing too fast in a short time; in the following experimental results, these two aspects may be the causes of error. Finally, we present some important parameters for the three scenarios: the number of grids n_fac used to extract the floor and ceiling in Section 2.1; the radius r_GROR and threshold n_GROR of the GROR filter; the magnification n_moc of the coordinates; the distance d_p for connecting parallel lines; and the distance d_v between vertical lines. These are shown in Table 5.

Results
In this section, we use three real scenarios to test our algorithm. We carried the equipment on our backs, with the LiDAR about 1.8 m above the ground. We followed a fixed trajectory and tried to keep the posture of the device stable as we moved. Finally, three datasets were prepared: a single room, a floor, and a set of rooms.

Experiment Scene 1
We extracted the structure of a single room from the first dataset; the room was about 7.69 m × 6.35 m × 3 m (length × width × height). The experimental scene is shown in Figure 22a, and the two-dimensional plan of the scene is shown in Figure 22b. To better show the interior structure, we removed the ceiling in the second panel. The measured point cloud data are shown in Figure 22c, and the final result is shown in Figure 22d. It can be seen from the experimental results that the basic structure of the room has been restored, but there are some flaws in the results: there are gaps between some adjacent planes. In the previous sections, we joined planes if they were adjacent and the distance between them was less than the threshold. In real scene 1, because the glass between the curtains is not sensed by the LiDAR, we kept those gaps in the result. At the same time, large gaps due to measurement errors were preserved.

Experiment Scene 2
We chose a floor of the school building as the second scenario to test the method. The point cloud we collected is shown in Figure 23a and the result of algorithm processing is shown in Figure 23b. Similarly, we present the two-dimensional plan of the floor in Figure 23c, and we provide real photos of two positions on the floor and the corresponding reconstruction results in Figure 23d-g. Comparing the results with the real scene, we can see that all the walls are well restored. At the time of measurement, we did not enter any rooms; some of the doors were open and others were closed, which is why some of the doors appear as gaps in our results. At the same time, because our method preserves fine detail, uneven sampling and errors in the matching algorithm may produce faults within a single plane.

Experiment Scene 3
In scene 3, we chose an interior environment with multiple rooms. The raw data are shown in Figure 24a, and the result is shown in Figure 24b. We drew the two-dimensional plan of the data as shown in Figure 24c and a comparison of the result with the real world as shown in Figure 24d,e. We entered most of the rooms during the measurement, and all the doors were closed. Due to the complexity of the scene, there were major differences between consecutive frames when the device entered a room, which led to a major error in scan registration. Therefore, the error of our results is larger than that of the first two datasets. However, all the rooms were restored well, so the result is satisfactory.

Relevant Parameters of the Experiment
We aim to obtain indoor structural information in a fast and efficient way, so we report running time, data volume, and distance parameters. In our approach, we employed a series of techniques to reduce the amount of data. Therefore, on the premise of ensuring the authenticity and validity of the parameters, and to demonstrate the efficiency of our method, we report the running time of each stage in Table 6 and the data volumes in Table 7. To verify the accuracy of the algorithm, we compare the results with the actual data in Tables 8 and 9, including the mean precision. In Table 10, we list the characteristics of different reconstruction methods from recent years. The function and purpose of each method are not completely the same; to reflect the innovation of our method, we evaluate the following aspects: (1) source of data, since the effect of an algorithm can depend on the data; (2) ground calculation, since our method finds the ground as the base without extra conditions such as LiDAR attitude; (3) filtering, since an algorithm that needs no filtering is more robust; (4) data optimization, which indicates high processing performance; (5) the representation of doors and windows, whose details show the precision of the algorithm; (6) accuracy of the number of planes, since finding all planes indicates high accuracy. Table 9. Parameter comparison between the actual scenario and the predicted scenario.

Discussion
First, we evaluated the experimental results of the three scenarios. In scenario 1, all vertical planes of the room are created and the basic structure is restored. Because of the curtain, the curved part is removed by our algorithm, which leads to gaps in the plane formed by the curtain. In our algorithm, we expand the coordinates of the data, and the uneven sampling areas and the walls with windows contain fewer points. These areas are removed by our method. Therefore, our method can better restore the details of the room by enlarging the coordinates, while the room is restored to a rectangle if normal processing is used. In scenario 2, due to the large area of the scene, we stretched the result vertically for better detail. We can see that all the open doors are identified as gaps between the planes, and by comparing the coordinates in the result, we can ensure that the plane precision is high; for example, the width of the door is about 1 m. The whole floor is well restored, although there are still a few errors in the plan. This is where our method will need to be optimized at a later stage. In scenario 3, we went into some rooms selectively, and the results show that the reconstruction is better. All the rooms are perfectly reproduced, though there are some errors due to scan registration.
Second, we analyze and discuss the parameters in five tables. In Table 6, although we optimized the filtering, most of the time is still spent on filtering and voxelization. However, the time t_i-r of about 2 s to restore the image to a three-dimensional plane shows that our slope-based algorithm is efficient and fast; we will try other filtering approaches to improve performance in subsequent work. In Table 7, we can see a gradual decrease in the amount of data; in scene 3, for example, the number of points is reduced from 1,568,974 × 3 to 187 × 52, which is why our approach is efficient. In Table 8, by comparing the parameters of the ceiling and ground planes, we note that they are parallel. The error of the room height calculated from these two planes is small (the maximum error does not exceed 0.015 m), so our method of finding the planes is accurate. We want to restore a model from the point cloud that is the same size as the actual scene. In Table 9, comparing the calculated length, width, and height with the actual values, the errors are less than 0.044 m, 0.045 m, and 0.015 m, respectively. The mean error of 0.024 m shows that our algorithm is accurate within the error tolerance.
In Table 10, we overcome the difficulty of LiDAR data processing and achieve results comparable to some current methods. Data from LiDAR are easy to obtain but difficult to process, while laser scanner data are difficult to obtain but highly precise. We summarize the advantages and disadvantages of several algorithms and then propose methods to optimize and solve these problems in terms of prior knowledge, noise removal, data optimization, structure extraction, algorithm efficiency, and accuracy. Compared with existing methods, our method is superior to current algorithms in some respects.
Finally, there are still some shortcomings in our experiment. To make the initial data tidier, we will test other algorithms to replace the current algorithm for scan registration, such as LIO-SAM. Our method is currently only applicable to the reconstruction of interior vertical walls, and circles are recognized as polygons by the method. Therefore, we will modify the algorithm to accommodate more complex cases and make it more robust.

Conclusions
The method proposed in this paper can restore the basic interior structure at equal scale. In this method, the ceiling and the floor of the room are segmented and fitted, and the ground plane is used as the reference plane. Then, the wall is projected onto the reference plane. We filter, regularize, and transform the resulting two-dimensional data into an image. We extract the skeleton of the image and calculate the slope at each point. Each line is grown from a seed along its characteristic slope until all of its points are found. These lines are optimized and corrected by our method to produce more accurate lines. Finally, the lines are restored to three-dimensional vertical planes by applying height information. The experimental results, reconstruction effect, and parameter analysis all show that our method is accurate (mean error of 0.024 m at an average measuring range of 40 m). Throughout the algorithm, we paid attention to improving computing speed and to algorithmic innovation. Therefore, we performed extensive data optimization and algorithm improvement, and we proposed a new framework to solve the problem of indoor reconstruction. However, there are some limitations and errors in this method. For example, parameters need to be adjusted to obtain the best results, and indoor clutter cannot be reconstructed. Therefore, we will focus on algorithm optimization and detail processing to make the algorithm more practical in the future.

Conflicts of Interest:
The authors declare no conflict of interest.