Exploration-Based SLAM (e-SLAM) for the Indoor Mobile Robot Using Lidar

This paper presents one possible method for an indoor mobile robot (IMR) to perform indoor exploration together with simultaneous localization and mapping (SLAM) using LiDAR. Specifically, the IMR is required to construct a map after it arrives on an unexplored floor of a building. We implemented exploration-based SLAM (e-SLAM), which uses coordinate transformation and navigation prediction techniques, in an engineering school building that consists of many 100 m² labs, corridors, elevator waiting spaces and a lobby. We first derive the LiDAR mesh for the orthogonal walls and filter out the static furniture and dynamic humans in the same space as the IMR. Then, we define the LiDAR pose frame, including the translation and rotation, from the orthogonal walls. From the most significant corner (MSC), obtained from the intersection of the orthogonal walls, we calculate the displacement of the IMR. The orientation of the IMR is calculated from the alignment of the orthogonal walls in consecutive LiDAR pose frames, assisted by the linear quadratic estimation (LQE) method. All the computation can be done on a single-processor machine in real time. The e-SLAM technique gives in-house service robots the potential to start operation without pre-scanned LiDAR maps, which can save installation time. In this study, we use only the LiDAR for navigation and compare the result with the IMU to verify the consistency between the two navigation sensors in the experiments. The experimental scenario consists of rooms, corridors, elevators, and a lobby, which is common to most office buildings.


Introduction
Navigation is the basic function of autonomous vehicles in daily and industrial applications. In outdoor environments, vehicle localization and navigation can be achieved with the assistance of global positioning systems (GPS). However, in indoor environments, this traditional navigation technology fails due to the lack of satellite positioning signals [1]. A vehicle cruising in an indoor space can be localized and navigated through characteristic graphics or marked points if there is a premade map of the space. If it is moving in an unknown environment, localization and mapping must be carried out simultaneously. However, accurate localization requires an accurate map and vice versa. Localization and mapping must therefore be performed concurrently, which creates a complex problem called simultaneous localization and mapping (SLAM), first proposed by Hugh Durrant-Whyte and John J. Leonard [2][3][4].
Sensors are the way robots or self-driving vehicles perceive their environments. The selection and installation of sensors determine the specific form of observation results, and also affect the difficulty of SLAM problems. According to the major sensor, simultaneous localization and mapping can be divided into Visual-SLAM and LiDAR-SLAM.
Visual SLAM uses a vision camera as the major sensor to find features in the image and then uses these features to match a known image for localization. Inspired by human estimate trajectory and map from the input feature tracks. Cartographer is a Google open-source project developed in 2016. It is a system that provides 2D and 3D real-time SLAM across multiple platforms and sensor configurations. The main idea of Cartographer is to eliminate the cumulative error during mapping through loop closure detection.
Regarding the map construction of SLAM, there are three commonly used maps [42]: the occupancy grid, the feature-based map and the topological map.
Grid map: The most common way for robots to describe the environment is a grid map, or occupancy map. A grid map divides the environment into a series of cells, and each cell is given a value that represents the probability of being occupied [43,44].
Topological map: This is a relatively more abstract map form. It expresses the indoor environment as a topological structure diagram with nodes and connecting lines. The nodes represent important locations in the environment (corners, doors, elevators, stairs, etc.). The edges represent the connections between nodes, such as corridors. This method records only the topological link relationships of the environment. For example, when a sweeping robot wants to clean a room, such a topological map is established [50][51][52][53][54][55].
Whether in Visual SLAM or LiDAR SLAM, most systems need auxiliary data from other sensors, such as an inertial measurement unit (IMU) or an odometer, to support the vision or LiDAR data; otherwise the SLAM system is difficult to operate. Multi-sensor fusion SLAM increases the cost of both equipment and computation, and operating this kind of system is also a considerable burden. This study proposes a set of algorithms that uses only LiDAR signals to perform SLAM. The algorithm can quickly find mutually perpendicular walls in the space to be used as references through simple geometric rotation features. It is unnecessary to compare against the original building layout to complete the localization and map construction of rooms, floors and buildings. The equipment is relatively simple and the amount of required computation is greatly reduced compared with previous methods. It can be used as a real-time SLAM in an unexplored building.

LiDAR Point Cloud
LiDAR originated in the early 1960s, shortly after the invention of the laser. By emitting and receiving laser beams, LiDAR uses the time-of-flight (ToF) method to obtain an object's position and its characteristics. This method is almost independent of illumination conditions and has a long detection range and high accuracy. Its first application came from meteorology, where the National Center for Atmospheric Research used it to measure clouds and pollution [56]. There is a growing interest in portable and affordable three-dimensional LiDAR systems for new applications [57][58][59][60][61]. Many examples of rotating single-beam LiDAR can be found in the literature [62][63][64][65]. The addition of a degree of freedom to build a rotating multi-beam LiDAR has the potential to become a common solution for affordable, rapid, full-3D high-resolution scans [66][67][68][69][70].
The LiDAR scanner rotates at a constant speed about the rotary axis, which is defined by convention as the $\hat{Z}$ axis, and sends out an array of rays. The number of rays is denoted as the number of rings; the rays are shot from the LiDAR center outward into space at different angles, denoted as the ring angle $\Psi$. Because the linear array rotates at a constant speed, each laser ray sweeps out a ring in space, and the rotation speed multiplied by the sampling time of sensing is known as the azimuth resolution. It can be observed from Figure 1a that the LiDAR ray points are projected on the walls according to the ring and azimuth angles of the rays. The rays with the same ring angle $\Psi_n$ but different azimuth angles form a cone surface as shown in Figure 1b. A nonlinear curve is obtained when the cone surface intersects a flat wall. The rays with the same azimuth angle $\alpha_a$ but different ring angles form a planar surface as shown in Figure 1c. A vertical line is detected when the planar surface intersects a flat wall.

LiDAR Point Cloud and Mapping via Transformation and Inverse Transformation
We first define the elevation angle of the n-th ring as $\Psi_n$, the swing angle of the a-th ray on the same LiDAR image ring as $\alpha_a$, and the individual ray distance, detected from the returned signal through time-of-flight conversion, as $\hat{r}_{n,a}$ in meters. The LiDAR data point can be presented as a three-dimensional tensor $\Lambda_{n,a} = \Lambda(\hat{r}_{n,a}, \Psi_n, \alpha_a)$. The spherical transformation matrix from the spherical coordinate system to the rectangular coordinate system can be expressed as follows.
$$R(\Psi_n, \alpha_a) = \begin{bmatrix} \cos\alpha_a \cos\Psi_n & \cos\alpha_a \sin\Psi_n & \sin\alpha_a \\ \sin\alpha_a \cos\Psi_n & \sin\alpha_a \sin\Psi_n & -\cos\alpha_a \\ \sin\Psi_n & -\cos\Psi_n & 0 \end{bmatrix}$$
The LiDAR point $\hat{P}_{n,a}$ can then be presented in the rectangular coordinate system as follows.
Equation (2) is considered as the transformation from $\Lambda_{n,a}$ to $\hat{P}_{n,a}$. The inversion of a point in the Cartesian coordinate system to the three-dimensional LiDAR data tensor is given as follows.
This procedure, together with discretizing the continuous values of $\Psi$ and $\alpha$ into the discrete values $\Psi_n$ and $\alpha_a$ according to the LiDAR resolution, is defined as the inverse transformation of $\Gamma$.
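The forward and inverse mappings can be sketched in Python as below. This is a minimal illustration, assuming the Cartesian point is the first column of $R(\Psi_n, \alpha_a)$ scaled by the range, i.e. $\hat{P} = \hat{r}(\cos\alpha\cos\Psi,\ \sin\alpha\cos\Psi,\ \sin\Psi)$. The function names, the 16-ring angle table and the one-degree azimuth step are hypothetical and are not taken from the paper.

```python
import math

def lidar_to_cartesian(r, psi, alpha):
    """Map range r [m], ring (elevation) angle psi and azimuth alpha [rad] to (x, y, z)."""
    return (r * math.cos(alpha) * math.cos(psi),
            r * math.sin(alpha) * math.cos(psi),
            r * math.sin(psi))

def cartesian_to_lidar(p, ring_angles, azimuth_step):
    """Inverse map: recover (r, ring index n, azimuth index a) on the discrete LiDAR grid."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    psi = math.asin(z / r) if r > 0.0 else 0.0
    alpha = math.atan2(y, x) % (2.0 * math.pi)
    # Snap the continuous angles to the nearest ring and the nearest azimuth bin.
    n = min(range(len(ring_angles)), key=lambda i: abs(ring_angles[i] - psi))
    bins = int(round(2.0 * math.pi / azimuth_step))
    a = int(round(alpha / azimuth_step)) % bins
    return r, n, a

# Hypothetical sensor: 16 rings from -15 to +15 degrees, 1 degree azimuth resolution.
RINGS = [math.radians(-15.0 + 2.0 * i) for i in range(16)]
STEP = math.radians(1.0)
point = lidar_to_cartesian(4.0, RINGS[10], 30 * STEP)
r_back, n_back, a_back = cartesian_to_lidar(point, RINGS, STEP)
```

A round trip through both functions should return the original range and the original ring and azimuth indices, which is a quick sanity check on the discretization.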
Following the convention of coordinate frame transformation used in robotics, the location and orientation in the Cartesian coordinate system is defined as the LiDAR pose frame which is given as follows.
${}^{k}L_x$ denotes the absolute x translation, ${}^{k}L_y$ denotes the absolute y translation, ${}^{k}L_\gamma$ is the absolute LiDAR azimuth angle, and ${}^{k}L_p$ is the absolute LiDAR pitch angle in the view of the global coordinate system. To convert the cloud point ${}^{1}_{1}\Lambda$ from its LiDAR pose frame ${}^{1}L$ to the LiDAR pose frame ${}^{0}L$, where it is denoted as ${}^{0}_{1}\Lambda$, we go through the procedure as follows.
${}^{1}\hat{P}$ is the Cartesian coordinate in pose frame ${}^{1}L$.
${}^{0}_{1}\Lambda$ is the LiDAR data point obtained in pose frame ${}^{1}L$ and converted into the LiDAR data point in pose frame ${}^{0}L$. ${}^{0}_{1}\Theta$ is a homogeneous transformation from pose frame ${}^{1}L$ to pose frame ${}^{0}L$, which will be described in the following sections. The consecutive LiDAR motion may also be updated in a sequence of transformations from LiDAR pose ${}^{k}L$ and mapped back to the coordinate frame defined by the LiDAR pose frame ${}^{0}L$.
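The frame-to-frame conversion can be illustrated with a planar rigid transform. The sketch below is a simplification under stated assumptions: the pitch angle is taken as zero, a pose is reduced to the hypothetical tuple (x, y, gamma) expressed in the global frame, and the function names are illustrative. It is not the paper's homogeneous transformation.

```python
import math

def to_global(pose, p):
    """pose = (x, y, gamma) of a LiDAR frame in the global frame; p = (px, py) in that frame."""
    x, y, g = pose
    c, s = math.cos(g), math.sin(g)
    return (x + c * p[0] - s * p[1], y + s * p[0] + c * p[1])

def to_local(pose, q):
    """Inverse of to_global: express the global point q in the given pose frame."""
    x, y, g = pose
    c, s = math.cos(g), math.sin(g)
    dx, dy = q[0] - x, q[1] - y
    return (c * dx + s * dy, -s * dx + c * dy)

def convert_between_frames(pose_from, pose_to, p):
    """Carry a point seen in pose_from into pose_to by passing through the global frame."""
    return to_local(pose_to, to_global(pose_from, p))

# Hypothetical poses: frame 1 sits at (1, 2) and is yawed by 90 degrees relative to frame 0.
frame0 = (0.0, 0.0, 0.0)
frame1 = (1.0, 2.0, math.pi / 2.0)
p_in_0 = convert_between_frames(frame1, frame0, (1.0, 0.0))
```

In a full pipeline the converted Cartesian point would then be passed through the inverse LiDAR transformation to land on the ring and azimuth grid of the target frame.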
After the LiDAR transformation $\Gamma$ and its inverse transformation $\Gamma^{-1}$ are adequately defined for a particular LiDAR, such as a LiDAR with 32 rings and one-degree azimuth resolution, one is able to move the LiDAR from one position to another to consecutively gather more points into the cloud of the frame. This procedure is known as mapping. The mapped point cloud gives a much clearer picture of the 3D scene of a space such as a room or a lobby. The conventional way to do the mapping is based on data fusion from different sensors, such as the camera, the IMU, GPS, and the accelerometer. The data fusion is costly in processing time, in the price of the equipment, and in its installation, calibration and maintenance. This paper proposes a method which strictly utilizes the LiDAR sensor to achieve mapping, provided that the lattice-structured walls can be seen at all times by the LiDAR.
After a series of point cloud conversions, we can obtain multiple points with different $\hat{r}_{n,a}$ but the same $\Psi_n$ and $\alpha_a$. Gathering the LiDAR data points is the first step of mapping. The mappings obtained from different viewpoints of the space may conflict with one another. Figure  to ${}^{k}L$. From the view of ${}^{0}L$, the ray distance of ${}^{0}_{0}\Lambda$ is shorter than when it is mapped from ${}^{k}L$, which is ${}^{0}_{k}\Lambda$. In this case, we have to change the room index and store the adjacency information of the neighboring rooms. In the case of a partitioned room, we may need multiple buffers of point cloud data to describe the same room, which will be introduced in the following sections. Figure 2b depicts a wall that is aligned with one of the ray directions and is therefore unobservable by the LiDAR scan. It is then necessary to properly select a reference frame ${}^{0}L$ that can represent all important features clearly.

LiDAR Mesh and Filtering
The LiDAR points form a set of grid points different from the rectangular grids on the wall, even when the LiDAR is accurately positioned upright, as shown in Figure 3a. The grid points are in a bell shape of which the height is smaller where the LiDAR is closer to the wall. The grid points may be converted into polylines horizontally for each ring and vertically for each azimuth angle. The horizontal and vertical line segments form a mesh $m(kT) \equiv {}^{k}m$ at the k-th sampling period $T$, as shown in Figure 3b. The mesh is composed of quads, and each quad $Q_{n,a}$ has four vertices and a normal vector $\hat{n}_{Q,n,a}$, which may be determined from its vertices as follows.
$$\hat{n}_{Q,n,a} = \frac{(\hat{P}_{n,a+1} - \hat{P}_{n,a}) \times (\hat{P}_{n+1,a} - \hat{P}_{n,a})}{\left\| (\hat{P}_{n,a+1} - \hat{P}_{n,a}) \times (\hat{P}_{n+1,a} - \hat{P}_{n,a}) \right\|}$$
The mesh ${}^{k}m$ may be filtered through three proposed conditions as follows.
(a) The cluster condition: the far-away points resulting from mirror reflection or seen through the windows shall be removed by the filter;
(b) The vertical condition: the quads $Q_{n,a}$ whose normal vector $\hat{n}_{Q,n,a}$ has a large z component, formed from the discontinuity of the surfaces, shall also be removed. An example is a pair of adjacent rays of which one hits a wall and the other hits a cabinet surface;
(c) The aspect ratio condition: the quads $Q_{n,a}$ whose aspect ratio is distorted too much, due to mirror reflection or the complicated shape of the facilities existing in the offices, shall be filtered out.
The remaining LiDAR mesh ${}^{k}m$ as shown in Figure 3b may be discontinuous in any of the directions ${}^{k}\hat{X}$, ${}^{k}\hat{Y}$, and ${}^{k}\hat{Z}$. The discontinuities can be used later for segmentation of the different walls composing the boundary of the rooms.
After the processing of the proposed conditions, the LiDAR mesh keeps only those quads with grid patterns. An example is depicted in Figure 3c. The doom screen in the room, which is not grid-like, is filtered out by the proposed conditions. The proposed method is effective only for vertical walls, doors or cabinets which have the grid pattern. It is common that walls are used to partition spaces in typical office buildings.
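The quad normal and the three filtering conditions can be sketched as follows. This is a rough illustration only: the thresholds `max_range`, `max_nz` and `max_aspect` are hypothetical tuning values, the cluster condition is reduced to a simple range cut-off, and only three of the four quad vertices are used, as in the normal-vector formula.

```python
import math

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)

def quad_normal(p_na, p_na1, p_n1a):
    """Unit normal from the two edges leaving vertex P[n][a]; None for a degenerate quad."""
    c = cross(sub(p_na1, p_na), sub(p_n1a, p_na))
    length = norm(c)
    return None if length == 0.0 else tuple(ci / length for ci in c)

def keep_quad(p_na, p_na1, p_n1a, max_range=30.0, max_nz=0.3, max_aspect=10.0):
    # (a) Cluster condition, simplified here to a cut-off on far-away points.
    if max(norm(p) for p in (p_na, p_na1, p_n1a)) > max_range:
        return False
    n = quad_normal(p_na, p_na1, p_n1a)
    if n is None:
        return False
    # (b) Vertical condition: a wall normal is horizontal, so a large |n_z| is rejected.
    if abs(n[2]) > max_nz:
        return False
    # (c) Aspect ratio condition: reject strongly elongated quads.
    w, h = norm(sub(p_na1, p_na)), norm(sub(p_n1a, p_na))
    return max(w, h) / min(w, h) <= max_aspect
```

For example, a small quad on a wall at x = 2 m passes all three checks, while a quad lying on the floor is rejected by the vertical condition because its normal points along Z.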

Mesh Projection and Initial Axis Finding
After the filtering, we can determine a bounding box $B(kT) \equiv {}^{k}B$ for the mesh ${}^{k}m$, which is formed from the envelope of the cloud points $\hat{P}_{n,a}$ collected during the sampling time $T$. It is also assumed that the LiDAR is positioned nearly vertical to the ground and that it is in a square room when k = 0. The bounding box ${}^{0}B$ is then projected on the $\hat{X}\hat{Y}$ plane. The LiDAR image projection is rotated on the $\hat{X}\hat{Y}$ plane by an angle $\gamma$ about the $\hat{Z}$ axis to find the minimum area of ${}^{0}B(\gamma)$ under the rotation of $\gamma$. At the minimum area rotation, we determine the principal axes as the X and Y axes of this room and, if necessary, of the global coordinate system, as shown in Figure 4d. After the above procedures, the normal vector $\hat{n}_{Q,n,a}$ of the quad $Q_{n,a}$ is used to calibrate the pitch angle of the LiDAR, which can be shaken by the vehicle vibration. We first identify the individual quads $Q_{n,a}$ which fulfill the following conditions and categorize them into X walls and Y walls.
$\delta$ is a small value, which is recommended to be 0.02. We take the average of the X wall normal vectors to obtain a unit vector $\hat{n}_{Xwall}$ and the average of the Y wall normal vectors to obtain a unit vector $\hat{n}_{Ywall}$. Taking the cross product of the two average normal vectors, we are able to determine the floor normal vector as follows.
The rotational transformation of the LiDAR point from the LiDAR coordinate system $\hat{X}\hat{Y}\hat{Z}$ to the global coordinate system can be achieved by the following operators.
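The classification and cross-product steps can be sketched in Python as below. The membership test used here (a quad is an X wall when the X component of its unit normal exceeds 1 − δ in magnitude, and likewise for Y) is an assumption standing in for the paper's conditions, and all function names are hypothetical.

```python
import math

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def classify_walls(normals, delta=0.02):
    """Split unit quad normals into X walls and Y walls; flip signs so they can be averaged."""
    x_walls, y_walls = [], []
    for n in normals:
        if abs(n[0]) > 1.0 - delta:
            x_walls.append(n if n[0] > 0 else tuple(-c for c in n))
        elif abs(n[1]) > 1.0 - delta:
            y_walls.append(n if n[1] > 0 else tuple(-c for c in n))
    return x_walls, y_walls

def mean_unit(vectors):
    return unit([sum(v[i] for v in vectors) for i in range(3)])

def floor_normal(normals, delta=0.02):
    """Unit floor normal as the cross product of the averaged X wall and Y wall normals."""
    x_walls, y_walls = classify_walls(normals, delta)
    return unit(cross(mean_unit(x_walls), mean_unit(y_walls)))

# Hypothetical data: walls seen by a LiDAR pitched by 0.05 rad about the Y axis.
c, s = math.cos(0.05), math.sin(0.05)
quad_normals = [(c, 0.0, -s), (-c, 0.0, s), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
n_floor = floor_normal(quad_normals)
```

The tilt of the resulting floor normal away from the Z axis gives the pitch that has to be removed before the points are mapped into the global frame.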
R denotes the rotational transformation as follows.
The angles α and s with respect to the X and Y walls respectively are given in the following.
After the above transformation, we can align the initial LiDAR image and identify the axes of the global coordinate system. The minimum area method, used as the search for the rotation angle $\gamma$ about the $\hat{Z}$ axis, is valid in the initial stage of the e-SLAM. The following conditions are preferred to establish the first LiDAR pose frame ${}^{0}L$ with good accuracy.
(1) Rectangular room condition: the room where the IMR starts must be a rectangular room with orthogonal walls to allow the minimum area method to find the principal axes properly;
(2) No mirror condition: the room must initially contain no mirror reflecting the LiDAR rays. Without this condition, the aliasing LiDAR data can cause the rectangular room to be misinterpreted.
These conditions may be unavailable in a situation where the IMR is started from the center of the lobby or an open space, as shown in Figure 5a. However, in many situations where there are good long reference walls, the minimum area method will bring the bounding box into alignment with the longest wall, as shown in Figure 5b. On the other hand, if there is a mirror in the initial position, the aliasing LiDAR data that expand the room can mislead the minimum area method to an incorrect rotation, as shown in Figure 4e. Drawbacks of the minimum area method include the ±90° rotation error, the fluctuation of the rotation due to unclear wall information, and the large angular error when the envelope of the LiDAR image is not rectangular in the $\hat{X}\hat{Y}$ plane projection.
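The minimum area search can be sketched as a brute-force scan over candidate angles. This is a minimal illustration, assuming the mesh has already been projected to 2D points; the half-degree step and the function names are hypothetical. The search range is limited to [0°, 90°) because the bounding box area repeats every 90°, which is also the source of the ±90° ambiguity noted above.

```python
import math

def bbox_area(points, gamma):
    """Area of the axis-aligned bounding box after rotating the points by -gamma."""
    c, s = math.cos(gamma), math.sin(gamma)
    xs = [c * x + s * y for x, y in points]
    ys = [-s * x + c * y for x, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def min_area_rotation(points, step_deg=0.5):
    """Scan gamma over [0, 90) degrees and return the angle with the smallest box area."""
    count = int(round(90.0 / step_deg))
    candidates = [math.radians(i * step_deg) for i in range(count)]
    return min(candidates, key=lambda g: bbox_area(points, g))

# Hypothetical example: outline of a 6 m x 4 m room seen by a LiDAR yawed by 20 degrees.
true_gamma = math.radians(20.0)
outline = ([(0.5 * i, y) for i in range(13) for y in (0.0, 4.0)]
           + [(x, 0.5 * j) for x in (0.0, 6.0) for j in range(9)])
scan = [(math.cos(true_gamma) * x - math.sin(true_gamma) * y,
         math.sin(true_gamma) * x + math.cos(true_gamma) * y) for x, y in outline]
gamma_est = min_area_rotation(scan)
```

A coarse scan followed by a finer scan around the best candidate would reduce the number of area evaluations if the step needs to be smaller.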

Wall Corner as Landmark
The intersection of an X wall and a Y wall is a wall corner, which can be either an existing or a virtual corner. For example, there could be a wall on the corridor referred to as the X wall as viewed from the room, while the walls on the other side of the corridor are referred to as the Y walls. There is no actual corner in the corridor; hence, the intersection is called a virtual corner, as shown in Figure 6. The wall corners or virtual corners can be utilized as landmarks for the translation calculation, that is, for determining how far the IMR (indoor mobile robot) is moving away from or approaching a known landmark location. The walls are extracted from the quads which are marked as the X walls and Y walls. The importance of a wall is given by a weight proportional to the ray distance $\hat{r}$, which is not affected by rotation pivoted at the LiDAR location. Hence, the histograms in the X and Y directions may be written as follows.
The histograms have their stationary values defined as follows.
These stationary values can be sorted to obtain a set of wall corners or virtual corners as follows.
The displacements ${}^{k}x_{c,i}$ and ${}^{k}y_{c,i}$ satisfy the following conditions in the ${}^{k}L$ pose frame.
In the set of corners, ${}^{k}P_c \equiv {}^{k}P_{c,1}$ is defined as the location of the most significant corner (MSC) ${}^{k}C$ discovered at the k-th sampling time. ${}^{k}P_{c,1}$ satisfies the following minimax condition.
Note that ${}^{k}C$ is a feature corner, which is only an abstraction; the feature identification must be achieved either by histogram correlation or by LiDAR image comparison.
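The histogram and corner extraction can be sketched as follows. Several choices here are assumptions made for illustration: X wall quads are binned by their x coordinate and Y wall quads by their y coordinate, each quad contributes its ray distance as weight, local maxima of the binned histogram stand in for the stationary values, and candidate corners are ranked by the weaker of their two peak weights. That ranking is a stand-in and does not reproduce the paper's minimax condition.

```python
from collections import defaultdict

def weighted_histogram(coords, weights, bin_size=0.05):
    """Accumulate range-weighted votes per coordinate bin."""
    hist = defaultdict(float)
    for c, w in zip(coords, weights):
        hist[round(c / bin_size)] += w
    return hist

def peaks(hist, bin_size=0.05, min_weight=1.0):
    """Local maxima of the histogram as (weight, coordinate), strongest first."""
    found = []
    for b, w in hist.items():
        if w >= min_weight and w >= hist.get(b - 1, 0.0) and w >= hist.get(b + 1, 0.0):
            found.append((w, b * bin_size))
    return sorted(found, reverse=True)

def wall_corners(x_quads, y_quads, bin_size=0.05):
    """Each quad is (coordinate, range). Returns (score, x, y), most significant first."""
    px = peaks(weighted_histogram([q[0] for q in x_quads], [q[1] for q in x_quads], bin_size), bin_size)
    py = peaks(weighted_histogram([q[0] for q in y_quads], [q[1] for q in y_quads], bin_size), bin_size)
    return sorted(((min(wx, wy), x, y) for wx, x in px for wy, y in py), reverse=True)

# Hypothetical quads: a strong wall at x = 3.0 m, a weak one at x = -2.0 m, one wall at y = 1.5 m.
x_quads = [(3.0, 3.2)] * 20 + [(-2.0, 2.5)] * 5
y_quads = [(1.5, 2.0)] * 15
corners = wall_corners(x_quads, y_quads)
msc = corners[0]
```

The first entry of the sorted list plays the role of the most significant corner, and the remaining entries are kept as fallback landmarks.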

LiDAR with IMR Navigation
In the initial stage, i.e., k = 0, the position of the most significant corner ${}^{0}P_c$ can be used to define the pose frame ${}^{0}L$ when ${}^{0}\gamma$ is derived from the minimum bounding box ${}^{0}B({}^{0}\gamma)$. The pitch angle is assumed to be eliminated during the mesh conversion process stated in Section 2.3 "Mesh Projection" and is set to zero. Hence, the LiDAR pose frame is reduced to a three-dimensional tensor as follows.
The initial pose frame indicates that the origin of the pose frame is located at the offsets ${}^{0}x_{c,1}$ and ${}^{0}y_{c,1}$ with respect to the coordinate system when applying the rotation of angle ${}^{0}\gamma$ about ${}^{0}\hat{Z}$. The displacement and azimuth rotation of the LiDAR pose frame $L(kT) \equiv {}^{k}L$ during the later IMR navigation are calculated with respect to the initial pose frame ${}^{0}L$.
The initial position of the IMR or LiDAR is located at $-{}^{0}P_c$ relative to the origin of the pose frame ${}^{0}L$. The initial orientation of the LiDAR may be derived from the zero-azimuth angle direction relative to the X axis. In the case that the most significant corner remains in the LiDAR pose frame at the k-th sampling time, as shown in Figure 7, the translation ${}^{k}N_t$ and the rotation angle ${}^{k}N_\gamma$ of the IMR or LiDAR navigation can be written as follows. It is convenient to attach the LiDAR pose frame to the actual pose of the IMR at the k-th sampling time. The translation and rotation can be calculated through coordinate transformation, as in conventional robotics. When the LiDAR transformation $\Gamma$ and its inverse transformation $\Gamma^{-1}$ are available, one is able to move the LiDAR from one position to another to gather more and more points into the cloud frame. The mapping is achieved by converting the LiDAR point cloud from one scene to the other. The mapping becomes practical when the following two assumptions are fulfilled.
(1) The rotation between LiDAR pose frames is continuous;
(2) The consistency of the most significant corner tracking is maintained even when the most significant corner changes through time.
To ensure both assumptions, the consistency of the point cloud information has to be used in the verification. However, variations of the environment, such as moving objects or humans entering or leaving the scene, can create uncertainties in the consistency comparison. We then need a stochastic process to help us estimate the pose from one sampling time to the next. Following the stochastic estimation, the rotation and translation finding procedures are discussed in the following sections.

Linear Quadratic Estimation for IMR Pose Prediction
The Linear Quadratic Estimation (LQE) may be applied to predict the LiDAR pose ${}^{k}L$ and its derivative ${}^{k}\dot{L}$ through the following update scheme.
${}^{k-1}L$ is the actual pose computed through the rotation and translation computation, which will be introduced in later sections. The actual velocity ${}^{k-1}\dot{L}$ may be updated using the backward difference method and can be set to zero when it is not observed by the observation matrix $H$. ${}^{k}K$ is the Kalman gain at the k-th sampling, which is updated as follows.
${}^{k}_{k-1}P$ is the prior error covariance matrix at the k-th sampling, which is given as follows.
${}^{k}P$ is the posterior error covariance matrix at the k-th sampling, which is updated from the prior error covariance matrix as follows.
The transition matrix A can be derived from Newton's law of motion as follows.
The observation matrix H may observe only the pose and not the velocity of IMR as follows.
The noise covariance matrices $R_{6\times 6}$ and $Q_{6\times 6}$ are related to the variance of the position and velocity individually. LQE yields the prediction of the position as well as the rotation of the IMR which carries the LiDAR. The most important feature is that LQE can yield a filtered prediction of ${}^{k}\dot{L}$, which leads to a range for the rotation and translation search of the LiDAR mesh ${}^{k}m$.
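The predict and update cycle can be illustrated for a single pose component. The sketch below is a textbook constant-velocity Kalman filter in which only the position is observed, written out for a two-element state so that it needs no matrix library. The noise values `q` and `r`, the sampling time and the function name are hypothetical, and the paper's six-dimensional formulation is not reproduced.

```python
def lqe_step(x, P, z, T, q=(1e-4, 1e-3), r=1e-2):
    """One predict/update cycle for state x = (position, velocity) with covariance P (2x2)."""
    p, v = x
    (p00, p01), (p10, p11) = P
    # Predict with the transition matrix A = [[1, T], [0, 1]]: prior = A P A^T + Q.
    p_pred, v_pred = p + T * v, v
    a00 = p00 + T * (p10 + p01) + T * T * p11 + q[0]
    a01 = p01 + T * p11
    a10 = p10 + T * p11
    a11 = p11 + q[1]
    # Update with the observation matrix H = [1, 0] (pose observed, velocity not observed).
    s = a00 + r
    k0, k1 = a00 / s, a10 / s              # Kalman gain
    innovation = z - p_pred
    x_new = (p_pred + k0 * innovation, v_pred + k1 * innovation)
    # Posterior covariance (I - K H) * prior.
    P_new = (((1.0 - k0) * a00, (1.0 - k0) * a01),
             (a10 - k1 * a00, a11 - k1 * a01))
    return x_new, P_new

# Hypothetical run: the measured coordinate grows at 1 m/s, sampled every 0.1 s.
T = 0.1
state, cov = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))
for k in range(1, 101):
    state, cov = lqe_step(state, cov, z=T * k, T=T)
predicted_next = state[0] + T * state[1]
```

The filtered velocity multiplied by the sampling time gives the expected change over the next step, which is the quantity used to bound the rotation and translation searches.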

Rotation Update Scheme Based on LQE
As stated previously, the minimum bounding box area method can be used to find the rotation only during the initial stage. For the remaining steps k > 0, we need to find the rotation ${}^{k}\gamma$ based on the previous rotation angle ${}^{k-1}\gamma$, which is the previous LiDAR orientation. ${}^{k}\gamma$ can only be in the range of $\alpha_0\,{}^{k}\dot{L}_\gamma T \pm \alpha_1$ in the vicinity of ${}^{k-1}\gamma$, where $\alpha_0$ is a factor of allowable variation and $\alpha_1$ is the allowable rotation change when the previous rotation speed is zero. We can apply a rotation angle ${}^{k}\gamma$ to rotate the mesh points as follows.
${}^{k}\dot{L}_\gamma T$ is the estimated rotation from the Kalman filter, denoted as $\Delta\gamma$ in Figure 8. The purpose of the rotation test is to find the minimum rotation difference between ${}^{k}\gamma$ and ${}^{k-1}\gamma$ that allows the histogram values of the walls, i.e., ${}^{k}H_x$ and ${}^{k}H_y$, to be maximized. This can be done with one of the three methods stated as follows.
(1) Minimum bounding box area method: the same method for the rotation finding as in the initial stage, which may work in the early period when k is small;
(2) Maximum total histogram value: maximize $\sum {}^{k}H_x + \sum {}^{k}H_y$;
(3) Maximum total number of quads on the bounding box: maximize the number of members in the set $\{Q_{n,a} \mid Q_{n,a} \subset {}^{k-1}B\}$.
The practical way is to use all three methods simultaneously and verify which one satisfies Equation (29). If all three methods fail, then find the k γ which minimizes k γ − k γ .
The incremental rotation can bring the mesh to align with the previous LiDAR mesh k−1 m. The remaining difference between k m and k−1 m is the coincidence of the wall corners between the meshes.
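The bounded rotation search can be sketched as below. The window follows the range described above, centered on the previous angle plus $\alpha_0$ times the predicted rotation and extended by $\pm\alpha_1$. The score is an assumption: it measures how well the horizontal quad normals line up with the X and Y axes after de-rotation, as a stand-in for the total histogram value. The parameter values and function names are hypothetical.

```python
import math

def rotation_candidates(gamma_prev, gamma_rate, T, alpha0=1.0, alpha1=math.radians(2.0), steps=41):
    """Evenly spaced angles in gamma_prev + alpha0 * gamma_rate * T +/- alpha1."""
    center = gamma_prev + alpha0 * gamma_rate * T
    return [center - alpha1 + 2.0 * alpha1 * i / (steps - 1) for i in range(steps)]

def alignment_score(normals_xy, gamma):
    """Sum over quads of the dominant axis component of the normal after rotating by -gamma."""
    c, s = math.cos(gamma), math.sin(gamma)
    return sum(max(abs(c * nx + s * ny), abs(-s * nx + c * ny)) for nx, ny in normals_xy)

def update_rotation(normals_xy, gamma_prev, gamma_rate, T):
    """Pick the candidate angle inside the predicted window with the best wall alignment."""
    candidates = rotation_candidates(gamma_prev, gamma_rate, T)
    return max(candidates, key=lambda g: alignment_score(normals_xy, g))

# Hypothetical frame: orthogonal walls seen at a true yaw of 0.12 rad.
true_yaw = 0.12
wall_normals = ([(math.cos(true_yaw), math.sin(true_yaw))] * 30
                + [(-math.sin(true_yaw), math.cos(true_yaw))] * 20)
gamma_k = update_rotation(wall_normals, gamma_prev=0.10, gamma_rate=0.2, T=0.1)
```

Restricting the candidates to the predicted window is what prevents the ±90° jumps that an unconstrained search over orthogonal walls would allow.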

Translation Update Scheme Based on Most Significant Corner (MSC) Transfer
In the case that ${}^{0}C$ is no longer the most significant corner in the k-th sampling, as shown in Figure 9, we have to update the IMR navigation in the following way. The vector opposite to ${}^{k}_{0}P_c$ is the location of the most significant corner ${}^{0}C$ in pose frame ${}^{k}L$. Note that ${}^{0}C$ is a feature corner, which is only an abstraction; the identification of the feature must be achieved either by histogram correlation or by LiDAR image comparison. The LiDAR image comparison of the entire space is not computationally efficient, while the histogram correlation may not be accurate enough. An efficient way to distinguish a feature corner is to do so while updating. In the linear quadratic estimation, we are able to estimate the IMR navigation velocity ${}^{k}\dot{L}$ from Equation (23). Based on the estimation, we can predict the location of ${}^{0}C$ in pose frame ${}^{k}L$ through the following computation. The feature corner ${}^{0}C$ shall be in the neighborhood of $-{}^{k}L_t + {}^{k}_{0}P_c$ in LiDAR pose frame ${}^{k}L$. We only check the vicinity of $-{}^{k}L_t + {}^{k}_{0}P_c$ in the histograms ${}^{k}H_x$ and ${}^{k}H_y$ to locate the wall corners ${}^{k}P_{c,i}$, which must satisfy the following condition.
r s denotes the radius of tolerance of search. The one which uses histogram search is more efficient, thus can be processed before the LiDAR image comparison. It could sometimes happen that the feature corner 0 C is missing in the frame of k L since it may be blocked by moving objects such as humans. We then have to locate the feature corner K C in LiDAR pose frame k−1 L as follows.
On the way that IMR navigates, the MSC can change from one to the other and sometimes it can even loop back to an early used MSC. The update scheme for translation can then be written as follows.
${}^{i}_{i-1}P_c$ is a zero vector when the MSC is not changed from pose frame i − 1 to i. In the LiDAR image comparison, the LiDAR meshes ${}^{k}m$ and ${}^{k-1}m$ are projected onto their individual XY planes and the maximum correlation between the mesh images is found, as is done in conventional image processing. The translation found between ${}^{k}N_t$ and ${}^{k-1}N_t$ can be thought of as ${}^{k}_{0}P_c$. The images of the LiDAR meshes are first projected onto the XY plane and compared with one another after the rotation update. The image correlation between the two projected pixel images can yield the translation between ${}^{k}N_t$ and ${}^{k-1}N_t$ precisely; however, this is a time-consuming process and is only performed when necessary.
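The predicted-neighborhood search for the feature corner can be sketched as follows. The sketch assumes that, seen from the robot, a fixed corner shifts by the negative of the robot's predicted displacement over one sampling period, and that the candidate corners have already been extracted from the histograms. The function names and the value of the search radius `r_s` are hypothetical.

```python
import math

def predict_corner(prev_corner, velocity, T):
    """Expected corner location in the new pose frame after the robot moves velocity * T."""
    return (prev_corner[0] - velocity[0] * T, prev_corner[1] - velocity[1] * T)

def find_feature_corner(predicted, candidates, r_s=0.3):
    """Nearest candidate within radius r_s of the prediction, or None if the corner is missing."""
    best, best_d = None, r_s
    for c in candidates:
        d = math.hypot(c[0] - predicted[0], c[1] - predicted[1])
        if d <= best_d:
            best, best_d = c, d
    return best

# Hypothetical frame: the corner was at (3.0, 1.5) and the robot moves at 0.5 m/s along X.
expected = predict_corner((3.0, 1.5), (0.5, 0.0), 0.1)
match = find_feature_corner(expected, [(2.96, 1.49), (0.0, 4.0), (3.4, 1.5)])
missing = find_feature_corner(expected, [(0.0, 4.0)])
```

A `None` result corresponds to the case in which the feature corner is blocked, for example by a passing person, so that another corner has to take over as the MSC.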

Floor Management
There could be only one LiDAR pose frame 0 L for the entire floor to which all LiDAR mesh data will be mapped back. However, the maximum number of quads is a finite number according to the azimuth and number of rings. In case of a complicated space such as the office building, due to several rooms and partitions, it becomes impossible to map all walls of different rooms into one simply connected mesh. Multiple connected meshes are necessary to contain the complete information of different rooms. As for floor management, we would need different room mapping even if there is only one LiDAR pose frame 0 L, called the base pose frame.
As stated in Section 2.2, three proposed conditions are applied as filters to form the LiDAR mesh: (a) the cluster condition, which is used to remove the unwanted information from mirror reflections and from the walls outside of transparent windows, (b) the vertical condition, which removes the passers-by and static furniture data, such as sofas and tables, and (c) the aspect ratio condition, which removes the stairstep surfaces that are not continuous walls. Adjusting the parameters used in the three filtering conditions can improve the adaptation to office buildings of different kinds. The preference stated in Section 2.3 is that the initial location where the IMR is turned on should satisfy the rectangular room condition and the no mirror condition to increase the estimation accuracy of the minimum area method. Since the base pose frame ${}^{0}L$ is the basis of the IMR exploration, it determines the accuracy of the LiDAR mesh map for a room. The IMR navigates in the environment under the assumed conditions that (1) the rotation between LiDAR pose frames is continuous, i.e., it will not fumble, and (2) there is at least one most significant corner that we can use to form the LiDAR pose frame and track the IMR navigation. These assumptions about the environment do not limit the use of e-SLAM; rather, they are the dimensions along which the applicability of e-SLAM can be improved.

Single Room Mapping
After a series of point cloud conversions, multiple points with different $\hat{r}_{n,a}$ but the same $\Psi_n$ and $\alpha_a$ are mapped into the base pose frame ${}^{0}L$. ${}^{0}_{k}\Lambda$ is the LiDAR data point obtained in pose frame ${}^{k}L$ and converted into the LiDAR data point in pose frame ${}^{0}L$. ${}^{0}_{k}\Theta$ is a homogeneous transformation from pose frame ${}^{k}L$ to pose frame ${}^{0}L$, which can be written as follows.
The total LiDAR image can be computed from the average of all range data $\hat{r}_{n,a}$ on the same $\Psi_n$ and $\alpha_a$ as follows.
$N_{n,a}$ denotes the number of non-zero range data ${}^{i}\hat{r}_{n,a}$ from all pose frames ${}^{i}L$. In order to filter away the LiDAR data which come from moving objects such as passers-by, we extract the LiDAR mesh data representing only the walls via the filtering conditions, including the vertical condition and the aspect ratio condition, as stated in Section 2.2. The wall mesh of the same area is repeatedly obtained from different LiDAR pose frames as the IMR navigates, and thus the wall smoothing is achieved when the moving objects are far away.
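The per-ray averaging can be sketched as below, assuming every frame has already been converted into the base pose frame and is stored as a dictionary keyed by the hypothetical index pair (n, a). A range of zero is treated as "no return" and is excluded from the count $N_{n,a}$.

```python
def average_ranges(frames):
    """Average the non-zero ranges per (ring, azimuth) index over all mapped frames."""
    total, count = {}, {}
    for frame in frames:
        for key, r in frame.items():
            if r > 0.0:
                total[key] = total.get(key, 0.0) + r
                count[key] = count.get(key, 0) + 1
    return {key: total[key] / count[key] for key in total}

# Hypothetical data: two frames mapped back to the base pose frame.
frames = [{(0, 0): 2.0, (0, 1): 0.0},
          {(0, 0): 2.2, (0, 1): 3.0}]
mean_image = average_ranges(frames)
```

Rays that never return a valid range are simply absent from the result, which keeps empty directions from being averaged toward zero.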

Room/Corridor Segmentation
The range data of a single ray may be used to represent two walls, as stated in the previous section. There is a need to separate rooms that are partitioned by walls. We also need to label each room with an individual room index, provided that the room adjacency matrix is documented. It may be convenient to set up the IMR travel distance for determining the bounding box ${}^{k}B$; after the bounding box is determined, the room size is fixed. The information that the IMR is crossing the boundary of the box ${}^{k}B$ is then used to determine the wall partitioning the rooms, as shown in Figure 10. When a corridor is partitioned into two rooms, there is no actual wall; instead, a virtual wall partitions the space. As shown in Figure 10, when the IMR is leaving an initial room (blue box) and going forward for further exploration, the data on the left and right sides of the room are LiDAR noise coming from either mirror reflections or transparent windows. The partition wall that provides a border to the new room can filter away that noise and initialize the LiDAR mesh for the new room reached, i.e., a corridor (yellow box) in this case. As shown in Figure 10, when the IMR is turning left into a new room (yellow box), for example, the previous data obtained from the LiDAR visible region shall be assigned to the earlier room (blue box). A virtual partition wall (not physically existing) yields an open space in the mesh data storage for the new room to form its new room boundary.
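The boundary-crossing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Box` class, the `label_rooms` function, and the box coordinates are hypothetical names and values, the bounding boxes are taken as already determined, and the noise filtering and mesh initialization steps are omitted.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned room bounding box in the base pose frame (hypothetical)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, x: float, y: float) -> bool:
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def label_rooms(
    trajectory: List[Tuple[float, float]], boxes: List[Box]
) -> Tuple[List[int], Set[Tuple[int, int]]]:
    """Assign a room index to each IMR position and collect room adjacency.

    The IMR is assumed to start in boxes[0]. The room index changes only
    when the position leaves the current box, i.e. crosses its boundary.
    """
    room = 0
    labels: List[int] = []
    adjacency: Set[Tuple[int, int]] = set()
    for x, y in trajectory:
        if not boxes[room].contains(x, y):
            for k, box in enumerate(boxes):
                if box.contains(x, y):
                    # The crossed boundary acts as the (virtual) partition wall.
                    adjacency.add((min(room, k), max(room, k)))
                    room = k
                    break
        labels.append(room)
    return labels, adjacency


# Illustrative layout: a 10 m x 10 m lab followed by a corridor.
boxes = [Box(0.0, 0.0, 10.0, 10.0), Box(10.0, 0.0, 30.0, 4.0)]
path = [(5.0, 5.0), (9.0, 2.0), (12.0, 2.0), (20.0, 2.0)]
labels, adjacency = label_rooms(path, boxes)
```

Keeping the current index until the position actually leaves the box avoids flickering between rooms when the IMR sits on a shared boundary; the adjacency set plays the role of the room adjacency matrix mentioned in the text.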


[Figure 10. Panel labels: Partition Wall; Virtual Partition Wall.]

Experiment and Comparison
A tadpole model of a tripod IMR is used in the experimental study. A traction in-wheel motor drives the single rear wheel, which is integrated with an electric steering system on top of the steering column, and two passive wheels with 20 × 1.65 tube tires are installed on the front axle. A VLP_16 LiDAR with 16 rings, made by Velodyne Inc., is mounted at the center of the front axle at a total height of 1.65 m from the ground, as shown in Figure 11a. The tripod IMR has a turning radius of 1.2 m and was driven by a person during the tests. An electrical control box encloses a 24-V drive with the EtherCAT communication protocol for steering, a 48-V drive for traction control, a 45-Ah 48-V Li-ion battery for the traction system, and an IMU (Microstrain® 3DM-GX1) [71]. The IMU integrates three angular rate gyros with three orthogonal DC accelerometers; it was used only for later verification, not for localization. The test environment is Engineering Building V on the Kwang-Fu Campus of the National Yang-Ming Chiao-Tung University. It is an I-shaped building with two wings and a connecting section. The experiment was performed on the fifth floor, which has a floor area of around 5000 m²; the top view is shown in Figure 11b. The walls are made of concrete with a layer of white fiber cement on the surface, which has a solar reflectance of 0.40. Four tests were run, moving repeatedly between the hall and the lab.
A simple text file in PTS format is used to store the point data from the LiDAR scanner. The first line gives the number of points to follow. Each subsequent line has seven values: the first three are the (x, y, z) coordinates of the point, the fourth is an "intensity" value, and the last three are the (r, g, b) color estimates. The intensity value estimates the fraction of incident radiation reflected by the surface at that point, where 0 indicates no return and 255 indicates a strong return.
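The PTS layout described above is simple enough to read with a few lines of Python. The sketch below is a minimal reader for a single block of points, not the software used in the experiment; the names `PtsPoint`, `parse_pts`, and `drop_no_return` are hypothetical, and how consecutive LiDAR data frames are delimited inside one file is not specified here, so only one header-plus-points block is handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PtsPoint:
    x: float
    y: float
    z: float
    intensity: int  # 0 = no return, 255 = strong return
    r: int
    g: int
    b: int


def parse_pts(text: str) -> List[PtsPoint]:
    """Parse one PTS block: a point count followed by 7-value lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    count = int(lines[0])
    points: List[PtsPoint] = []
    for ln in lines[1:1 + count]:
        fields = ln.split()
        if len(fields) != 7:
            raise ValueError(f"expected 7 values per point, got {len(fields)}")
        x, y, z = (float(v) for v in fields[:3])
        intensity, r, g, b = (int(float(v)) for v in fields[3:])
        points.append(PtsPoint(x, y, z, intensity, r, g, b))
    if len(points) != count:
        raise ValueError(f"header announced {count} points, found {len(points)}")
    return points


def drop_no_return(points: List[PtsPoint]) -> List[PtsPoint]:
    """Discard points whose intensity of 0 marks a missing return."""
    return [p for p in points if p.intensity > 0]


sample = "2\n1.0 2.0 0.5 128 255 255 255\n0.0 0.0 0.0 0 0 0 0\n"
valid = drop_no_return(parse_pts(sample))
```

Checking the header count against the number of parsed lines catches truncated files early, which is useful when recordings are stopped manually.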
In the lab, there is a dome screen made of wood, which occupies one quarter of the lab space. There is also a mirror on the wall adjacent to the door, which can produce fake feature data by reflecting the laser light. As shown in Figure 11c, the IMR is in its upfront position in the room, facing the door. The angular difference between the LiDAR data and the LiDAR mesh is the initial azimuth angle $\alpha_0$. After the initial azimuth angle is obtained from the proposed method in Equation (15), one can reinstall the LiDAR by manually rotating the LiDAR fixture on the IMR to align the zero-azimuth angle with the upfront direction. Some residual inaccuracy will remain, which needs to be verified through the experiment.
There are two tours: from the lab to the hall and from the hall to the lab. Each tour was performed twice. The course of the experiment is intended to demonstrate the generality of the proposed method as follows.
(1) In the course of the experiment, the IMR goes from a room with a closed door to the corridor after the door is opened. The challenges include the complication of forming the LiDAR mesh when moving from a closed area to an open area, as well as the floor management.
(2) The IMR navigates from a room to the corridor, then to the elevator space where many passengers (moving objects) are present, and then to the lobby, which is an open space. The corridor navigation is the challenging part, since the MSC information is very weak when the walls do not intersect.
(3) The IMR makes two 90-degree turns, one right and one left, between two complete stops, so that the displacement and orientation estimation of the proposed method can be fully tested.
During the tour from the lab to the hall, four most significant corners (MSCs) are found in the lab, more than 10 MSCs are found in the corridor, and many other MSCs are found during the rest of the IMR navigation. One person drives the IMR manually using its handlebar, which includes the throttle and brake. The driver stays as low as possible to avoid blocking the LiDAR rays that hit the upper part of the walls.
We chose only point clouds from positive ring angles $\Psi_n$ to form the LiDAR mesh in the experiment, assuming that the upper portion of a wall carries more wall corner information than the lower portion does. Thus, for the VLP_16, the ring angles $\Psi_n$ = 1, 3, 5, 7, 9, 11, 13, and 15 degrees can be used to form the mesh. Across the experiments, we collected 550 to 750 LiDAR data frames from four different IMR navigations lasting 80 to 120 s. The recorded frame rate is around 5 data frames per second. The total distance accumulated during the navigation is about 45 m. The average speed of the IMR is 2 kph (km/hour), which is a medium speed for indoor mobile robot applications. During the navigation, the IMR turns at two T-junctions and stops when people walk by. The highest speed during the course was 4 kph.
In Figure 12, we demonstrate the tour from the lab to the hall. The dome screen, which covers nearly one quarter of the lab area, provides a counterexample to vertical walls in this room. The mirror beside the entrance door is a source of LiDAR error. The base pose frame ${}^{0}L$ is chosen at the first MSC found when the IMR starts navigating; it can also be programmed to another location, e.g., in a different room, to obtain a better resolution of the entire space. Figure 12a shows two images, the real-time LiDAR data and the computed LiDAR mesh. They almost coincide because the LiDAR azimuth angle $\alpha_0$ was pre-aligned to the front when the IMR started navigating. Such alignment of the LiDAR azimuth angle $\alpha_0$ to the front of the IMR cannot always be assumed in applications with multiple LiDAR systems. Indeed, it is very difficult to align the LiDAR azimuth angle $\alpha_0$ precisely to the front, and the misalignment can instead be calculated during the first minimum bounding box test: the angle between the bounding box and the IMR is the misalignment angle.
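To make the bounding box test concrete, the following Python sketch estimates the misalignment angle by a brute-force search over box orientations and keeps the one with the smallest area. It is a stand-in for illustration and not the paper's Equation (15); the function names, the 0.1-degree step, and the synthetic 6 m × 4 m room are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def box_area(points: List[Point], angle_deg: float) -> float:
    """Area of the bounding box whose axes are rotated by angle_deg."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    us = [x * c + y * s for x, y in points]    # coordinate along the box axis
    vs = [-x * s + y * c for x, y in points]   # coordinate across the box axis
    return (max(us) - min(us)) * (max(vs) - min(vs))


def misalignment_deg(points: List[Point], step_deg: float = 0.1) -> float:
    """Orientation in [-45, 45) degrees that minimizes the box area.

    A rectangle looks the same after a 90-degree turn, so a 90-degree
    search window is sufficient.
    """
    n_steps = int(round(90.0 / step_deg))
    candidates = [-45.0 + k * step_deg for k in range(n_steps)]
    return min(candidates, key=lambda a: box_area(points, a))


def rotated_rectangle(width: float, depth: float, angle_deg: float) -> List[Point]:
    """Corners of a width x depth room rotated by angle_deg (test data)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    corners = [(0.0, 0.0), (width, 0.0), (width, depth), (0.0, depth)]
    return [(x * c - y * s, x * s + y * c) for x, y in corners]


# Synthetic wall points seen by a LiDAR mounted 10 degrees off the room axes.
estimate = misalignment_deg(rotated_rectangle(6.0, 4.0, 10.0))
```

The exhaustive search is slow compared with convex-hull-based methods, but it is easy to verify and only needs to run once, on the first frames.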
In Figure 12b, we demonstrate the circular motion of the IMR. During the circular motion, the IMR barely translates and the LiDAR scanner senses only the rotation. In Figure 12c, when the IMR reaches the T-junction between the hallway and the corridor, the IMR rotates again. Between Figure 12b and Figure 12c, there are many MSC jumps, since the corridor does not have a significant MSC that can be seen by the LiDAR scanner. The MSC may therefore jump to a door frame or to a bulletin box on the wall, which requires the LiDAR mesh image comparison between the room and the current rotated LiDAR mesh data described in the final paragraph of Section 2.8, "Translation Update Scheme based on Most Significant Corners (MSC) Transfer". In our software, the controller of the IMR emits a beep to signal this difficult situation and a possibly erroneous translation update. In practice, other sensors such as GPS and IMU would be available in addition to the LiDAR scanner to help correct the translation. In this experiment, however, we purposely do not use information from other sensors, in order to fully understand the reliability of the e-SLAM method for localizing the IMR. In Figure 12d, when the IMR moves into the elevator room, passengers are waiting for the elevator and one of them enters it, so the elevator door opens and closes. In Figure 12e, the IMR reaches the hall; just before that, a person walking through the hallway encountered the IMR. It can be seen that the floor management separates the entire course into 4 rooms, each adjacent to the next. In the top view of the map produced by e-SLAM, shown in Figure 12f, several observations deserve discussion.
(1) The LiDAR azimuth angle $\alpha_0$ is not zero. From the top view of the LiDAR mesh image, it is found that the final LiDAR data are not perfectly aligned with the map.
(2) The room near the base pose frame ${}^{0}L$ gets better resolution. It can be observed from Figure 12f that the Lab has more green dots, which represent the LiDAR quads, whereas the lobby received fewer dots. This is because the resolution of the LiDAR quad is limited by the distance from the base pose frame ${}^{0}L$. A quad can be one meter high when the distance from the base pose frame ${}^{0}L$ is 30 m and the ring angle resolution is 2 degrees.
[...] precisely the space when it was in the corridor and the hallway. This can cause an even bigger problem when the IMR travels along a long corridor.
The other test is conducted by moving the IMR back to the lab. In Figure 13a, the LiDAR data in the vicinity of the hallway show ring arcs with different wave spacings. The denser ring arcs are the ceiling. The sparser ring arcs are the hallway. The hallway appears to have an inclined floor surface, but the floor is actually flat and parallel to the ceiling, since the wave spacing changes of the floor and the ceiling are the same. Figure 13a also shows that the rectangular bounding box can still be found even in a very complicated space. The base pose frame ${}^{0}L$ is again chosen at the first MSC found when the IMR starts navigating. This time, however, the MSC is a virtual corner formed by the intersection of non-touching walls. In Figure 13b, the IMR travels through the hallway and sees the stair room, and then moves to the T-junction between the hallway and the corridor, as shown in Figure 13c. In Figure 13d, the IMR reaches the door of the Lab and continues further into the Lab.
Figure 13f shows the top view of the map produced by e-SLAM. It can be observed that no room index is successfully assigned to the Lab, because the resolution of the Lab is low when it is far away from the base pose frame ${}^{0}L$. It is also observed that the third room, which includes the hallway, the corridor, and the Lab, cannot be separated from the hallway because no good separation line was found when the IMR moved from the hallway to the corridor. The T-junction occupied by the hallway room prevented the corridor from being formed as a room of its own.
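The loss of resolution far from the base pose frame follows from simple geometry, and the one-meter quad height quoted for a 30 m distance can be checked with a short Python calculation. The sketch assumes a vertical wall and takes the quoted distance as the horizontal range to that wall; `quad_height_m` is a hypothetical helper name.

```python
import math


def quad_height_m(range_m: float, lower_ring_deg: float, upper_ring_deg: float) -> float:
    """Vertical gap on a wall between two adjacent LiDAR rings.

    range_m is the horizontal distance to a vertical wall; each ring hits
    the wall at height range_m * tan(ring elevation angle).
    """
    lower = range_m * math.tan(math.radians(lower_ring_deg))
    upper = range_m * math.tan(math.radians(upper_ring_deg))
    return upper - lower


# Positive VLP_16 ring angles used to form the mesh, in degrees.
rings = [1, 3, 5, 7, 9, 11, 13, 15]

# Gap between every pair of adjacent rings at 30 m.
heights_at_30m = [quad_height_m(30.0, lo, hi) for lo, hi in zip(rings, rings[1:])]
```

At 30 m the lowest pair of rings (1 and 3 degrees) is separated by about 1.05 m on the wall, consistent with the one-meter figure, and the gap grows slightly for the higher rings.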
The other two tests repeat the previous two and are used to compare the rotation and translation readings of the IMR obtained from e-SLAM active localization. Each test was performed with a different navigation speed and different waiting times when encountering moving persons, so the number of data frames varies from 550 in Test#3 to 750 in Test#1. The results are compared in Figure 14. The x-distance from the lobby anchor point to the Lab anchor point is about 30 m, from the lobby anchor point to the T-junction about 22 m, and from the T-junction to the position in front of the door about 11 m. From run to run, the IMR trajectory may differ by 1 m. The IMR was turned around by 180 degrees before each subsequent test. The coordinate systems therefore differ in the x-direction, i.e., the x-direction of the tests from lobby to Lab is opposite to that of the tests from Lab to lobby. The results match the dimensions of the space to within about one meter. The error may be due to the following causes in the control and the analysis.
(1) The trajectory difference: The IMR navigation tests involved real situations, including passers-by, elevator passengers, and environmental conditions such as doors opening and closing. These situations caused the non-holonomic robot control system to detour from the predefined runway, which was recorded in video files that can be replayed for verification. Based on the video replay data, this contributes 50% of the error.
(2) The starting and ending position difference: The starting positions are marked before each test; however, arriving precisely at the same mark as the ending position is difficult because the position is controlled manually using the electrical throttle and brake, as stated before. This difference contributes 30% of the error.
(3) The e-SLAM localization error: Some errors in the localization process were still found in the simulation analysis. These errors appear as a slight shaking of the LiDAR meshes on the screen in the simulation streaming. Based on observation of the simulation streaming, the computation contributes 20% of the error.
The error of Test#4 is the largest among the four tests, which may be caused by the e-SLAM localization error. We may thus conclude that the deviation of the e-SLAM active localization is within 50 cm. This error may need to be compensated by other sensors such as an IMU or high-precision GPS. It may also require precise calibration of the locations through image recognition of fiducial marks. As for the rotation based on the active localization, all four tests first go along a straight line with no rotation, then turn right by 90 degrees, then travel along the corridor in a straight line, then turn left by 90 degrees, and finally go straight.
Since the tests have different numbers of data frames, they cannot be compared at the same time count. The result in Figure 15 shows that there is always an overshoot during rotation, which actually occurred during the IMR navigation. The maximum rotation overshoot is around 20 degrees. Compared with the actual data frames played back as a movie, we observe less than 5 degrees of error in the rotational localization; we therefore conclude that the rotation error is within 5 degrees. The error may be compensated by the gyroscope in a high-precision IMU. The initial azimuth angle $\alpha_0$ does not affect the relative rotation between LiDAR pose frames, which yields the rotation of the IMR. Under the previously mentioned condition that the LiDAR azimuth angle $\alpha_0$ is non-zero, the rotation results shown in Figure 14 have two steady states at precisely 0 and 90 degrees, which are not affected by the LiDAR azimuth angle $\alpha_0$. Hence, further calibration of the initial LiDAR azimuth angle $\alpha_0$ is unnecessary, which reduces the maintenance work.
To show the significance of the proposed system, we also present comparative data obtained from the IMU in Figure 16. The IMU provides Euler angles at 100 Hz and accelerometer data at 350 Hz. The horizontal axis is the data count rather than the actual time. Figures 15 and 16 were collected by different routines at different sampling rates, so they have different data counts. The IMU data were recorded on the tour of Test#4. The rotation angle (blue line) in Figure 16 is coarsely comparable to the Test#4 result (yellow line) shown in Figure 15; however, the IMU rotation data are noisier than the e-SLAM result on the straight-line sections.
The overall travel distance on the tour of Test#4 should be 41 to 42 m according to Figure 12, which differs from the translation of 75 m calculated from the accelerometer result shown in Figure 16. The IMU travel distance error may be due to the calibration of the accelerometer. As a result, the e-SLAM method proposed in this paper is suitable in terms of both position accuracy and rotation stability. Because e-SLAM does not rely on integrating sensor readings, it is free from accumulated error.
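Why an integration-based estimate can drift by tens of meters over a run of this length can be illustrated with a small Python sketch. It double-integrates a constant accelerometer bias for a stationary sensor; the bias of 0.005 m/s² and the 100 s duration are hypothetical values chosen for illustration and are not measurements from the experiment, and the sketch does not claim that bias was the actual cause of the discrepancy. Only the 350 Hz accelerometer rate is taken from the text.

```python
def drift_from_bias(bias_mps2: float, duration_s: float, rate_hz: float) -> float:
    """Position error after double-integrating a constant acceleration bias.

    Uses simple Euler integration at the given sampling rate; the sensor is
    assumed to be stationary, so all computed motion is error.
    """
    dt = 1.0 / rate_hz
    n_samples = int(round(duration_s * rate_hz))
    velocity = 0.0
    position = 0.0
    for _ in range(n_samples):
        velocity += bias_mps2 * dt   # first integration: bias -> velocity error
        position += velocity * dt    # second integration: velocity -> position error
    return position


# Hypothetical bias of 0.005 m/s^2 over a 100 s run sampled at 350 Hz.
drift_m = drift_from_bias(0.005, 100.0, 350.0)
closed_form_m = 0.5 * 0.005 * 100.0 ** 2   # 0.5 * b * t^2
```

The error grows with the square of the elapsed time, so even a small uncalibrated bias yields roughly 25 m of apparent travel in this example, whereas a pose computed directly from wall geometry in each frame does not accumulate error in this way.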
In summary, the e-SLAM includes the procedures of (1) finding the initial rotation using the minimum bounding box method, (2) finding the most significant corner from the walls vertical to the floor based on the histogram analysis, (3) choosing the base LiDAR pose frame and performing the inverse transformation to map all LiDAR data back to the room map, (4) performing the active localization based on the rotation and translation update scheme, (5) utilizing the linear quadratic estimation (LQE) method to perform the localization estimation, (6) utilizing the LiDAR data XY projection image to assist the translation update, (7) performing the room segmentation, and (8) performing the floor management. The flow chart of the e-SLAM processing is shown in Figure 17.

Conclusions
This paper introduces the method of exploration-based SLAM (e-SLAM) for the indoor mobile robot using LiDAR. By mapping with a LiDAR mesh of vertical walls, the e-SLAM method can trim static furniture and moving objects out of the space around the IMR through the conditions for LiDAR mesh formation proposed in Section 2.2. With the help of LQE estimation of the translation and rotation, LiDAR can be the only sensor used for active localization. The computation time is 90 s to process the longest PTS file, which contains 750 LiDAR data frames recorded over 120 s of IMR navigation, on a computer with a Windows 10 operating system, an Intel(R) Core(TM) i5-8500 CPU @ 3.00 GHz, and 4 GB of memory. The software ran as single-threaded executable code. Because the processing time is shorter than the recording time, e-SLAM can achieve real-time processing. The precision of e-SLAM is 50 cm in translation and 5 degrees in rotation. When the most significant corner at the starting position of the IMR is taken as the base pose frame, the 16-ring LiDAR can maintain good active localization up to 35 m away. When the space extends beyond 35 m, multiple base pose frames can be used to increase the resolution during the XY projection image comparison. The experiment was conducted during the office hours of the NYCU engineering building, when many moving objects such as humans and elevator doors were within the LiDAR scanning range. The reliability of the e-SLAM method is thus preliminarily proven. Some problems remain for future work, including a method to handle multiple base pose frames ${}^{0}L$ and a better room segmentation method to handle long corridors. In addition, the translation update accuracy needs to be improved.