Efﬁcient and Safe Robotic Autonomous Environment Exploration Using Integrated Frontier Detection and Multiple Path Evaluation

: Autonomous exploration and remote sensing using robots have gained increasing attention in recent years and aims to maximize information collection regarding the external world without human intervention. However, incomplete frontier detection, an inability to eliminate inefﬁcient frontiers, and incomplete evaluation limit further improvements in autonomous exploration efﬁciency. This article provides a systematic solution for ground mobile robot exploration with high efﬁciency. Firstly, an integrated frontier detection and maintenance method is proposed, which incrementally discovers potential frontiers and achieves incremental maintenance of the safe and informative frontiers by updating the distance map locally. Secondly, we propose a novel multiple paths planning method to generate multiple paths from the robot position to the unexplored frontiers. Then, we use the proposed utility function to select the optimal path and improve its smoothness using an iterative optimization strategy. Ultimately, the model predictive control (MPC) method is applied to track the smooth path. Simulation experiments on typical environments demonstrate that compared with the benchmark methods, the proposed method reduce the path length by 27.07% and the exploration time by 27.09% on average. The real-world experimental results also reveal that our proposed method can achieve complete mapping with fewer repetitive paths.


Introduction
Autonomous exploration enables robots to be capable of actively perceiving the environment, which has played an increasingly important role in various applications, such as monitoring environmental quality [1,2], precision agriculture [3], search and rescue [4][5][6], and open-sea exploration [7,8]. There has been some valuable work on navigation using GNSS [9,10], but in most cases, indoor robots can only rely on their carried sensors for navigation. However, the limited perception range of the sensor and the absence of any prior information about the surrounding environment pose a significant challenge for the robot to make optimal decisions. In addition, the restricted battery capacity of robots makes efficient environmental exploration essential.
Many researchers have proposed various autonomous exploration methods in recent years, which can be divided into frontier-based methods [11][12][13][14], information-based methods [15,16], and hybrid methods [17]. Different types of maps are applied to autonomous exploration, such as methods [11,[18][19][20] based on occupancy grid map, methods [21,22] based on topological maps, and methods [23] based on feature maps. The frontier-based method is intuitive and efficient, and an occupancy grid map can be used for efficient path 1.
We propose an integrated frontier detection and maintenance method. A complete environmental exploration can be achieved by sufficient frontier detection and incrementally maintaining reachable and informative frontiers.

2.
A multiple path generation method is proposed using the wavefront propagation trend of the fast-marching method and a well-designed velocity field to generate safe paths with a good view.

3.
A multi-object utility function is proposed for frontier evaluation to obtain the optimal path, improving exploration efficiency. A path smoothing method with dynamic parameter adjustment improves the smoothness of the optimal path.
We propose a complete framework for autonomous robot exploration in unknown environments and verify the effectiveness and practicability of the proposed framework through sufficient experiments.
The remaining article is organized as follows. In Section 2, we introduce related work in autonomous exploration. In Section 3, we provide the problem statement. Our proposed method is presented in Section 4. In Section 5, we provide the experiments and results both in simulated and real-world environments. In Section 6, we provide the discussion of this work. Lastly, in Section 7, we provide the conclusion of this work and provide future directions.

Frontier Detection Methods
Yamauchi et al. [13] proposed the concept of frontier-based exploration; however, frontier detection requires the processing of the entire map each time-as the map increases, the frontier detection efficiency gradually decreases. Some researchers have worked on the efficient detection of the frontier. Umari et al. [11] innovatively proposed using global Remote Sens. 2021, 13, 4881 3 of 23 and local rapidly exploring random trees [26] to detect frontiers in 2D and 3D scenes. Keidar et al. [12] proposed efficient frontier detection methods, such as the wavefront frontier detector (WFD) and the fast frontier detector (FFD) but did not consider whether the frontier was reachable. Unreachable frontiers will produce invalid paths, waste computing resources, and make exploration inefficient. The work in [19] further considered the extraction of reachable frontiers and only incrementally processed the modified portions of the occupancy to obtain safe and valid frontiers. The safe and reachable frontier detection generator (SRFDG) [27] uses laser data to efficiently obtain two types of fronters and maintains a global topological map. However, due to the uncertainty of the pose during the mapping process, the frontier cannot always be effectively obtained based on laser data alone. At the same time, this method ignores the timely detection of the changes on the map. Different from the above method, which detects frontiers from a single source, our method uses an occupancy grid map and lidar data to obtain frontiers together. The proposed method only deals with a fixed size active area, which reduces the computational burden. Meanwhile, we propose a method of deleting inefficient frontiers.

Decision-Making Methods
Some methods only consider a single evaluation metric. Yamauchi et al. [13] only considered the distance between the frontier and the robot as the evaluation metric, ignoring the different information acquisition between frontiers. Stachniss et al. [20] proposed an evaluation strategy to evaluate each action that trades off the expected information gain cost and reduces localization uncertainty. To prevent the robot becoming trapped in the cubicle areas, Gao et al. [28] added the consideration of the rotation angle to the evaluation function, allowing the robot to explore as far as possible along the current direction. However, this method increases the complexity of adjusting the parameters. Unlike the above methods, we propose a method that considers multiple metrics for evaluating frontiers with only two adjustable parameters, which uses the distance map to efficiently evaluate the impact of different paths on the mapping.

Path-Planning Methods
Fang et al. [29] proposed an improved dynamic window approach [30] with a new trajectory evaluation function to achieve local exploration. Lauri et al. [31] modeled the robot's autonomous exploration as a partially observable Markov decision process (POMDP) and obtained the most informative local trajectory through forward simulation algorithms with an open-loop approximation. Ding et al. [32] proposed to obtain multiple local trajectories and used information entropy to evaluate multiple trajectories. However, the above methods may have had local minimum problems in some complicated scenarios. Therefore, some of the methods solve the local minimum problem by integrating the frontier-based exploration method [31] or re-planning the global path [32]. Bircher et al. [33] used the RRT method to explore and generate a path. However, the randomness of the RRT tree has poor smoothness and hinders the robot's rapid movement. Some advanced improvement methods such as RRT* [34], qRRT [35], and RRdT* [36] improve the path quality of RRT. A class of common methods [11,24,25] uses the A* algorithm for path planning, sometimes resulting in paths close to obstacles. The fast-marching method [37] is local minimum free and complete and has been widely used in path planning [38][39][40]. The method in [41] performs the fast-marching method to obtain a path on the gradually acquired environment map transformed using the logarithm of the extended Voronoi. However, this work ignored ensuring that the ranging sensor collected valid data when generating the path. When the robot executes such a path, it may affect the quality of positioning. We use the Euclidean distance map to generate a reasonable velocity field to avoid the above problem and generate multiple paths for evaluation. Usenko et al. [42] proposed a trajectory optimization method using the uniform B-spline. Zhou et al. [43] further improved the optimization efficiency. Inspired by the above methods, to facilitate Remote Sens. 2021, 13, 4881 4 of 23 the tracking of non-holonomic constrained robots, we propose a path optimization method with dynamic parameter adjustment to improve the smoothness of the path.
According to the characteristics of our proposed work, we give a comparison with related works in Table 1.

Problem Statement
We first list the common symbols and meanings used in Table 2, then give the problem statement.

M
Occupancy grid map.
f Single frontier.

F w
The frontier warehouse is used to store the frontiers obtained in each exploration cycle.    θ Max The maximum angular deviation that can be followed.
k P , k ϕ , k I The coefficient in utility function, where k ϕ is only related to the physical characteristics of the robot.

D l
Lidar max range.  The confidence range ratio.
T, τ T represents the path which is composed of waypoint τ.

d(·)
Distance function, obtain the distance between any position on the map and the obstacle.
Define S ⊂ R 2 as the environment to be explored. We used the occupancy grid map M to model the 2D environment S. M consists of many grids s i , i ∈ [1, N] and N is the number of grids. The value of s i , i ∈ [1, N] indicates the probability of being occupied. Before the exploration, all the grids are marked as S unknown . During continuous exploration, an increasing number of grids marked S known begin to appear, and S known is divided into occupied grids and free grids, S known = S occupied , S f ree . In this article, the discrete grids located in the frontier area are referred to as frontiers.
Problem: Given occupancy grid map M find the next most promising frontier f with the optimal path T * which is followed by the robot.
The problem is solved repetitively at every exploration cycle. Use M to detect the unexplored frontiers and calculate the corresponding paths to those frontiers. Find the next promising fronter f using the utility function. The robot moves along the optimal path T * attached to f , and updates M with the up-to-date sensor reading online, such as 2D lidar, constantly changing S unknown to S known . The newly generated M is used for the next exploration cycle until the map of the entire environment is completed.

Method Overview
The autonomous exploration framework consists of the frontier processing module, the path planning and evaluation module, and the path tracking module, as shown in Figure 1. The robot provides lidar and odometry data and responds to control commands. The SLAM module simultaneously generates the occupancy grid map M and localizes the robot. The blue rectangles represent the crucial functions. The green dashed rectangles represent the information to be transmitted. The black rectangles represent the external inputs of the system. Algorithm 1 shows the process of autonomous exploration. Each module can interact with external inputs, such as lidar data, localization, and occupancy grid map  . The execution of the exploration cycle is awakened through a timing trigger and a re-planning trigger (line 1). The timing trigger mechanism can adapt to environmental changes. The re-planning trigger mechanism continuously detects the safety of the path and the information acquisition of the target to reduce unnecessary forward distance and ensure that execution of the exploration cycle is awakened through a timing trigger and a re-planning trigger (line 1). The timing trigger mechanism can adapt to environmental changes. The replanning trigger mechanism continuously detects the safety of the path and the information acquisition of the target to reduce unnecessary forward distance and ensure that the path is collision-free. The Bayesian filter and log-odds ratio formulation are used to update the grid map. The Euclidean distance map can obtain the distance between any position and the obstacle on the map, which can be used for autonomous robot exploration. The updated M and a dynamic variant of the brushfire algorithm [44] are used to obtain the Euclidean distance map, where the unknown and free areas are regarded as free (line 2). The frontier detection and maintenance method incremental acquires and maintains frontiers for the subsequent modules (line 3). Then, multiple paths planning and evaluation generates the OptimalPath by evaluating multiple paths (line 7). The OptimalPath is further processed to improve its smoothness (line 8). Finally, the path is tracked by the robot, and the data observed by the robot are also updated (line 9). When there is no effective frontier, the exploration is complete (lines 4-6).

Frontier Detection and Maintenance
As shown in Algorithm 2, this method inputs updated M, robot localization, lidar data, and outputs frontiers set F for path planning and evaluation. F o stores the frontiers obtained from the active area of M and F l stores frontiers acquired using lidar data. The frontier warehouse F w incrementally stores the frontiers obtained during the exploration process and records all the acquired but unexplored areas. Using the lidar detection range, M, and robot location, we can obtain the frontier F o by searching the active area (lines 2,3).
However, some frontiers are too close to the robot, as shown by the red dots in Figure 2b. If these frontiers are selected as targets, the efficiency of robot exploration will decrease. The method of connecting the robot to all frontiers within the scanning range of lidar and performing collision detection is computationally intensive and may remove frontiers that facilitate positioning. The distance map can be updated efficiently by only updating affected parts without updating the entire map. Based on the above considerations, we take advantage of the local update feature of the dynamic distance map to remove inefficient frontiers. The local distance map can be changed by setting the robot as a virtual obstacle, as shown in Figure 2c. The inefficient frontiers within the lidar range can be eliminated by querying whether the nearest obstacle of those frontiers is the robot's position. In addition, the current F o and F w obtained in the previous cycle are processed to remove inefficient and unreachable frontiers (lines 4-7). To improve the efficiency, we use the ray-casting method to simulate lidar to sample the environment and use the Breshmen method [45] to obtain the corresponding grids. When the number of unknown grids around a frontier is less than the threshold, it is considered that there is too little information.

Algorithm 2. Frontier Detection and Maintenance.
Input: M, RobotLocation, LidarData, F w , DistanceMap Output: SetObstacle (RobotLocation, DistanceMap); 5 for . To improve the efficiency, we use the ray-casting method to simulate lidar to sample the environment and use the Breshmen method [45] to obtain the corresponding grids. When the number of unknown grids around a frontier is less than the threshold, it is considered that there is too little information. The lidar data of undetected obstacles are processed to obtain potentially accessible frontiers to accelerate the autonomous exploration of the environment, as shown in Figure  3. The green dotted curve in Figure 3a shows the potential unexplored area  AB which presents the end of the continuous and collision-free laser beam. First, whether the midline of the acquired area  AB can make the robot collision-free is determined. If the above The lidar data of undetected obstacles are processed to obtain potentially accessible frontiers to accelerate the autonomous exploration of the environment, as shown in Figure 3.
The green dotted curve in Figure 3a shows the potential unexplored area AB which presents the end of the continuous and collision-free laser beam. First, whether the midline of the acquired area AB can make the robot collision-free is determined. If the above condition is satisfied, further analysis of the area AB can be performed. After the above processing, if there are unexplored frontiers, we can use the shift clustering method [46] to remove redundant frontiers as it has only one para (bandwidth) and the parameter has a clear physical meaning (line 17). However, the mean-shift method may cause the frontier to be unreachable or contain less mation after clustering. The frontier with the smallest Euclidean distance from the c center in a cluster of frontiers is selected as the output to solve this problem. This m preserves the clustering characteristics as much as possible and ensures that the after the clustering are informative and collision-free. The frontier set F is used fo sequent path planning and evaluation, and w F is updated to achieve complete ma (line 18).

Multiple Paths Planning and Evaluation
The process of multiple paths planning and evaluation is divided into three We use the fast-marching method with a well-designed velocity field to generate mu paths from the robot position to the unexplored frontiers. Then, we evaluate those to obtain the optimal path, which corresponds to the most promising frontier usi proposed utility function. Finally, the smoothness of the path is further improved.

Multiple Path Generation Using Fast Marching
Fast marching is a numerical method for simulating the spread of a wavefront can be represented by the Eikonal differential equation [27]: We can further extract the safe and feasible area A B , as shown in the green line. The red circle indicates the robot's footprint, and d 0 is slightly larger than the robot radius r, and d 1 = 2d 0 . The α 0 is the angular resolution of the lidar. The n is the continuous and noncollision laser beam, and the D l is the maximum detection range of lidar. Using Equation (1) to process the acquired area AB, we can obtain m uniformly distributed frontiers. Then, we can calculate the position of these frontiers in the lidar coordinate system and transform these frontiers into the map coordinate system. Finally, we can keep the information-rich frontiers in F l (lines 8-11).
After the above processing, if there are unexplored frontiers, we can use the meanshift clustering method [46] to remove redundant frontiers as it has only one parameter (bandwidth) and the parameter has a clear physical meaning (line 17). However, using the mean-shift method may cause the frontier to be unreachable or contain less information after clustering. The frontier with the smallest Euclidean distance from the cluster center in a cluster of frontiers is selected as the output to solve this problem. This method preserves the clustering characteristics as much as possible and ensures that the points after the clustering are informative and collision-free. The frontier set F is used for subsequent path planning and evaluation, and F w is updated to achieve complete mapping (line 18).

Multiple Paths Planning and Evaluation
The process of multiple paths planning and evaluation is divided into three parts. We use the fast-marching method with a well-designed velocity field to generate multiple paths from the robot position to the unexplored frontiers. Then, we evaluate those paths to obtain the optimal path, which corresponds to the most promising frontier using the proposed utility function. Finally, the smoothness of the path is further improved.

Multiple Path Generation Using Fast Marching
Fast marching is a numerical method for simulating the spread of a wavefront which can be represented by the Eikonal differential equation [27]: The left side of the equation depicts the function of the arrival time and f (x) represents the velocity of different positions x. The velocity function f (x) determines the propagation velocity at different positions and is the key to the fast-marching method. When f > 0, the wavefront always spreads outward, and the wavefront will only pass through each grid on the map once. We used the two characteristics of the fast-marching method for multiple paths planning.
Characteristic 1: Design an appropriate velocity field to generate a path with good clearance.
The distance map can be transformed into a velocity field using Equation (3). The d(x) is the distance corresponding to position x, and the D l is the maximum detection range of lidar. If the d(x) of a location x on the map is greater than D l or less than the robot radius r, the speed is set to 0. The velocity function ensures the wavefront will not extend to these locations during the path generation process. By simulating a wave expanded from the start point, we can obtain the arrival time of each point on the map. The closer a certain position x to an obstacle, the lower the corresponding wavefront spread speed of the position will be generated, which leads to a later arrival time at the position. We can trace the path from the target point to the starting point along the descending direction of the arrival time gradient and obtain the path with the minimal arrival time. The path is also away from obstacles and ensures that lidar can extract the environmental features to facilitate positioning.
Characteristic 2: The wavefront propagates from near to far to generate multiple paths. The robot is used as the starting point when performing fast marching. When the far frontier is selected as the target, the frontiers near the robot will often be covered as the wavefront expands. Coincidentally, from the perspective of exploration, the robot usually explores from near to far to avoid repeatedly visiting a certain area. Nevertheless, as the map becomes larger, more computing resources will be consumed. Moreover, too many paths to be evaluated also put pressure on the evaluation module. To reduce the computational burden, we formulate the following rules.
1. Select t frontiers closest to the robot in the set F as the set F d , which is the farthest from the robot, as the target point.
3. If multiple frontiers satisfy rule 2, select the closest one to the obstacle as the target point. Finally, frontiers with valid paths are stored in set F s , F s ⊆ F d . Figure 4 shows the process of generating multiple paths. The fast-marching method is executed using the generated velocity field and the target point selected using the above rules. The fast-marching method only expands once to obtain multiple paths, with the time complexity of O(Nlog(N)) and N is the number of free discretization grids on the map. In Figure 4, multiple paths connecting with unexplored frontiers are represented by yellow curves. Proof of Theorem 1. After clustering, all frontiers are located in different grids of map, and there are no overlapping frontiers. The fast-marching method has been proven to be complete [47]. Considering all frontiers in d F , when a frontier is within the extended range of the wave, the path from the frontier to the starting point can be found. Otherwise, there is no path with a frontier that is out of range of the wave. , and the attached path is regarded as the optimal path and represented by * T . The utility function considers information gain, path cost, consistency of robot movement, and valid lidar data acquisition. Therefore, each candidate frontier [1, ] i n ∈ has three attributes: L represents lidar data acquisition cost, I represents the information gain, and P represents the path cost.
1. Lidar data acquisition cost L

Theorem 1.
Each frontier f , f ∈ F s has a valid path.
Proof of Theorem 1. After clustering, all frontiers are located in different grids of map, and there are no overlapping frontiers. The fast-marching method has been proven to be complete [47]. Considering all frontiers in F d , when a frontier is within the extended range of the wave, the path from the frontier to the starting point can be found. Otherwise, there is no path with a frontier that is out of range of the wave. The frontier f , f ∈ F d with a valid path is stored in set F s , ensuring each f , f ∈ F s has a valid path.

Path Evaluation
After path planning, we can obtain the set F s , where m is the number of waypoints. We put forward a utility function to find the most promising frontier f i , f i ∈ F s and i ∈ [1, n], and the attached path is regarded as the optimal path and represented by T * . The utility function considers information gain, path cost, consistency of robot movement, and valid lidar data acquisition. Therefore, each candidate frontier f i {L i , I i , P i }, i ∈ [1, n] has three attributes: L represents lidar data acquisition cost, I represents the information gain, and P represents the path cost.
1. Lidar data acquisition cost L For f i , f i ∈ F s and i ∈ [1, n], we use Equation (4) combined with the distance function d(·) to evaluate the lidar data acquisition quality of the attached path T.
When the distance between the waypoint τ j , j ∈ [1, m] and the obstacle is less than µ · D l , assign a constant value; otherwise, the value will decrease rapidly as the distance increases. µ is set by the user representing the confidence range ratio of the lidar. Take the average of the values of all waypoints to obtain the lidar data acquisition cost L.

Information gain I
The entropy H is an effective tool to describe the uncertainty of the map [20]. The process of exploration is a process of continuously reducing map uncertainty H.
Mutual information I is used to represent the reduced uncertainty after the robot executes a certain path T where p(s i ) represents the occupancy probability of grid s i , i ∈ [1, N] and N is the number of grids.
To improve computational efficiency, the information gain I is only processed at the target point. Furthermore, the ray-casting and Breshmen methods are used to obtain the corresponding grid and calculate the corresponding information gain.
3. Path cost P We can calculate the path length cost P from path T. In addition, in order to maintain the consistency of the motion and avoid back-and-forth maneuver, we calculate the angle θ between the initial part of the path and the orientation ϕ of the robot.
This item has less influence on the exploration process than other items. We design the measurement of the consistency of the motion as a coefficient multiplied by the path cost P. This avoids adding new items to the utility function. When the initial path and the robot direction vector are greater than θ Max , we set the coefficient k ϕ to 1; otherwise, it is set to the user set value (less than 1). This can ensure that even a slightly longer path with an appropriate angle can be selected to ensure the continuity of movement and improve the efficiency of exploration. The value of θ Max considers the actual tracking condition of the robot. Equation (8) gives the complete utility function. The coefficient k ϕ is only related to the physical characteristics of the robot. We need only adjust the coefficients k I and k P to realize efficient exploration of different environments. The information gain I and the path cost P are normalized to remove their unit.
Each time the robot chooses the frontier f i ∈ F s , i ∈ [1, n], with maximum U as the target region to be explored. The selected target region is more informative and near the robot. At the same time, the attached T * of f i can promote the consistency of robot movement and confirm the lidar obtain effective data.

Path Smoothing and Tracking
The path T * generated by the fast-marching method is affected by the environment, and the smoothness of the path needs to be improved. An optimization method of dynamically adjusting parameters is applied to smooth the path to facilitate the further tracking of the robot.
Considering that the B-spline has some characteristics, such as the fact that its derivative is still B-spline, and control points are located inside the convex hull, we use B-spline to represent the path and further optimize its control points. The N+1 control points {P 0 , P 1 , · · · , P N }, P i ∈ R 2 , can define a λth B-spline and knot vector [t 0 , , and a uniform B-spline with each knot span has identical value ∆t. We can obtain the position at the corresponding parameter t using Equation (10) and t is normalized as . M p+1 is the constant matrix determined by λ [48].
Algorithm 3 shows the path optimization process. The function GenerateBspline converts the waypoints into uniform B-spline control points by down-sampling the waypoints while maintaining the waypoints with larger curvature. Considering the improvement of the calculation efficiency, we chose the cubic uniform B-spline, λ = 3. Then, to not change the direction of the initial part of the path, we ensure that the B-spline connects the starting point and the target point of the path. The initial part of the waypoints τ 1 , τ 2 and the target point τ m are repeated λ times and added to the control points {P i }, i ∈ [0, N]. Due to the convex hull nature of the B-spline curve, the obtained B-spline is close to the original path T * .
representing the control points of the first derivative and the control points of the second derivative, respectively.
For a uniform B-spline, the knot span has an identical value, and we only use its numerator of A i , i ∈ [0, N − 2] to measure the smoothness of the B-spline. The BsplinePathOptimization function realizes the optimization of control points. We denote the optimized control points as P o i , i ∈ [0, N]. The Euclidean distance between the P o i and initial control points {P i }, i ∈ [0, N] is used to measure their closeness. Combining the above two items, we can obtain Equation (13).
Use the function PathCheck to detect whether the optimized path has a collision risk or the making robot obtains little environmental information. If the above problems occur, the parameter ϕ 2 is adjusted to make the optimized path close to the original path. After a limited number of adjustments, we can obtain a safe and smooth path T * s to observe environmental information. The optimal path T * is represented by the green line, and the rose-red line is the final smooth path T * s , as shown in Figure 5. . The Euclidean distance between th and initial control points is used to measure their closeness. Combin above two items, we can obtain Equation (13).
Use the function PathCheck to detect whether the optimized path has a collisi or the making robot obtains little environmental information. If the above problems the parameter 2 ϕ is adjusted to make the optimized path close to the original path a limited number of adjustments, we can obtain a safe and smooth path * s T to o environmental information. The optimal path * T is represented by the green lin the rose-red line is the final smooth path * s T , as shown in Figure 5. To achieve a smoother motion than the reactive local planning method, we u nonlinear MPC method [49] to implement the path following of the differential To achieve a smoother motion than the reactive local planning method, we use the nonlinear MPC method [49] to implement the path following of the differential robot. When the robot is close to the target point, we can design the robot's reference speed using a trapezoidal acceleration and deceleration algorithm. Otherwise, the robot's reference speed will be set to its maximum speed. When the angle between the robot's current orientation and the initial part of the path is larger than θ Max , the robot stops and adjusts the orientation to ensure accurate tracking of the path.

Experiment Setup
To verify the feasibility and superiority of our proposed autonomous exploration method, we conducted experiments with RRT-exploration [11] and nearest frontier [13] methods in three common building environments, as shown in Figure 6.
When the robot is close to the target point, we can design the robot's reference speed using a trapezoidal acceleration and deceleration algorithm. Otherwise, the robot's reference speed will be set to its maximum speed. When the angle between the robot's current orientation and the initial part of the path is larger than Max θ , the robot stops and adjusts the orientation to ensure accurate tracking of the path.

Experiment Setup
To verify the feasibility and superiority of our proposed autonomous exploration method, we conducted experiments with RRT-exploration [11] and nearest frontier [13] methods in three common building environments, as shown in Figure 6. RRT-exploration: An advanced exploration method using RRT to detect frontiers has attracted more and more attention from researchers. According to the experimental part of RRT-exploration [11], when the parameter Geta is 4 or 6, the exploration efficiency is better than other parameters, so we use these two different parameters for experimental comparison. We use the author's open-source code [11] for experimental comparison.
Nearest frontier: A traditional and widely used method. To ensure the fairness of the experiment and verify the effectiveness of the proposed multi-objective utility function, the method is implemented by ourselves. The nearest frontier method finds the nearest frontier to the robot every time it makes a decision, and the rest of the modules are the same as the proposed method.
We use a laptop with the specifications Intel i5-8250U CPU, 8G RAM, and 240G ROM for all experimental comparisons. All code is implemented using C++ with ROS Kinetic release on Ubuntu 16.04. We use the stage simulator [50] to build the experimental environment with a modification that, when the lidar does not detect any obstacle, it returns infinity instead of 0. The robot we use in the simulator and subsequent experiments is TurtleBot2. The robot's maximum speed is set to 0.3 m/s, the maximum rotation speed is set to 0.9rad/s, and the origin rotation speed is set to 0.6 rad/s. The mapping method use gmapping [51] with adjusted parameters, and the mapping resolution is set to 0.1 m. Experiments with different lidars are carried out to verify the algorithm's effectiveness, and Table 3 shows the parameters of the different lidars.
The parameters used by the proposed method are as follows. The bandwidth of the mean-shift is set to 0.3. The number of d F is set to 20. For the path optimization, we set 1 ϕ = 0.2, 2 ϕ = 10 and MaxOptTimes = 10. The μ is set to 0.9. We compare the two sets of parameters of the proposed method as proposed ( P k = 1, I k = 1) and proposed ( P k = 2, I k = 1), and k ϕ = 0.8. Max θ is set to / 3 π . RRT-exploration: An advanced exploration method using RRT to detect frontiers has attracted more and more attention from researchers. According to the experimental part of RRT-exploration [11], when the parameter Geta is 4 or 6, the exploration efficiency is better than other parameters, so we use these two different parameters for experimental comparison. We use the author's open-source code [11] for experimental comparison.
Nearest frontier: A traditional and widely used method. To ensure the fairness of the experiment and verify the effectiveness of the proposed multi-objective utility function, the method is implemented by ourselves. The nearest frontier method finds the nearest frontier to the robot every time it makes a decision, and the rest of the modules are the same as the proposed method.
We use a laptop with the specifications Intel i5-8250U CPU, 8G RAM, and 240G ROM for all experimental comparisons. All code is implemented using C++ with ROS Kinetic release on Ubuntu 16.04. We use the stage simulator [50] to build the experimental environment with a modification that, when the lidar does not detect any obstacle, it returns infinity instead of 0. The robot we use in the simulator and subsequent experiments is TurtleBot2. The robot's maximum speed is set to 0.3 m/s, the maximum rotation speed is set to 0.9rad/s, and the origin rotation speed is set to 0.6 rad/s. The mapping method use gmapping [51] with adjusted parameters, and the mapping resolution is set to 0.1 m. Experiments with different lidars are carried out to verify the algorithm's effectiveness, and Table 3 shows the parameters of the different lidars.  The parameters used by the proposed method are as follows. The bandwidth of the mean-shift is set to 0.3. The number of F d is set to 20. For the path optimization, we set ϕ 1 = 0.2, ϕ 2 = 10 and MaxOptTimes = 10. The µ is set to 0.9. We compare the two sets of parameters of the proposed method as proposed (k P = 1, k I = 1) and proposed (k P = 2, k I = 1), and k ϕ = 0.8. θ Max is set to π/3. Figure 7 demonstrates the autonomous exploration process of the robot in the lab environment. The three functional modules in the proposed autonomous exploration framework drives the robot to continuously explore the unknown environment in each exploration cycle. The frontiers processing module detects frontiers based on M and lidar data and uses an improved mean-shift method for clustering. The cyan-colored square represents the frontiers obtained from the frontier processing module. The improved clustering method prevents the occurrence of the situation where the target point is an obstacle. The path planning and evaluation module plans multiple paths connecting the unexplored frontiers represented by a yellow curve. These paths are safe and have a better view. The utility function obtains the optimal path T * by considering multiple metrics represented by a green curve. The complete and timely generation of frontiers and the utility function that considers the consistency of motion can avoid the robot's repeated exploration of an area. The rose-red curve represents the path T * s , and the smoothness of the optimal path is greatly improved. The path tracking module controls the robot to accurately follow the path and continuously move toward the unexplored area, gradually completing the exploration of the environment. As shown in Figure 7, we used a red arrow to present the robot's location and orientation. The "star pattern" means that the robot turns in place. Due to the consideration of the consistency of the robot's motion and timely frontier detection, our proposed method rarely cause the robot to turn in place unless the area in front of the robot has been explored.

Performance Comparison and Result
In this part, we analyze the performance of RRT-exploration (Geta = 4), RRT-exploration (Geta = 6), nearest frontier, the proposed method (k P = 2, k I = 1) and the proposed method (k P = 1, k I = 1) in different environments. As each method with different parameters runs 10 times in each environment, we conducted 150 experiments in three environments. We use Proposed (2,1) to represent our proposed method in which the coefficient k P in the utility function is assigned the value of 2, and the coefficient k I is assigned the value of 1.
Experimental data of different methods with different lidar in three environments are shown in Table 3. The optimal data in each item are marked in bold. We show the exploration process close to the average level in each set of experiments in Figure 8. Figure 9 shows the exploration efficiency of the corresponding exploration process in Figure 8.
represented by a green curve. The complete and timely generation of frontiers and the utility function that considers the consistency of motion can avoid the robot's repeated exploration of an area. The rose-red curve represents the path * s T , and the smoothness of the optimal path is greatly improved. The path tracking module controls the robot to accurately follow the path and continuously move toward the unexplored area, gradually completing the exploration of the environment. As shown in Figure 7, we used a red arrow to present the robot's location and orientation. The "star pattern" means that the robot turns in place. Due to the consideration of the consistency of the robot's motion and timely frontier detection, our proposed method rarely cause the robot to turn in place unless the area in front of the robot has been explored.

Performance Comparison and Result
In this part, we analyze the performance of RRT-exploration (Geta = 4), RRT-exploration (Geta=6), nearest frontier, the proposed method ( P k = 2, I k = 1) and the proposed method ( P k = 1, I k = 1) in different environments. As each method with different parameters runs 10 times in each environment, we conducted 150 experiments in three environments. We use Proposed (2,1) to represent our proposed method in which the coefficient   From the data in Table 3, compared with the benchmark methods, our proposed method reduces the path length by 27.07% (25.5 m) and the exploration time by 27.09% (119.6 s) on average. Moreover, the minor standard deviation indicates that the proposed method is more stable and less affected by the environment. The proposed method with parameter (k P = 2, k I = 1) is more inclined to choose a shorter path in the exploration process. It reduced the path length by 27.57% (25.9 m) and the exploration time by 25.24% (111.4 s) on average compared with the benchmark methods. The proposed method with parameter (k P = 1, k I = 1) is more inclined to obtain environmental information to achieve faster exploration. It reduces the path length by 26.58% (25.0 m) and the exploration time by 28.95% (127.8 s) on average compared with the benchmark methods. Furthermore, to validate the performance of the proposed planning method, we conducted comparative experiments with the advanced optimal path planning method RRT* [34] in a 27 m × 24 m environment with walls and random obstacles. To improve the quality and efficiency of the RRT*, we set the step length to 3 m, and the optimized scope is set to 8 m, considering 10% of the sampling as the target point. The parameters of our path planning method are the same as previously mentioned. We set the planning time limit to 5 s. We conducted four sets of experiments, each with 100 trials, and the results areaveraged as shown in Table 4. The optimal data in each item are marked in bold. The first three groups have a fixed starting point and a fixed target point, and the last group randomly generated 100 effective target points. Figure 10 shows the performance of the different methods. As shown in Table 4, the planning time of our proposed method is shorter than RRT*, and the performance in several sets of experiments is less affected by the environment. Compared with RRT*, the path planning method we proposed has a 2% longer path distance, but the clearance is at least 20% better. In addition, our method satisfied 2 C conti- As mentioned earlier, the nearest frontier method we implemented differs from the proposed method only in the evaluation of the next most promising frontier. Compared with the nearest frontier method in the three environments, our proposed method reduces the path length by 10.68% (8.2 m) and the exploration time by 22.65% (94.3 s) on average, which proves the effectiveness of the proposed utility function. The nearest frontier method ignores information acquisition in the exploration process, which lead to inefficient exploration. For example, the average exploration time of the nearest frontier method in the lab environment is longer than other methods. The RRT method uses Euclidean distance instead of the actual path distance, hindering accurate decision making in complex environments. In the office environment, the narrow gates are not conducive to the expansion of RRT branches. This leads to inefficient frontier detection of the RRT-exploration method, resulting in unnecessary repetitive paths and excessive energy consumption. As shown in Figure 8, the RRT method repeatedly explores the same area in the laboratory and office environments.
The integrated frontiers detection and maintenance method ensures the storage of unexplored areas. A multiple path generation method generates multiple high-quality paths leading to unexplored areas. Then, the proposed utility function evaluates multiple paths, which balances the information acquisition, path length, and the consistency of movement. Finally, the proposed autonomous exploration method makes the robot avoid multiple explorations of one area, as shown in Figure 8. The monitoring of safety and information acquisition of the execution path further reduces invalid exploration path. Figure 9 shows that due to the frontier's sufficient detection and maintenance of the environment, comprehensive multiple paths evaluation, and path smoothing, the proposed method with different parameters can achieve faster exploration than other methods. Furthermore, the proposed method with parameters k P = 1 and k I = 1 makes it easier for the robot to explore the unknown environment, improving the exploration efficiency. Experiments in different environments show that our proposed method has a more stable performance. Compared with other methods, our proposed autonomous exploration method can reduce the uncertainty of the map using fewer path costs and in a relatively shorter time.
Furthermore, to validate the performance of the proposed planning method, we conducted comparative experiments with the advanced optimal path planning method RRT* [34] in a 27 m × 24 m environment with walls and random obstacles. To improve the quality and efficiency of the RRT*, we set the step length to 3 m, and the optimized scope is set to 8 m, considering 10% of the sampling as the target point. The parameters of our path planning method are the same as previously mentioned. We set the planning time limit to 5 s. We conducted four sets of experiments, each with 100 trials, and the results areaveraged as shown in Table 4. The optimal data in each item are marked in bold. The first three groups have a fixed starting point and a fixed target point, and the last group randomly generated 100 effective target points. Figure 10 shows the performance of the different methods. nuity, which means better smoothness than RRT*. The performance of the method in Figure 10 also proves that our method is smoother and safer. The comparative experiments with RRT* show that our proposed planning method can quickly generate a safe and smooth path.

Real-World Experiments
To further verify the practicability and effectiveness of the proposed method, we implement multiple experiments under different parameters in a common corridor environment with many regular and random obstacles placed in the corridor. Figure 11 shows the corridor for the experiment and the equipment for autonomous exploration. The corridor contains three compartments, and the length and width are 25.6 m and 10.3 m, respectively.
We use the turtlebot2 robot, and a 2D lidar is installed on the base with a detection As shown in Table 4, the planning time of our proposed method is shorter than RRT*, and the performance in several sets of experiments is less affected by the environment. Compared with RRT*, the path planning method we proposed has a 2% longer path distance, but the clearance is at least 20% better. In addition, our method satisfied C 2 continuity, which means better smoothness than RRT*. The performance of the method in Figure 10 also proves that our method is smoother and safer. The comparative experiments with RRT* show that our proposed planning method can quickly generate a safe and smooth path.

Real-World Experiments
To further verify the practicability and effectiveness of the proposed method, we implement multiple experiments under different parameters in a common corridor environment with many regular and random obstacles placed in the corridor. Figure 11 shows the corridor for the experiment and the equipment for autonomous exploration. The corridor contains three compartments, and the length and width are 25. We conducted three experiments on the parameters ( P k = 2, I k = 1) and ( P k = 1, = 1), respectively. Table 5 shows the experimental data. The completeness of the mappi is obtained by comparing the area of the map created by a human-operated robot. The results are shown in Table 5 and the optimal data in each item are marked bold. Table 5 shows that the proposed autonomous exploration method with differe parameters can achieve complete mapping, proving the effectiveness of the propos frontier detection and maintenance method. The complete and timely frontier detecti and the utility function-considering the consistency of motion-avoid the problem We use the turtlebot2 robot, and a 2D lidar is installed on the base with a detection range of 5 m. Due to the occlusion of the metal brackets, the lidar's field of view is less than 360 degrees. We use the same laptop to complete all calculations. The experimental parameters are consistent with the simulation environment. In each test, the starting point indicated by the green dot is shown on the left side of Figure 11.
We conducted three experiments on the parameters (k P = 2, k I = 1) and (k P = 1, k I = 1), respectively. Table 5 shows the experimental data. The completeness of the mapping is obtained by comparing the area of the map created by a human-operated robot.
The results are shown in Table 5 and the optimal data in each item are marked in bold. Table 5 shows that the proposed autonomous exploration method with different parameters can achieve complete mapping, proving the effectiveness of the proposed frontier detection and maintenance method. The complete and timely frontier detection and the utility function-considering the consistency of motion-avoid the problem of the robot repeatedly entering the same area during the exploration process. The proposed path planning algorithm combined with efficient path detection can plan multiple collision-free paths in time for evaluation, ensuring the robot's safety during the movement. Using Equation (12) to measure the smoothness of the path, we show the ratio of the smoothed path T * s to the original path T * in the smoothness comparison item of Table 5. The results show that the path smoothing method further improves the smoothness of the path. The path tracking method enables the robot to follow the path accurately and smoothly. The exploration process that produces the shortest path in each set of experiments is shown in Figure 12. In this corridor environment, the exploration experiment with parameters (k P = 1, k I = 1) tends to plan a longer path and complete the exploration in a shorter time and smaller path cost. Although there are more uncertainties in the real environment, the performance of our method is similar to simulation experiments, which proves the effectiveness of the proposed method. The video submitted with the manuscript shows our experimental process (see Supplementary Materials). The exploration process that produces the shortest path in each set of experiments is shown in Figure 12. In this corridor environment, the exploration experiment with parameters ( P k = 1, I k = 1) tends to plan a longer path and complete the exploration in a shorter time and smaller path cost. Although there are more uncertainties in the real environment, the performance of our method is similar to simulation experiments, which proves the effectiveness of the proposed method. The video submitted with the manuscript shows our experimental process(see Supplementary Materials).

Discussion
We propose a complete framework for autonomous robot exploration in unknown environments. Simulation experiments on typical environments demonstrate that compared with the benchmark methods, our framework reduces the path length by 27.07% and the exploration time by 27.09% on average.
(1) Detecting frontiers only based on the map often ignores some exploration areas, especially in environments with long corridors and narrow doors. Our proposed integrated frontier detection and maintenance method can detect frontiers based on map and lidar data. A complete environmental exploration can be achieved by sufficient frontier detection and incrementally maintaining reachable and informative frontiers. In the office environment, there is 47.62% less path length and 38.86% less exploration time than RRTexploration.
(2) The proposed utility function evaluates multiple paths, balancing the lidar data quality, information acquisition, path length, and the consistency of movement. Combining the frontier detection and maintenance method the proposed utility function can avoid the robot's repeated exploration of an area. We implemented the nearest frontier method differs from the proposed method only in evaluating the next most promising frontier. Compared with the strategy of selecting the nearest target point in the nearest frontier

Discussion
We propose a complete framework for autonomous robot exploration in unknown environments. Simulation experiments on typical environments demonstrate that compared with the benchmark methods, our framework reduces the path length by 27.07% and the exploration time by 27.09% on average.
(1) Detecting frontiers only based on the map often ignores some exploration areas, especially in environments with long corridors and narrow doors. Our proposed integrated frontier detection and maintenance method can detect frontiers based on map and lidar data. A complete environmental exploration can be achieved by sufficient frontier detection and incrementally maintaining reachable and informative frontiers. In the office environment, there is 47.62% less path length and 38.86% less exploration time than RRT-exploration.
(2) The proposed utility function evaluates multiple paths, balancing the lidar data quality, information acquisition, path length, and the consistency of movement. Combining the frontier detection and maintenance method the proposed utility function can avoid the robot's repeated exploration of an area. We implemented the nearest frontier method differs from the proposed method only in evaluating the next most promising frontier. Compared with the strategy of selecting the nearest target point in the nearest frontier method, the multi-object utility function we proposed can reduce the path length by 10.68% and the exploration time by 22.65% on average.
(3) Planning a smooth and safe path in a short time is crucial to deal with the uncertainty in autonomous exploration. A multiple path generation method is proposed using the wavefront propagation trend of the fast-marching method and a well-designed velocity field to generate safe paths with a good view. Compared with the advanced optimal path planning method RRT*, the proposed path planning method can quickly generate a path, and the path has only 2% longer path distance, but the clearance is at least 20% better. The proposed path smoothing method with dynamic parameter adjustment improves the smoothness of the optimal path, which satisfies C 2 continuity. Experiments in real environments with random obstacles also show that the proposed method can avoid obstacles in real-time and create complete maps.
However, although the proposed method has made some improvements in autonomous robotic exploration, some limitations still need to be improved. First of all, the path planning and frontier detection method in our autonomous exploration framework does not consider the uneven ground environment. The slope of the ground can change a lidar or visual sensor's line of sight. To solve this problem, a 3D sensor could be added to model the ground and consider the sensor's perspective in the path planning. Moreover, the current framework we propose does not consider the recognition and detection of dynamic obstacles. The development of a recognition algorithm for dynamic objects could solve this problem and reduce its impact on the slam algorithm.

Conclusions
A complete robot exploration framework is proposed and has the following characteristics. The proposed integrated frontier detection and maintenance method realizes the efficient and incremental management of the frontier. The proposed multiple path generation using the fast-marching method and multi-object utility function can promote an evaluation of the next best target, reducing the path length by 10.68% and the exploration time by 22.65% on average compared with the nearest frontier method. The smoothness of the path is further improved, and the non-linear MPC is used to track the path accurately. Simulated experimental studies indicate that our method can autonomously build a precise environment map with 27.09% shorter time and 27.07% short path than compared benchmark methods. The proposed method can safely and efficiently establish a complete map in a real-world environment with regular and random obstacles, proving the practicability and effectiveness of the method.
Future directions include extending the proposed method to 3D space to deal with more complex terrain environments and combining dynamic object detection to achieve autonomous exploration in dynamic environments.