Validation of a Dynamic Planning Navigation Strategy Applied to Mobile Terrestrial Robots

This work describes the performance of the DPNA-GA (Dynamic Planning Navigation Algorithm optimized with a Genetic Algorithm) applied to autonomous navigation in unknown static and dynamic terrestrial environments. The main aim was to validate the functionality and robustness of the DPNA-GA under variations of genetic parameters, including the crossover rate and the population size. To this end, simulations of static and dynamic environments were performed under the different conditions. The simulation results showed satisfactory efficiency and robustness of the DPNA-GA technique, validating it for real applications involving mobile terrestrial robots.


Introduction
In most of the studies concerning Genetic Algorithms (GAs) encountered in the literature, global or local planning strategies are employed. The former provides optimum routes, at high computational cost associated with a priori knowledge of the environment, while the latter provides suboptimal routes, at lower computational cost and with complete, or almost complete, lack of knowledge concerning the environment [1,2]. Global or local planning can be applied to static and dynamic environments, although in the case of dynamic environments, global planning strategies require the use of external observation devices to periodically transmit the current state of the environment to the robot [3].
Several studies [1,3-8] have described navigation strategies employing GAs with global planning, in which the individuals (or chromosomes) are composed of all the possible routes between the initial and final points. In all cases, a priori knowledge of the environment is required, which is represented using a bidimensional grid. Several of the proposed techniques are specific to static environments [1,4-7,9], while the proposals presented in Refs. [3,8] are aimed at dynamic environments, although an external observation device is needed to transmit the state of the environment to the robot faster than the environment changes. Although efficient results have been reported in these earlier studies, three issues need to be highlighted. The first is that the size of the individual is variable and is a function of the length of the route (the greater the complexity of the environment, the greater the length) and the resolution of the grid associated with the displacement of the robot. This can significantly increase the spatial and temporal complexity of the GA, making it unviable for limited hardware systems such as microcontrollers (MCUs), digital signal processors (DSPs), and others. The second issue is that the a priori knowledge of the environment required by global planning strategies is not always available. The third point is that these global strategies are better suited to static environments, due to the necessity of using external observation equipment for dynamic environments.

DPNA-GA Strategy
To facilitate understanding of the results, this section details the DPNA-GA strategy presented in Refs. [22,23].
It is assumed that the robot possesses a location sensor, which returns its spatial position, p_R = (x_R, y_R), and a set of n evenly distributed distance sensors. The navigation strategy based on the DPNA-GA generates a route composed of M local displacement events to reach the final objective, p_of. In each m-th displacement event, there is a local objective, p_ol(m) = (x_ol(m), y_ol(m)), to which the robot moves.
The selection of the local objective, p_ol(m), in each m-th event, is performed by a GA that considers the current position of the robot, p_R(m), the distance to the final objective, p_of, and the obstacles detected by the n distance sensors. All the positions, p_R(m), already visited by the robot up to the m-th displacement are stored in the vector p_R = [p_R(1), p_R(2), ..., p_R(m)], which is also used to optimize the GA, avoiding searches in areas that have already been explored. The algorithm ends when the current position of the robot coincides with the final objective, that is, when f_ed(p_R(m), p_of) ≤ ε, where ε is a tolerance factor and f_ed(·, ·) denotes the Euclidean distance, or when the number of displacement events exceeds a maximum value, M_max. The DPNA-GA can be used in both static and dynamic environments, because at each m-th displacement event there is a new search for obstacles and for a new local objective, p_ol(m). The steps processed by the DPNA-GA are presented in Algorithm 1 and are described in detail in the following sections.
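The event loop described above can be sketched in Python. This is a minimal illustration, not the paper's implementation: find_local_objective and move_robot are hypothetical placeholders for the GA-based local search and the displacement step, and eps plays the role of the tolerance factor.

```python
import math

def dpna_ga_navigate(p_start, p_of, find_local_objective, move_robot,
                     eps=0.1, m_max=100):
    """Sketch of the DPNA-GA displacement-event loop (names are illustrative).

    find_local_objective(p_r, visited) -> local objective p_ol(m)
    move_robot(p_r, p_ol)              -> new robot position p_R(m+1)
    """
    visited = [p_start]          # vector p_R of positions already visited
    p_r = p_start
    for m in range(m_max):       # at most M_max displacement events
        # stop when the robot is within the tolerance eps of the final objective
        if math.dist(p_r, p_of) <= eps:
            break
        p_ol = find_local_objective(p_r, visited)   # GA-based local search
        p_r = move_robot(p_r, p_ol)                 # m-th displacement event
        visited.append(p_r)
    return p_r, visited
```

With trivial stand-ins (the local objective is always the final objective and the robot reaches it exactly), the loop terminates after one displacement event.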

Scanning Step
In this step (line 4 of Algorithm 1), the DPNA-GA forces the robot to perform a 360° scan of the environment around its axis. From this scan, each j-th sensor, in the m-th event, returns a signal, s_j(m), limited by the range, d_max, of the sensor, so that s_j(m) = d_j if d_j < d_max, and s_j(m) = d_max otherwise, (2) where d_j is the distance measured by the j-th sensor coupled to the robot.
During the scan, the angular displacement, α, can be expressed by α = 360°/(n·p), where n is the number of distance sensors and p − 1 represents the number of angular displacements that the robot can make on its axis, with the aim of refining the angular resolution while requiring only a small number of sensors. At the end of the scan process, the DPNA-GA generates a polygon, called the delimiting polygon (DP), composed of a set of K points, expressed by the vector p_DP = [(x_1, y_1), (x_2, y_2), ..., (x_K, y_K)], where K = p × n and (x_k, y_k) represents the in-plane coordinates of the k-th point associated with the DP. This polygon is used to delimit the search space associated with the genetic algorithm, such that the points (individuals) generated within the polygon are more suitable than points generated outside it. However, it is not only the position within or outside the polygon that defines the suitability of each individual; also considered are the distances between the generated point and the obstacles detected in the scan, among other factors. Figure 1 illustrates the polygon generated by the DPNA-GA for the case of n = 4 and α = 10°.
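The construction of the DP can be sketched as follows, assuming the saturating sensor model of Equation (2) and the angular step α = 360°/(n·p). The measure callback is a hypothetical stand-in for the robot's real distance sensors.

```python
import math

def scan_delimiting_polygon(p_r, measure, n=4, p=9, d_max=2.0):
    """Build the K = p * n points of the delimiting polygon (DP).

    measure(theta) is an assumed sensor model returning the raw distance
    d_j along direction theta; readings are saturated at the range d_max.
    """
    alpha = 2 * math.pi / (n * p)        # angular displacement between readings
    points = []
    for i in range(n * p):
        theta = i * alpha
        s = min(measure(theta), d_max)   # s_j(m): reading limited to d_max
        points.append((p_r[0] + s * math.cos(theta),
                       p_r[1] + s * math.sin(theta)))
    return points
```

In open space (no obstacle within range), every DP point lies exactly at the sensor range d_max, so the DP approximates a circle around the robot.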

Detection of Obstacles
The scan step is followed by the step for detection of obstacles (line 5 of Algorithm 1). In addition to the DP, a virtual polygon (VP) is generated that describes a circumference centered on the position of the robot (p_R), with radius r_PV slightly less than the range of the sensors, d_max, such that r_PV = (1 − η) d_max, where η is a factor limited to the range 0 < η ≤ 0.1. The objective of the VP is to detect only those points of the DP that are associated with obstacles, here denoted p_O. Hence, after this step, a new set of L points is generated, represented by the vector p_O = [(x_1, y_1), (x_2, y_2), ..., (x_L, y_L)], with L ≤ K. The function f_ed(·, ·) calculates the Euclidean distance between any two points, which can be expressed by f_ed(p_a, p_b) = sqrt((x_a − x_b)² + (y_a − y_b)²).
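The obstacle-detection filter can be sketched as below, assuming the VP radius takes the form r_PV = (1 − η) d_max: a DP point counts as an obstacle only if it lies strictly inside the VP (i.e., the corresponding sensor reading was not saturated at d_max).

```python
import math

def detect_obstacles(p_r, dp_points, d_max=2.0, eta=0.1):
    """Keep only the DP points inside the virtual polygon (VP).

    r_pv = (1 - eta) * d_max is the assumed VP radius; points at the full
    sensor range correspond to 'no obstacle detected' and are discarded.
    """
    r_pv = (1 - eta) * d_max
    return [q for q in dp_points if math.dist(p_r, q) < r_pv]
```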

Local Objective Search
In this step (line 6 of Algorithm 1), the proposed navigation strategy employs a GA to find a possible local objective, p_ol, to which the robot will move. For each m-th displacement event, the GA is executed with a new population for H generations. The individuals are characterized by the vector p_GA_j(h, m) = (x_GA_j(h, m), y_GA_j(h, m)), where p_GA_j(h, m) represents the j-th individual of the population of size J, associated with the h-th generation of the m-th displacement of the robot. In each generation, h, all the individuals are generated according to the nonlinear restriction expressed by f_ed(p_GA_j(h, m), p_R(m)) ≤ r_d. This restriction limits the individuals of the population to a circumference with radius r_d, centered on the position of the robot at the m-th instant, p_R(m). Usually, the DP occupies most of the circle with radius r_d, so that only a few individuals are created outside the DP. Thus, using the coordinates of the DP as constraints on population creation would result in a much more complex creation routine, with few practical benefits.
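Population creation under the nonlinear restriction can be implemented by rejection sampling, which keeps only candidates within the circle of radius r_d around p_R(m). This is one straightforward way to satisfy the constraint, not necessarily the routine used in the paper.

```python
import math
import random

def create_population(p_r, r_d, j_size):
    """Generate J individuals satisfying f_ed(p_GA_j, p_R) <= r_d."""
    pop = []
    while len(pop) < j_size:
        # sample uniformly in the bounding square, reject points outside
        x = random.uniform(p_r[0] - r_d, p_r[0] + r_d)
        y = random.uniform(p_r[1] - r_d, p_r[1] + r_d)
        if math.dist((x, y), p_r) <= r_d:   # nonlinear restriction
            pop.append((x, y))
    return pop
```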
The evaluation function associated with the j-th individual of the h-th generation in the m-th displacement is expressed by f_j(h, m) = d_of_j(h, m) + β(m)/d_o_j(h, m) + C_j(h, m) + A_j(h, m), (13) where d_of_j(h, m) = f_ed(p_GA_j(h, m), p_of) is the Euclidean distance between the j-th individual of the h-th generation and the final objective, p_of, and d_o_j(h, m) is the shortest Euclidean distance between the j-th individual of the h-th generation and all the L obstacles encountered, that is, d_o_j(h, m) = min_l f_ed(p_GA_j(h, m), p_O(l)), l = 1, ..., L. The variables β(m), C_j(h, m), and A_j(h, m) can be considered as penalty factors added to each j-th individual of the GA. If no obstacle is encountered in the m-th displacement event (L = 0), it is assumed that the evaluation function is simply d_of_j(h, m). Starting from the principle that the circumferences with radius r_d, centered on the visited positions stored in p_R, are areas that have already been visited, the penalty C_j(h, m) takes the value Z, a relatively large number, when p_GA_j(h, m) lies within any of the m circumferences of radius r_d centered on the positions in p_R, and zero otherwise. Hence, if an individual, p_GA_j(h, m), is located within any of these circumferences, it is heavily penalized, reducing its chances of selection. Finally, the penalty A_j(h, m) is applied to individuals generated outside the delimiting polygon; individuals that receive this penalty will have little chance of surviving to the next generation. Figure 3 illustrates the calculation of the evaluation function for a j-th individual, p_GA_j(h, m). The evaluation function, presented in Equation (13), follows the same principle as the potential fields technique [24], in which d_of_j(h, m) (the Euclidean distance between the j-th individual and the final objective, p_of) represents a traction force toward the final point, and 1/d_o_j(h, m) (the reciprocal of the smallest Euclidean distance between the j-th individual and all the points associated with the obstacles) represents the force of repulsion between the j-th individual and the obstacles encountered.
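The evaluation of Equation (13) can be sketched as follows. This is a simplified reading of the formula: the penalty A for individuals outside the DP is omitted for brevity, and beta and z are illustrative constants rather than the paper's values.

```python
import math

def evaluate(ind, p_of, obstacles, visited, r_d, beta=1.0, z=1e6):
    """Assumed form of Equation (13): attraction to the final objective,
    repulsion from the nearest obstacle, and the penalty C for already-visited
    regions (the DP-membership penalty A is omitted in this sketch)."""
    d_of = math.dist(ind, p_of)          # traction toward the final point
    f = d_of
    if obstacles:                        # L > 0: repulsion term beta / d_o
        d_o = min(math.dist(ind, q) for q in obstacles)
        f += beta / d_o
    if any(math.dist(ind, c) <= r_d for c in visited):
        f += z                           # penalty C_j: area already visited
    return f
```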
At the end of H generations, the point with the smallest evaluation function is selected as the local objective, p ol (m), associated with the m-th displacement event.

Displacement
The displacement step (line 7 of Algorithm 1) involves movement of the robot to the local objective found in the previous step. After the movement, a new center point, p_R(m + 1), is generated, such that f_ed(p_R(m + 1), p_ol(m)) ≤ ε_l, where ε_l is an allowed tolerance in relation to the local objective. This tolerance is essential for robots with restricted movements, such as nonholonomic robots [24], and to accommodate errors in real measurements. Figure 4 shows a sequence of M = 6 displacements to the final point, p_of.
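The displacement step can be approximated with a simplified holonomic motion model (an assumption for illustration; the paper simulates a nonholonomic robot), advancing in small straight-line steps until the robot is within the tolerance of the local objective.

```python
import math

def displace(p_r, p_ol, step=0.05, eps_l=0.1):
    """Move toward the local objective p_ol in straight-line steps until
    f_ed(p_R(m+1), p_ol(m)) <= eps_l (simplified holonomic motion model)."""
    x, y = p_r
    while math.dist((x, y), p_ol) > eps_l:
        d = math.dist((x, y), p_ol)
        # advance one step along the unit vector toward p_ol
        x += step * (p_ol[0] - x) / d
        y += step * (p_ol[1] - y) / d
    return (x, y)
```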

Simulation Results
To validate the functioning of the DPNA-GA (for static and dynamic environments), considering its robustness in terms of the genetic parameters, simulations were conducted using two types of environment (A_1 and A_2), varying the number of generations (H), the size of the population (J), and the crossover rate (R_c). The simulations were performed in MATLAB, using the updated version of the iRobot Create toolbox [25,26]. The toolbox simulated a circular nonholonomic robot with variable action and four distance sensors spaced at 90°. Table 1 presents the fixed parameters used in the simulations. Each simulation (Figures 5-12) was executed ten times, and the results are associated with the average of the values obtained in all executions. The individuals used real-number encoding, and the crossover operator used the intermediate scheme, in which the offspring, p_GA_i(h + 1, m) and p_GA_v(h + 1, m), are obtained using a uniform random number, that is, p_GA_i(h + 1, m) = p_GA_l(h, m) r(h, m) + p_GA_k(h, m)(1 − r(h, m)) (20) and p_GA_v(h + 1, m) = p_GA_l(h, m)(1 − r(h, m)) + p_GA_k(h, m) r(h, m), (21) where p_GA_l(h, m) and p_GA_k(h, m) are individuals chosen from p_GA(h, m) in the selection step and r(h, m) is a uniform random number between 0 and 1. Using Equation (9), Equations (20) and (21) can be rewritten coordinate-wise as x_GA_i(h + 1, m) = x_GA_l(h, m) r(h, m) + x_GA_k(h, m)(1 − r(h, m)), y_GA_i(h + 1, m) = y_GA_l(h, m) r(h, m) + y_GA_k(h, m)(1 − r(h, m)), x_GA_v(h + 1, m) = x_GA_l(h, m)(1 − r(h, m)) + x_GA_k(h, m) r(h, m), and y_GA_v(h + 1, m) = y_GA_l(h, m)(1 − r(h, m)) + y_GA_k(h, m) r(h, m).
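The intermediate crossover scheme can be sketched as below; the two offspring are complementary convex combinations of the parents, weighted by a uniform random number r in [0, 1]. The rng parameter is injectable so the operator can be made deterministic for testing.

```python
import random

def intermediate_crossover(parent_l, parent_k, rng=random.random):
    """Intermediate crossover: offspring are convex combinations of the
    parents p_GA_l and p_GA_k, weighted by a uniform random r in [0, 1]."""
    r = rng()
    child_i = tuple(a * r + b * (1 - r) for a, b in zip(parent_l, parent_k))
    child_v = tuple(a * (1 - r) + b * r for a, b in zip(parent_l, parent_k))
    return child_i, child_v
```

Because the weights of the two offspring sum to one in each coordinate, the pair of children always averages to the midpoint of the parents.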
As the mutation operator, a Gaussian mutation was used, expressed as p_GA_j(h + 1, m) = p_GA_j(h, m) + g(h, m), (26) where g(h, m) is a Gaussian random variable with mean zero and variance σ, N(0, σ), associated with the h-th generation of the m-th displacement of the robot. Using Equation (9), Equation (26) can be rewritten as x_GA_j(h + 1, m) = x_GA_j(h, m) + g_x(h, m) and y_GA_j(h + 1, m) = y_GA_j(h, m) + g_y(h, m), where g_x(h, m) and g_y(h, m) are Gaussian random variables with mean zero and variance σ, N(0, σ), associated with the x and y coordinates, respectively. In all simulations, the variance was set to σ = 1. Eight simulations were performed: four for environment A_1 (simulations S_1, S_2, S_3, and S_4) and four for environment A_2 (simulations S_5, S_6, S_7, and S_8). For each simulation, Tables 2 and 3 show the length of the route, c_p, travelled by the robot (in meters), the processing time associated with all the displacements along the route, t_p (in seconds), and the number of displacement events, M. The displacements of the robot in each simulation are illustrated in Figures 5-12, where the route is indicated by a continuous black line, the m displacement events are shown as circles along the route lines, and the DPs associated with each displacement are indicated by dashed blue lines. The simulations were performed using a computer with a 64-bit CPU (Intel(R) Core(TM) i5-3210M), 2.5 GHz clock speed, and 8 GB of RAM.
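The Gaussian mutation of Equation (26), sketched for a two-coordinate individual; zero-mean noise N(0, σ) is added independently to each coordinate.

```python
import random

def gaussian_mutation(ind, sigma=1.0, rng=random.gauss):
    """Gaussian mutation: add zero-mean noise N(0, sigma) to each coordinate."""
    return tuple(c + rng(0.0, sigma) for c in ind)
```

With σ = 0 the operator degenerates to the identity, which makes it easy to check.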
During the development of the work, several combinations of GA parameters were tested. Since the proposed method (DPNA-GA) targets embedded systems with low processing power (such as microcontrollers), the processing time per displacement event (in s/disp), t_d, was measured for all simulations. It was observed that combinations with a high population size (J > 30), a large number of generations (H > 30), and a low crossover rate (R_c < 60%) produced high values of t_d (t_d > 3 s/disp). Values of t_d > 3 s/disp can reduce the continuity of movement of the robot. Thus, the eight simulations (S_1 to S_8) were chosen using the criterion t_d < 2.5 s/disp.
The population size J = 10 is relatively low by GA standards, and increasing it generally implies a slight improvement in GA convergence but a considerable increase in processing time in environments with many obstacles. A similar situation holds for the crossover rate: lowering R_c from 60%, or raising it from 80%, to nearby values did not produce significantly different results, while lowering the crossover rate well below 60% led to poor results, as expected. Based on these observations, we decided to include in our analysis only the most significant configurations.
In the case of environment A 1 (results shown in Table 2 and Figures 5-8), it can be seen that the navigation strategy showed little variation in terms of the number of displacement events, m (mean of 19.5 and standard deviation of 2.5), and the length of the route, c p (mean of 18.1 m and standard deviation of 1.67 m). However, greater variability was found for the processing time (mean of 28.92 s and standard deviation of 12.88 s).
Comparing simulations that have the same crossover rate (simulations S_1 and S_2, and simulations S_3 and S_4), it can be observed that the ones with a larger population size (simulations S_1 and S_3) perform slightly better in terms of route length and number of displacements. However, this small increase in performance does not compensate for the increase in processing time, which is more than twice that of their counterparts. On the other hand, when comparing simulations with the same population size (simulations S_1 and S_3, and simulations S_2 and S_4), the increase in crossover rate produced a shorter route and fewer displacements, with a less significant increase in processing time than in the previous comparison.
Lowering the crossover rate led to a more elitist configuration of the population, letting more individuals continue unchanged into the next generation. For a routing problem, concentrating on a few points of the search space can lead to poor convergence of the algorithm. A larger population increases the variability of solutions reached in the search space, as can be seen in simulation S_3 (Figure 7), but it does not quite compensate for the lower crossover rate in this case. The simultaneous decrease of these parameters of genetic variability, as shown in simulation S_4 (Figure 8), leads to the worst convergence of the algorithm among the simulations, causing the robot to perform poor and/or unnecessary displacements.
Finally, another important point to emphasize is that reduction of the size of the population only slightly increased the number of displacements (from 17 to 19) and greatly reduced the total time associated with the displacements (from 39.31 s to 15.64 s).
In the simulations using environment A_2 (S_5, S_6, S_7, and S_8), the robot encountered a dynamic obstacle (red rectangle) and had to avoid it. The results of these simulations are shown in Table 3. Compared to the findings for environment A_1, environment A_2 showed poorer results with a crossover rate of 80%. This could be explained by the simplicity of environment A_2, relative to environment A_1, which did not require a high rate of renewal of the individuals of the population.
The data shown in Tables 2 and 3 demonstrate that the execution time of the DPNA-GA is much shorter than that of the strategies presented previously [6] for an environment similar to A_1. This difference is mainly associated with the size of each individual, the number of generations, and the size of the population, which in the case of the DPNA-GA were limited to 2, 30, and 30, respectively. The proposal described in Ref. [1], for example, employed populations of up to 2000 individuals with sizes of around 140 values (in the best case), for an environment similar to A_1. In other work, a population of 50 individuals was used together with 2000 generations [11].

Comparison with Other Approaches
To compare the results with other works in the literature, the DPNA-GA was simulated in the environments used in Refs. [9,13,14,17,21]. Figures 13-16 show the displacement of the DPNA-GA in the environments proposed in Refs. [9,13,14] and [17,21], respectively. Table 4 shows the parameters and the results of the DPNA-GA in the environments shown in Figures 13-16. Finally, Table 5 compares the results of the DPNA-GA with those of the works presented in Refs. [9,13,14,17]. Each simulation (Figures 13-16) was executed ten times, and the results are associated with the best cases. The genetic parameters for the DPNA-GA were J = 30, H = 10, and R_c = 80%, corresponding to the best configuration found in the simulations with the static environment (see Table 2, S_2). The techniques compared were the Improved Genetic Algorithm (IGA) [9], the Matrix-Binary Codes-based Genetic Algorithm (MGA) [13], the Ant Colony Optimization (ACO) [14], the ACO with the Influence of Critical Obstacle (ACOIC) [14], the Firefly algorithm [17], and the Fuzzy System [21]. The comparative results in Table 5 show that the DPNA-GA achieved shorter route lengths in most cases. Table 6 presents the route length saved by the DPNA-GA relative to the works in Refs. [13,14,17,21]; for example, the saving reached 37.28% relative to the Firefly algorithm [17] and 40.00% relative to the Fuzzy system [21]. For the research presented in Ref. [9], the DPNA-GA had a slightly worse result (<10%); however, it is important to emphasize that the work in Ref. [9] uses a GA navigation strategy with global planning, in which each individual (or chromosome) encodes a complete route between the initial and final points, which increases the chromosome size and the GA processing time. The DPNA-GA uses a fixed chromosome size regardless of route length.

Conclusions
The objective of this work was to validate the robustness of a dynamic planning navigation technique for mobile terrestrial robots, based on genetic algorithms, denoted DPNA-GA. The validation was performed by varying some of the genetic parameters, in two different types of environment. Taking strategies described in the literature as a basis, the DPNA-GA comprises a navigation scheme with local planning (applied to static and dynamic environments), in which the environment is unknown a priori and the size of the individuals is independent of the complexity of the environment. This property is fundamental from the point of view of practical implementation. The simulations showed that the DPNA-GA provided viable route solutions for different types of environment, following changes in the genetic parameters, hence demonstrating robustness at a relatively low cost, compared to other global and local planning strategies.