Collaborative Multi-Robot Formation Control and Global Path Optimization

: For multi-robot cooperative formation and global path planning, we propose to adjust the repulsive ﬁeld function and insert a dynamic virtual target point to solve the local minima and target unreachability problems of the traditional artiﬁcial potential ﬁeld (APF) method. The convergence speed and global optimization accuracy of ant colony optimization (ACO) are improved by introducing improved state transfer functions with heuristic information of the artiﬁcial potential ﬁeld method and optimizing the pheromone concentration update rules. A hybrid algorithm combining APF and improved ACO is used to calculate an optimal path from the starting point to the end point for the leader robot. A cooperative multi-robot formation control method combining graph theory and consistency algorithm is proposed based on path planning of the leader robot. Taking AGVs in a logistics distribution center as an example, the feasibility of the improved algorithm and its effectiveness in solving the multi-robot path planning problem are veriﬁed.


Introduction
In the context of the continuous development and improvement of artificial intelligence theory, mobile robots have become more intelligent and autonomous, capable of replacing or assisting humans to complete dangerous, complex, and boring work. Therefore, mobile robots are widely used in industrial production, military operations, medical services, construction, and other fields. As a crucial technology for mobile robots, navigation technology fundamentally affects the path planning capability of mobile robots, determines whether they can identify and avoid obstacles, and calculates the best route to complete the task in the current working environment, quickly and accurately [1]. With the rapid development of various industries, single robots are no longer able to meet the demands of enterprises in terms of production, cost, and service efficiency, etc. Therefore, people are improving the performance of single robots and working on the research of multi-robot formation control to accomplish tasks that cannot be done by a single robot.
The working environment of mobile robots is divided into static and dynamic [2]. In the former, the start and end points and obstacles are known and unchanged; in the latter, the moving path has to be calculated in real time based on the feedback information from sensors, since the start and endpoints and obstacles are dynamically changed. Path planning is further divided into local and global, depending on how much information the robot has about the environment. For the former, the environmental information is opaque, and the robot acquires information through sensors such as the position and geometric properties of obstacles, which are processed and computed at high speeds to perform obstacle avoidance. In the case of the latter, a path for the robot is planned in a known environment, and its accuracy depends on the degree of environmental information acquisition. The aim of this study is the application of multi-robot formations in global path planning in a static environment.
Many scholars focus on mobile robot path planning, and, so far, a large number of control optimization methods have been proposed. These include the Dijkstra algorithm [3], genetic algorithm [4], firefly algorithm [5], A* algorithm [6], artificial potential field method [7], RRT algorithm [8], neural network algorithm [9], etc. However, due to their own defects, these heuristic algorithms often cannot complete the task satisfactorily. For example, it takes the A* algorithm a long time to find a better path by traversing all reachable nodes [10]. The RRT algorithm is overly stochastic and has a high number of redundant nodes and path inflection points [11].
The study of heuristic algorithm improvement strategies for path planning problems has become popular nowadays. The artificial potential field method is one of the classical algorithms applied to path planning, but suffers from the defects of not being able to reach the target point and easily falling into the local optimum [12]. In response, scholars have taken a series of measures to address the above issues. Ulises et al. [13] proposed a membrane-evolving APF to find parameters that can generate feasible safe paths. Bence et al. [14] have extended the APF through the motion characteristics of animals to enhance the human-robot interaction. Sudan et al. [15] proposed an improved APF for solving the obstacle avoidance problem of multiple robots. Hou et al. [16] introduced the distance between the robot and the target point into the repulsive field function, avoiding the local optimization problem.
The ACO is also a classical path planning method with the advantages of good feedback information and distributed computation [17]. However, due to the limitation of its own mechanism, the computational overhead is large and the convergence speed is slow in the process of finding the optimal path [18]. Viswanathan et al. [19] used the sigmoid function to change the ant colony pheromone update mechanism and reduce the computational cost. Tao et al. [20] introduced key obstacle influence factors into the heuristic information, which changed the pheromone volatility rate and improved the convergence speed of the ACO. Luo et al. [21] constructed the initial pheromone with unequal allocation and introduced a dynamic penalty method to improve the ACO. Han et al. [22] proposed an ACO with faster convergence and better obstacle avoidance. Jiao et al. [23] improved the search speed of the algorithm using multiple ant colony partitioning and a cooperative mechanism. Liang and Wang [24] combined GA with ACO to improve the robustness and the solution quality. Gao et al. [25] proposed a new pheromone diffusion gradient formula to accelerate ACO.
The above improvement strategies were optimized for the disadvantages of APF and ACO, respectively, but the trade-off between path optimization and computational cost still faced a tough choice. Some scholars started to combine the advantages of ACO and APF to find a new solution strategy.
Liu et al. [26] optimized the initial pheromone of ACO with an improved APF algorithm to improve the directionality of ants in the early stage of path search. Chen et al. [27] combined APF and ACO to improve the convergence of the algorithm. Liu et al. [28] used APF to optimize the pheromone matrix of the ACO to improve the algorithm performance and search efficiency. Park et al. [29] used the improved APF ensemble to optimize the setting of heuristic information and volatile factors of ACO to avoid the problem of falling into a local optimum. Although the above studies have improved the defects of the single algorithm to some extent, APF-ACO still has the problems of large computation and poor performance of the optimal solution due to the influence of APF and ACO's own mechanism [30].
Therefore, a new APF-ACO is proposed in this paper, which improves the convergence speed and the optimization-seeking accuracy of the basic algorithm and can solve the robot path planning problem as well. However, a single path planning algorithm does not enable robots to form effective formations and is less controllable, and multi-robot formation path planning has also attracted the attention of many scholars.
Without considering information delay, Peng et al. [31] investigated distributed formation control of multiple mobile robots based on graph theory and Lyapunov theories. Pan et al. [32] proposed a virtual spring path planning and formation control method for multirobot systems. Wu et al. [33] investigated an observer-based approach to robot formation control. Doaa et al. [34] combined graph theory with ant colony algorithms and applied them to the study of path planning for robot formations. Liu et al. [35] applied a third-order control protocol to implement the control of robot formations. Wang et al. [36] proposed an optimization model with multi-stage consistency improvement to solve the problem of decision making in the absence of knowledge. Henrik and Eberhard [37] combine algebraic graph theory with distributed control to study the robot formation framework. Yan et al. [38] proposed a formation consistency control method for mixed-order systems with switching topology to improve the stability of formation control.
The studies mentioned above have investigated the construction and control of robot formations using methods such as graph theory and consistency control, but relatively few studies have combined the two. In this study, we use a graph-theoretic approach for robot formation construction and a consistency control approach for maintaining the formation.
In summary, this paper investigates the global path planning problem for multi-robot formations in static environments. The main contributions are as follows: (1) adjusting the repulsive field function and introducing dynamic virtual target points to solve the problem that APF is prone to local optimum and target unreachability; (2) changing the concentration of ant colony pheromone and improving the state transfer function with the improved APF repulsive field function to improve the search efficiency and solve the problems of local optimum and slow convergence; (3) using graph theory and the consistency control method to build the formation of robots and improve the stability of formation control.
The remainder of this article is organized as follows. Firstly, the theoretical foundations and hypotheses are presented. Secondly, our research design and methodology are described. Following that, the analysis, results, and implications of our findings are discussed. Furthermore, an application of the improved ACO is shown with the case study on the AGV formation path planning. Finally, conclusions and directions for future research are presented.

Classical Artificial Potential Field Method
In the APF, the obstacle and the target point generate repulsive and gravitational potential fields, respectively, which act simultaneously on the robot to make it move safely from the starting point to the target point without collision.
The robot path planning problem is studied in two dimensions, and the robot is regarded as a particle. Assuming that its position in the potential field is W 1 = (x, y), then the position of the target point is W 2 = X N = [x n , y n ], and the gravitational force on the robot is as follows: where k g > 0 is the gravitational gain coefficient; α 1 and ε 1 represent the unit vector and linear distance of the robot from the target point in the potential field, respectively, where ε 1 = X − X n . The robot is subject to the repulsive force function of the obstacles in the environment as follows: where k z > 0 is the repulsive gain coefficient of the function; µ is the distance between the robot and the obstacle; µ 0 is the range of the obstacle, µ ≤ µ 0 .The direction and magnitude of the robot's motion in the potential field depends on the combined force F(X):

Repulsive Potential Field Function Adjustment
To solve the defect that the target is not reachable during the robot movement, the repulsive potential field function is now adjusted as Equation (4).
where ρ(X, X 0 ) is the Euclidean distance between the robot and the obstacle, η is the repulsion gain coefficient, and n is the regulation constant. The negative gradient of the repulsive field function can be obtained according to Equation (4), and the improved APF repulsive function is as follows: From the above equation, the repulsive force F rep1 (X) decreases to 0 when the robot moves toward the target point; the gravitational force F rep2 (X) gradually increases when 0 < n < 1. This ensures that the robot can reach the specified position even if there is an obstacle near the target point. In this case, the combined force on the robot is given by Equation (9).
where m is the number of effective obstacles. Assuming that the combined force F is at an angle θ with the x-axis, and k is the coordinate point of path, the next coordinate (x k+1 , y k+1 ), at which the robot moves, is calculated as follows, where l is the position shift length of the robot: x k+1 = x k + l · cos θ y k+1 = y k + l · sin θ (10)

Add Virtual Target Points
The step length of the robot is set to 5. When the consecutive step length is less than βl, β ∈ [2, 4], the robot is judged whether it falls into local optimum by adjusting β. When the robot falls into the local optimum, the obstacle information is stored. Take the line between obstacles and destination as the center, and the side with more obstacles is the effective obstacle. The position of the virtual sub-target is calculated according to Equation (11): where (x, y), (x g , y g ) describe the current positions of the robot and the target point, respectively, (x b , y b ) is the position of the effective obstacle, and β is the adjustment factor. By continuously changing the virtual target point, the robot jumps out of the locally optimal trap and then withdraws the virtual sub-target, allowing the robot to continue intermittent route planning under the original combined force.

Classical Ant Colony Algorithm
The ACO achieves optimal search of paths through information probability selection and pheromone identification between ant colonies and their environment. The probability of the kth ant choosing the next node location at the intersection is as follows: , j ∈ allowed k 0, otherwise where allowed k is the kth ant selectable node, α is the pheromone heuristic factor; β is the desired heuristic factor, τ ij is the pheromone concentration of path (i, j), η ij is the heuristic information, and d ij is the Euclidean distance between the current node i and the node j to be selected. The pheromone update rule is as follows: where ρ ∈ (0, 1) is the pheromone volatility coefficient, ∆τ ij (i, j) is the pheromone increment from node i to node j, Q is the pheromone intensity; L k is the path length of the kth ant in one cycle, and P k (begin, end) is the path taken by ant k from the starting point to the end point.

Improved Pheromone Concentration Update Rules
When an ant passes a node, the pheromone change rule at that point is as follows: where λ is the local pheromone volatility coefficient, λ ∈ (0, 1); τ 0 is a very small constant; and by making τ ij (t) equal to the limit value, the algorithm will avoid stagnation when τ ij (t) exceeds the limit value. The update rules for pheromone concentration according to the quality ant update is as Equation (17): where ε is the introduced parameter, ε ∈ (0, 1), and L worst and L best are the lengths of the paths taken by the worst and best ants, respectively. If the algorithm falls into a local optimum for a consecutive iteration T, it is automatically adjusted by the volatility factor ρ, as follows: where γ is the global pheromone decrement factor, γ ∈ (0, 1), and ρ min is the minimum pheromone volatility factor.

Improved State Transfer Function
The basic ACO uses a uniform distribution of initial pheromone concentration, which leads to a slow convergence rate. Therefore, in this paper, a higher initial value of pheromone concentration is determined with an improved APF, and then the next path point direction is determined with virtual ensemble to improve the accuracy of planning direction. After introducing the heuristic information of the improved APF, the new state transfer function is shown in Equation (19): After establishing the grid model and setting the environment information, the flowchart of the operation of the improved ant colony algorithm is as shown in Figure 1.

Parameter Setting
In order to verify the performance and the improvement effect of the improved ACO in this paper, a comparative simulation with the traditional ACO in two raster environments with different levels of complexity is performed.
A reasonable combination of parameters in the ant colony algorithm affects the optimal solution of path planning, and the values of each parameter are currently determined mainly by simulation experiments and empirical selection of multiple parameter combinations for comparison. In this paper, five different values are selected in the range of the main parameter variations according to empirical methods. A set of default parameter values is set as α = 1, β = 6, and ρ = 0.1. To improve the rationality, different parameter combinations were set up by the single-variable control method, and 10 simulations were performed for each parameter combination in a 10 × 10 raster environment. The mean values of the simulation results were compared to select the best parameter combination.
The experimental results in Table 1 yield a range of optimal values for each major parameter, with α optimal around 1, β optimal around 6, and ρ optimal around 0.1. The specific experimental parameters were set as shown in Table 2, and this combination of parameters was followed in the subsequent study. In this experiment, Matlab was used as the experimental platform, and 10 × 10 scale simple and complex grid obstacle environments were set respectively. The related parameter settings are shown in Table 2. The image shows that the obstacle distribution in simple environment is less, and the optimal path can be obtained before and after the ant colony algorithm is improved. However, a comparison between Figures 2 and 3 shows that the improved optimal path is smoother and shorter, the path length is 12.7280, and the algorithm tends to stabilize faster by 46.4%.
As shown in Figures 4 and 5, the search difficulty increases as the density of obstacle distribution in complex environment increases, and the path branch points become more frequent. The improved ACO still performs better than the classical ACO with the shortest path of 14.4854, and the speed of algorithm convergence to stability is improved by 41.4%. The above experimental results show that the improved ACO based on artificial potential field has significant advantages in the speed, and the results of optimal path search compared with the traditional ACO.    The comparison of the results before and after algorithm improvement in two environments are shown in Table 3.

Graph Theory
The robot formation problem can be viewed as how the graph topology interacts with the local feedback control protocols of the nodes to generate the overall behavior of the interconnected nodes. The topology diagram can be defined as D(V, E), where V = {v 1 , v 2 , . . . , v N } denotes the set of N nodes; E is the set of edges. Assume that the graph is a simple directed graph (v i , v j ) / ∈ E and there are no redundant edges between the same nodes. The element (v i , v j ) in the set E in the formation represents the robots whose information flows from node v j robot to node v i . The out-degree d o i of a node robot represents the number of edges with it as the data sender, and the in-degree d i represents the number with it as the data receiver.
Determine the weights of the edges as a ij , and then represent the graph as a matrix Next, a diagonal matrix D = diag d i and a Laplace matrix L = D − A are defined. The eigenvalues of L are analyzed and the Jordan form of the Laplace matrix is defined as follows: where the Jordan Matrix J and the transformation matrix M are where eigenvalues λ i and the right eigenvectors v i of M satisfy (λ i I − L)v i = 0, and I is the unit matrix. The left eigenvector w i of M satisfies w i (λ i I − L) = 0.

Formation Control Based on Consistency Method
The formation structure in this study is a distributed control structure in a navigatorfollower formation, and the formation control is achieved through the combination of graph theory and consistency algorithms. The first-order continuous system model of the formation system is as Equation (22).
x i , u i denote the state quantity and the input quantity of the node, respectively, x i , u i ∈ R n , and n denotes the dimension of the state quantity. Ideally, the control inputs are as Equation (23): where a ij is the adjacency matrix element of the formation and N i is the set of neighbors of member i. After introducing Equations (22) and (23) into the global state vector x = [x 1 , x 2 , . . . , x n ] T ∈ R n , the global dynamic relations are obtained as follows: Equation (24) shows that the closed-loop dynamic properties of the formation depend on the matrix L. The -L eigenvalues are all located in the left half-plane of the complex plane when and only when the topological graph G of the formation has a spanning tree, so that Equation (23) ensures that the system achieves consistency and the final state values are as follows: Equation (22) can be discretized as: In this study, assuming that the navigator leads the other robots from the starting point towards the target point, ideally the algorithm for the followers in the formation takes the following form: where ε > 0, r ij denotes the relative positions of robot i and robot j. The control equation for leader N is: (28) m and k are constants, and D(K) is the distance between the leader and the target point at moment k. The velocity is affected by the neighboring nodes and the distance from the target point.

Formation Control Simulation Analysis
According to previous research, this section sets up a single leader-single follower structure formation, and the formation is formed with the following robot in the coordinate system of the leader robot, D = (−20, −30). Other parameters are set as follows: ∆ = 10, u d 1 = 1.5, χ(·) = 3 π arctan(·). Figure 6 illustrates the simulation results. The formation control is stable during the robot moving to the destination, and, while there is a little bit of error in the change of direction phase, it is within the controllable range.

Application to the AGV Case
This section applies the above strategies to logistics company A, uses the grid method to discretize the environmental information of its warehouse, sets the shelves and other facilities as obstacles, configures the environmental information according to their locations and sizes, and uses the improved ant colony algorithm to plan a shortest path from the starting point to the end point for the AGV formation. For the leader AGV, start point coordinates are (1.8, 48.5), and end point coordinates are (49.5, 0.5). For the follower AGV, start point coordinates are (1.5, 49.5), and end point coordinates are (49.5, 1.5). The simulation result of single leader formation is shown in Figure 8.
As for the single leader-two-follower AGV shown in the Figure 9, the starting point positions of the leader AGV and follower AGV 1 are the same as above, and the starting coordinates of the follower AGV 2 are (0.5, 48.5).  The simulation results describe that no matter singleleader-single follower or singleleadertwo-follower, AGV formation can be maintained well and reach the end position smoothly without collision.

Conclusions
In response to the local minima and target unreachability problems of the traditional APF, the improvement measures of improving the repulsive field function and inserting dynamic virtual target points are proposed. The improved artificial potential field repulsive function is introduced into the ACO, and the state transfer function is optimized to enhance the global convergence speed of the ACO and solve the problem of the original algorithm easily falling into the local optimum. Establishing 10 × 10 raster maps as the working environment for robot path planning, the optimal combination of parameters is derived through extensive experiments. The simulation results show that the improved ACO not only finds the optimal path for the leading robot, but also the path length and number of iterations are better than the traditional ACO. Finally, the combination of simulation verification based on graph theory and consistent formation control method finds a better path for multi-robot cooperative driving.
In future research, we will optimize the above methods more deeply for application to path planning problems with higher dimensionality and more complex environments.