Research on the Multi-Robot Cooperative Pursuit Strategy Based on the Zero-Sum Game and Surrounding Points Adjustment

Abstract: Making full use of the cooperation of multiple robots can improve the success rate of a pursuit task. Therefore, this paper proposes a multi-robot cooperative pursuit strategy based on the zero-sum game and surrounding points adjustment. First, a mathematical description of the multi-robot pursuit problem is constructed, and the zero-sum game model is established considering the cooperation of the pursuit robots and the confrontation between the pursuit robots and the escape robot. By solving the game model, the optimal movement strategies of the pursuit robots and the escape robot are obtained. Then, the position adjustment method of the pursuit robots is studied based on the Hungarian algorithm, and the pursuit robots are controlled to surround the escape robot. Based on this, a multi-robot cooperative pursuit strategy is proposed that divides the pursuit process into two stages: pursuit robot position adjustment and game pursuit. Finally, the correctness and effectiveness of the multi-robot cooperative pursuit strategy are verified with simulation experiments. The strategy allows the pursuit robots to capture the escape robot successfully without conflicts among the pursuit robots. In the documented simulation experiments, the success rate of the pursuit task using the proposed strategy is 100%.


Introduction
A multi-robot system consists of more than two robots, which can improve the efficiency of task completion through cooperation [1]. It scales well and adapts to different tasks by adjusting the number and types of robots in the system [2,3]. Multi-robot systems can replace humans in tasks such as information collection [3,4], target pursuit [3-6], and target capture [7,8]. There is usually a degree of confrontation between the target and the multi-robot system during such tasks. The target, namely the escape robot, wants to stay away from the multi-robot system, while the multi-robot system, namely the pursuit robot team, tries every means to capture the target. The cooperative pursuit strategy of the multi-robot system is the key to improving task efficiency and success rates. A lack of cooperation leads to low pursuit efficiency, long task completion times, and even task failure, which threatens the security of the robots involved in the pursuit. Therefore, it is of great significance to conduct research on the multi-robot pursuit problem and to design a multi-robot cooperative pursuit strategy.
Many scholars have studied the multi-robot pursuit problem, which includes two cases: multiple pursuit robots capturing multiple escape robots (many-vs.-many) and multiple pursuit robots capturing a single escape robot (many-vs.-one). In the many-vs.-many case, task assignment is done first to determine which set of pursuit robots corresponds to each escape robot, and each set of pursuit robots is then used to capture its corresponding escape robot. Thus, a many-vs.-many case is transformed into many-vs.-one cases through task assignment [9-11]. In addition, in the field of space, there is often only one pursuit robot that is used to capture the escape robot, such as in the "Phoenix" and "Shenzhou 12 docking with Chinese space station" projects. The research focus is gradually expanding to capturing an escape robot with many pursuit robots in order to reduce the control difficulty of the pursuit robots. A mature research system has not yet been formed for the many-vs.-one pursuit problem, so this paper conducts research on that subject.
When the escape robot moves faster than the pursuit robots do, it is necessary to give full play to the advantages of the cooperative multi-robot system, which places higher requirements on the design of a multi-robot cooperative pursuit strategy. To realize the pursuit of a high-speed escape robot, some scholars have researched the constraints on the initial position distribution of the pursuit robots and the number of pursuit robots required to complete the pursuit task. Breakwell [12] completed the boundary analysis of two slower pursuit robots pursuing a faster escape robot based on a differential game and determined the relative positional relationship of the pursuit robots at the beginning of the pursuit task. Zha [13,14] applied the boundary analysis method to the pursuit constraint analysis of tasks with many pursuit robots and one escape robot. The minimum number of pursuit robots required and the speed relationship between the pursuit robots and the escape robot were further determined. Jin [15] also conducted a pursuit constraint analysis based on the Apollonius circle, which included the minimum number of pursuit robots required and the position distribution at the beginning of the pursuit task. All of the above studies are based on a constraint analysis of the number and initial positions of the pursuit robots. However, the positions of the pursuit robots are random at the beginning of the pursuit task, so the initial position constraints may not be satisfied, which may lead to task failure. Considering that the perception ranges of the robots are limited, so that the process of pursuit and escape does not start immediately when the pursuit robots are outside the perception range of the escape robot, Su [16] proposed a multi-robot cooperative pursuit strategy based on the Q-learning algorithm, in which the pursuit robots are moved to a certain distance from the escape robot.
However, this strategy does not analyze whether the positions of the pursuit robots meet the initial position constraints, which results in the pursuit task having a high probability of failure. The analysis of the above research shows that the key to improving the success rate of the pursuit task is to determine the initial pursuit conditions of the initial position and the number of pursuit robots. Therefore, this paper designs a multi-robot cooperative pursuit strategy that can adjust the positions of the pursuit robots to form a pursuit configuration that satisfies the pursuit conditions before the pursuit begins.
After the pursuit robots surround the escape robot with the pursuit configuration, the movement strategies of the pursuit robots directly affect the pursuit result. In previous research on the movement strategies of pursuit robots, many scholars have adopted discrete methods to simplify the problem and to improve efficiency. Benda [17] conducted a study with many pursuit robots and one escape robot based on a discrete grid environment, in which each robot was restricted to moving in the horizontal or vertical direction. On this basis, Korf [18] introduced movement along the diagonal to expand the robots' movement strategies. Zhou [19] regarded the pursuit process as a multi-stage game and established a game model to obtain the movement strategies of the pursuit robots in the pursuit process. However, rasterization of the environment limits the robots' movement to the horizontal, vertical, and diagonal directions in these studies, which creates a large gap between the research and the actual scenario. Skrzypczyk [20] discretized and combined the angular velocity and linear velocity of the robots to obtain the movement strategy of the robots so that the robots can move in more directions. This reduces the restrictions that rasterization places on the robots' movement strategy, thereby reducing the gap between the research and the actual scenario. This method addresses the multi-robot target pursuit problem on the basis of non-cooperative game theory. However, the escape robot did not take the initiative to escape, which is not consistent with the actual situation.
In research on multi-robot pursuit, the movement of the escape robot also needs to be considered. For this problem, Selvakumar [21] regarded the team of pursuit robots as one player in a game and the escape robot as the other player. They proposed a method using the game matrix to determine the movement strategy of the pursuit robots to achieve the pursuit of the active escape robot. However, only one pursuit robot takes part at a time in this study, and the remaining pursuit robots are not well used, resulting in a waste of resources. In order to improve the degree of participation of the various pursuit robots in the pursuit process, some experts give full play to the cooperation of the multi-robot system. Alexander [22] proposed multiple two-player game decomposition (MTPGD) by changing the player selection method in the game and by considering the pursuit problem as a combination of multiple zero-sum games. Among them, the encapsulated-team two-player game decomposition (ETTPGD) embodies the strict confrontation relationship between the pursuit robots and the escape robot during the pursuit process. Both MTPGD and ETTPGD can complete the pursuit. However, they do not consider the cooperative relationship between the pursuit robots, which may easily lead to conflicts among the pursuit robots. Therefore, it is necessary to consider not only the competitive relationship between the pursuit robots and the escape robot, but also the cooperative relationship of the pursuit robots, by analyzing the mutual influence of the pursuit robots in the pursuit process.
In summary, in order to improve the success rate of a pursuit task with a multi-robot system, this paper divides the multi-robot pursuit process into two stages. First, this paper adjusts the position distribution of the pursuit robots to form a pursuit configuration for the escape robot based on the pursuit conditions. Then, a movement strategy is proposed for the pursuit robots that is based on a zero-sum game to complete the pursuit task. The main contributions of this paper are:
1. For the problem of multi-robot cooperative pursuit, a two-stage cooperative pursuit strategy is proposed that considers both the position adjustment of the pursuit robots and their pursuit of the escape robot based on a zero-sum game, in order to improve the success rate of the pursuit task;
2. The cooperation of the pursuit robots and the competition between the pursuit robots and the escape robot are considered comprehensively to establish a zero-sum game model, which avoids conflict among the pursuit robots and improves the safety of the multi-robot system during the pursuit process.
The rest of the article is structured as follows. The second part describes the multi-robot pursuit problem. The third part establishes the zero-sum game model of the multi-robot pursuit problem and proposes a method to solve the game model; a pursuit robot movement strategy for the game stage is then proposed in order to pursue the escape robot. The fourth part completes the position adjustment of the pursuit robots based on the Hungarian algorithm to form a pursuit configuration for the escape robot, which completes the overall design of the multi-robot cooperative pursuit strategy. In the fifth part, a simulation experiment with four pursuit robots and one escape robot is designed to verify the effectiveness of the proposed strategy. Finally, a summary of this paper is presented.

The Description of the Multi-Robot Pursuit Problem
In the study of multi-robot cooperative pursuit, the form of the pursuit tasks will have an important impact on the design of the cooperative pursuit strategies. Therefore, the multi-robot pursuit task will be analyzed in this part to determine the movement of the pursuit robots during the pursuit process and to give a mathematical description of the multi-robot pursuit problem.
In two-dimensional space, the schematic diagram of the multi-robot pursuit task is shown in Figure 1, where the escape robot is recorded as $E$ and the $n$ pursuit robots are recorded as $P = \{P_1, P_2, \cdots, P_n\}$. Combined with the polygon forming conditions, it can be seen that $n \ge 3$ must be met to ensure that the pursuit robots can form an effective encirclement of the escape robot. Considering the operating safety of the robots, there are $n$ circular areas that represent the safe operation area of each robot, each centered at a pursuit robot's position with radius $d_{safe}$. At the same time, the circular area centered at the escape robot's position with radius $d_E$ is the safe area for the escape robot. The escape robot will not play games with pursuit robots outside of this area. During the pursuit process, the position of the pursuit robot team is recorded as $P = [P_1, P_2, \cdots, P_n]^T$. $P_i$ is the position of the $i$-th pursuit robot $P_i$, and $E$ is the position of the escape robot. The distance between the $i$-th pursuit robot and the escape robot is $d_i(t)$. When Equation (1) is satisfied, the capture condition is met, and the pursuit task is completed.
$$D_{\min}(P, E, t) \le D_{captured} \tag{1}$$
where $D_{\min}(P, E, t)$ is the minimum distance between the pursuit robots and the escape robot, and $D_{captured}$ is the capture distance.
Figure 1. The schematic diagram of the multi-robot pursuit task.
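The capture condition can be checked numerically in a few lines. The following is a minimal sketch, assuming planar positions stored as `(x, y)` tuples; the function names `min_distance` and `is_captured` are illustrative and do not come from the paper.

```python
import math

def min_distance(pursuers, evader):
    """D_min(P, E, t): smallest distance d_i(t) between any pursuit robot and the escape robot."""
    return min(math.dist(p, evader) for p in pursuers)

def is_captured(pursuers, evader, d_captured):
    """Capture condition: the closest pursuit robot is within the capture distance."""
    return min_distance(pursuers, evader) <= d_captured

# Hypothetical positions, used only for illustration.
pursuers = [(0.0, 3.0), (4.0, 0.0), (-1.0, -1.0)]
evader = (0.0, 0.0)
print(is_captured(pursuers, evader, d_captured=1.5))
```

Only the closest pursuit robot matters for the test, which is why the minimum over $d_i(t)$ is taken rather than a sum or an average.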
During the pursuit process, the movement of each pursuit robot is shown in Figure 2. Each pursuit robot moves at the maximum speed $V_P$ (the magnitude of the velocity), and the velocity of the $i$-th pursuit robot is denoted as $v_i$. The velocity of the escape robot is denoted as $v_E$, and its speed is $V_E$. The position of the pursuit robot $P_i$ at the moment $t$ is recorded as $P_i(t) = [x_i(t), y_i(t)]$, $i = 1, 2, \cdots, n$, $t \ge 0$. After a decision period $T$, the position of $P_i$ changes to $P_i(t + T) = [x_i(t + T), y_i(t + T)]$, which can be calculated by the following equation:
$$\begin{cases} x_i(t + T) = x_i(t) + V_P T \cos(\theta_i + \phi_i) \\ y_i(t + T) = y_i(t) + V_P T \sin(\theta_i + \phi_i) \end{cases} \tag{2}$$
where $\theta_i$ is the angle between the velocity direction of $P_i$ and the positive direction of the X axis at the moment $t$, and $\phi_i$ is the velocity direction deflection angle of $P_i$ in the decision period $T$, with counterclockwise deflection specified as positive. Due to the limited deflection ability of the robots, $\phi_i \in [-\phi_{max}, \phi_{max}]$ needs to be met, where $\phi_{max}$ ($\phi_{max} > 0$) is the maximum deflection angle. Since the movement description of the escape robot $E$ is similar to that of the pursuit robots, it is not described in detail.
Figure 2. The schematic diagram of the movement of $P_i$.
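The one-period position update can be sketched as follows. This is only an illustration under an assumption the text does not spell out in full: the robot is taken to turn by the deflection angle at the start of the decision period and then to travel in a straight line at constant speed for the whole period. The function name `step` and its argument names are hypothetical.

```python
import math

def step(x, y, theta, phi, speed, period, phi_max):
    """Advance one decision period T and return (x, y, heading).

    theta: heading at time t, measured from the positive X axis.
    phi:   deflection angle for this period, counterclockwise positive.
    Assumption: the robot turns by phi first, then moves straight for the period.
    """
    if abs(phi) > phi_max:
        raise ValueError("deflection angle must lie in [-phi_max, phi_max]")
    heading = theta + phi
    new_x = x + speed * period * math.cos(heading)
    new_y = y + speed * period * math.sin(heading)
    return new_x, new_y, heading

# A robot heading along +X that deflects by 30 degrees counterclockwise.
print(step(0.0, 0.0, 0.0, math.radians(30), speed=1.0, period=1.0,
           phi_max=math.radians(45)))
```

The same function serves the escape robot if its speed $V_E$ and its own maximum deflection angle are passed in.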
Based on the above, the mathematical description of the multi-robot pursuit problem is as follows: where $S(t)$ represents the deflection angles of all of the robots at the moment $t$, and $k$ is the number of decisions required to complete the pursuit.

Movement Strategy Solving Method Based on Zero-Sum Game
After completing the mathematical description of the multi-robot pursuit problem, this part constructs and solves the multi-robot cooperative pursuit model based on a zero-sum game in order to obtain the movement strategy of the pursuit robots. The pursuit robot team $P = \{P_1, P_2, \cdots, P_n\}$ is regarded as one player in the game, and the escape robot is regarded as the other player. The cooperation of the pursuit robots and the competition between the pursuit robots and the escape robot are comprehensively considered in designing the game payoff function to improve the zero-sum game model. Then, the optimal movement strategy for the pursuit robots during the pursuit process is obtained by solving the game model.

Establishment of Multi-robot Pursuit Zero-Sum Game Model
Combined with the aforementioned robot movement method, the multi-robot pursuit process is decomposed into multiple decision-making stages. Each stage is regarded as a round of the game. The strategy sets of players $P$ and $E$ should be determined first; they represent the possible movements of the pursuit robot team and the escape robot in a round of the game. In each round of the game, the construction process of the strategy set is as follows: The deflection angle $\phi_i$ of the $i$-th pursuit robot $P_i$ can be discretized as shown in Figure 3. Then, the movement strategy of the pursuit robots can be obtained: where $\phi_{max}$ is the maximum deflection angle of the pursuit robots, and $K$ is the number of discretized deflection angles. $\phi_i^j$ represents the $j$-th candidate deflection angle of $P_i$.
Figure 3. Discretization of robot motion deflection angle.
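One way to picture the discretization is as a list of $K$ candidate angles spanning the admissible range. The sketch below assumes uniform spacing that includes both end points $-\phi_{max}$ and $\phi_{max}$; that spacing rule, and the function name `deflection_angles`, are assumptions made for illustration rather than the paper's exact formula.

```python
def deflection_angles(phi_max, k):
    """Return k evenly spaced candidate deflection angles in [-phi_max, phi_max].

    Assumption: uniform spacing with both end points included.
    """
    if k < 2:
        raise ValueError("at least two candidate angles are needed")
    spacing = 2.0 * phi_max / (k - 1)
    return [-phi_max + j * spacing for j in range(k)]

# Five candidate angles (in degrees) for a maximum deflection of 30 degrees.
print(deflection_angles(30.0, 5))  # [-30.0, -15.0, 0.0, 15.0, 30.0]
```

A larger $K$ gives the robots finer directional choice, at the price of a larger strategy set to evaluate in every round.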
The strategy set $S_P$ of the pursuit robot team can be obtained by combining the movement strategies of the pursuit robots. Similarly, the strategy set $S_E$ of the escape robot can be obtained: When the pursuit robots and the escape robot each adopt a certain strategy from their strategy sets, the pursuit situation $\Omega(S_P, S_E)$ is formed. Then, we need to design the payoff function, which is used to determine the optimal solution. Combined with the basic elements of the game, the payoff function $\mathrm{Payoff}_P(S_P, S_E)$ of the pursuit robot team and the payoff function $\mathrm{Payoff}_E(S_P, S_E)$ of the escape robot can be designed to evaluate the strategies adopted by the two parties. In a zero-sum game, the sum of the gains and losses of both parties involved in the game is always zero in the case of strict competition. That is, the gains of one party equal the losses of the other party.
According to the characteristics of the zero-sum game, the payoff functions of the pursuit robot team and the escape robot have the following relationship:
$$\mathrm{Payoff}_P(S_P, S_E) + \mathrm{Payoff}_E(S_P, S_E) = 0 \tag{6}$$
It can be seen from Equation (6) that when one of the payoff functions is fully designed, the other is determined as well. The payoff function design of the pursuit robot team should take into account the cooperative relationship among the pursuit robots, while the payoff function of the escape robot needs to destroy this cooperative relationship. Provided that each robot operates safely, the pursuit robot team should maintain a compact state to enhance its cooperation capability and should gradually move towards the escape robot. The escape robot needs to break the compactness of the pursuit robot team and move away from the pursuit robots. Therefore, the payoff functions of the pursuit robot team and the escape robot are designed as follows:
1. Pursuit robot team distribution and maintenance item $F_1^P$: In order to ensure that the pursuit robot team maintains a compact distribution, we propose the pursuit robot team distribution and maintenance item $F_1^P$. It represents the degree of compactness of the pursuit robots' positions. Adjusting this item keeps the pursuit robots compactly distributed and makes it difficult for any one robot to stray too far from the team.
In the inertial coordinate system, the virtual center $C^b_{team} = [x^b_C(t), y^b_C(t)]$ of the pursuit robot team can be represented as follows: where $b = 0$ represents the current pursuit situation, and $b = 1$ represents the new situation after the pursuit strategy is applied.
The distance standard deviation $\sigma^b(P, E)$ between each pursuit robot and the escape robot and the distance $d^b(C^b_{team}, E)$ between the virtual center of the pursuit robot team and the escape robot are considered together to design the pursuit robot team distribution and maintenance item $F_1^P$:
where $K_A$ is a constant, and $\eta_1$, $\eta_2$ are the weight coefficients, with $\eta_1 + \eta_2 = 1$. When the focus is more on keeping the distances $d_i$ between the pursuit robots and the escape robot consistent, $\eta_1$ is larger. When the focus is more on the distance $d(C^b_{team}, E)$ between the virtual center of the pursuit robot team and the escape robot, $\eta_1$ is smaller and $\eta_2$ is larger. The coefficients $\eta_1$ and $\eta_2$ are tuned empirically. The distribution of the pursuit robots can be kept compact by adjusting $\eta_1$ and $\eta_2$. In this way, the pursuit robots are prevented from leaving the team to move alone, and the cooperative relationship between the pursuit robots is enhanced during the pursuit process.

2. Pursuit distance item $F_2^P$: In order to make the pursuit robot team approach the escape robot gradually, while the escape robot seeks to move away from the pursuit robots, we propose the pursuit distance item $F_2^P$. It represents the distance between the pursuit robot team and the escape robot. By adjusting this item, the pursuit robot team can gradually approach the escape robot. During the pursuit process, $D^b_{team}(P, E)$ is the total distance between the pursuit robot team and the escape robot, and $D^b_{\min}(P, E)$ is the closest distance between the pursuit robot team and the escape robot.
In combination with the above two distances, the pursuit distance item $F_2^P$ can be designed as follows: where $K_B$ is a constant, and $\beta_1$, $\beta_2$ are the weight coefficients, with $\beta_1 + \beta_2 = 1$. When the coefficient $\beta_1$ is larger, the pursuit robots are more inclined to shorten the overall distance and to contract the surrounding points. When the coefficient $\beta_2$ is larger, the pursuit robot team is more inclined to keep driving the closest pursuit robot towards the escape robot. The coefficients $\beta_1$ and $\beta_2$ are tuned empirically.

3. Robot collision avoidance item $F_3^P$: In order to improve the safety of the robots and to avoid multi-robot collisions, we propose the robot collision avoidance item $F_3^P$. It represents the collision status of each robot. By adjusting this item, the safe operation of the robots can be ensured, and the robots do not collide with each other over the course of the pursuit task. The robot collision avoidance item $F_3^P$ is designed as follows: where $d^1_{ij}(S^\omega_P, S^\upsilon_E)$ represents the distance between any two robots after the pursuit robot team adopts the movement strategy $S^\omega_P$ and the escape robot adopts the movement strategy $S^\upsilon_E$.
Based on the above sub-items, the payoff function expression is as follows: where $K_C$ is a constant, and the weight coefficients $\tau_1$, $\tau_2$, $\tau_3$ are tuned empirically, with $\tau_1 + \tau_2 + \tau_3 = 1$. Equation (13) can be used to evaluate each possible strategy combination adopted by the pursuit robots and the escape robot. Then, the pursuit robot team payoff matrix and the escape robot payoff matrix can be built from the evaluation results, as shown in Equations (14) and (15).
Here, after the escape robot adopts the movement strategy $S^\upsilon_E$ and the pursuit robots adopt the movement strategy $S^\omega_P$, the corresponding payoff function $\mathrm{Payoff}_P(S^\omega_P, S^\upsilon_E)$ is abbreviated as $\mathrm{Payoff}^{\omega\upsilon}_P$. $K$ is the number of strategies obtained by discretizing the movement strategy of the robots.
In this section, the zero-sum game model of the multi-robot pursuit problem is established based on the three basic elements of a game: players, strategy sets, and payoff functions.
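To make the three elements concrete, the sketch below assembles a team strategy set and the two payoff matrices for one round. It is an illustration only. The team strategy set is taken to be the Cartesian product of the individual candidate angles, which is one reading of "combining the movement strategies". The virtual center is taken as the mean position, and the standard deviation is the population form; both are assumptions. Most importantly, `toy_payoff` is a placeholder that merely reuses the quantities named in the three sub-items (distance spread, center distance, total and closest distance, a collision check among pursuit robots); it is not the paper's payoff function, and its equal weighting and penalty value are made up. All function names are hypothetical.

```python
import itertools
import math
import statistics

def move(pos, heading, phi, speed, period):
    """Position after one period, assuming the robot turns by phi and then moves straight."""
    angle = heading + phi
    return (pos[0] + speed * period * math.cos(angle),
            pos[1] + speed * period * math.sin(angle))

def centroid(points):
    """Virtual center of the team, taken here as the mean position (assumption)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def toy_payoff(pursuers, evader, d_safe):
    """Placeholder team payoff; NOT the paper's Equation (13)."""
    dists = [math.dist(p, evader) for p in pursuers]
    spread = statistics.pstdev(dists)                    # spread of pursuer-evader distances
    center_gap = math.dist(centroid(pursuers), evader)   # virtual center to escape robot
    total, closest = sum(dists), min(dists)              # total and closest distance
    collided = any(math.dist(a, b) < d_safe
                   for a, b in itertools.combinations(pursuers, 2))
    penalty = 1e6 if collided else 0.0                   # made-up collision penalty
    return -(spread + center_gap + total + closest) - penalty

def payoff_matrices(pursuers, headings, evader, evader_heading,
                    angles, v_p, v_e, period, d_safe):
    """Rows: team strategies (one angle per pursuit robot); columns: escape robot angles."""
    team_strategies = list(itertools.product(angles, repeat=len(pursuers)))
    i_p = []
    for team in team_strategies:
        moved = [move(p, h, phi, v_p, period)
                 for p, h, phi in zip(pursuers, headings, team)]
        row = []
        for phi_e in angles:
            new_evader = move(evader, evader_heading, phi_e, v_e, period)
            row.append(toy_payoff(moved, new_evader, d_safe))
        i_p.append(row)
    # Zero-sum relationship: the escape robot's payoff is the negative of the team's.
    i_e = [[-value for value in row] for row in i_p]
    return team_strategies, i_p, i_e
```

With $n$ pursuit robots and $K$ candidate angles each, this reading produces $K^n$ team strategies, so the matrix grows quickly and $K$ should be kept small in practice.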

The Solution Method of Optimal Game Movement Strategy
Based on the multi-robot pursuit zero-sum game model, in order to obtain the optimal movement strategy of the pursuit robots in each round of the game, this section solves the game model by solving the game payoff matrix.
In one round of the game, when the pursuit robots adopt the strategy $S_P^*$ and the escape robot adopts the strategy $S_E^*$ such that Equation (16) is satisfied, the game reaches a Nash equilibrium state. At this time, the strategies $S_P^*$ and $S_E^*$ are the pure strategy Nash equilibrium solution of the game. In the Nash equilibrium state, neither the pursuit robots nor the escape robot can unilaterally change their adopted strategy to make the situation more beneficial to themselves. $S_P^*$ and $S_E^*$ are the optimal movement strategies that the pursuit robot team and the escape robot can adopt in this round of the game. That is, the movement strategy corresponding to the pure strategy Nash equilibrium solution is the current optimal movement strategy.
The model can be solved by combining the definition of the Nash equilibrium solution with the method in references [23,24]. First, for each escape strategy of the escape robot, the optimal pursuit strategy of the pursuit robot team in response is solved, and the resulting strategy combinations form a set $T_P$.
where $S^{\upsilon\omega}$ indicates the strategy combination of the strategy $S^\upsilon_E$ that the escape robot adopts and the corresponding optimal strategy $S^\omega_P$ that the pursuit robot team adopts. Similarly, for each movement strategy of the pursuit robot team, the corresponding optimal movement strategy of the escape robot can be determined as shown in Equation (18), and these combinations form a strategy set $T_E$.
where $S^{\omega\upsilon}$ indicates the strategy combination of the strategy $S^\omega_P$ that the pursuit robot team adopts and the corresponding optimal escape strategy $S^\upsilon_E$ that the escape robot adopts. The pure strategy Nash equilibrium solution $NE$ of the game can be obtained as the intersection of the sets $T_P$ and $T_E$. When $NE$ contains only one solution, that strategy combination can be considered to be the optimal movement strategy adopted by the pursuit robot team and the escape robot in this round of the game.
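The best-response construction of $T_P$ and $T_E$ and their intersection can be sketched as follows. The sketch assumes the team payoff matrix is stored as a list of rows, with rows indexed by team strategies and columns by escape strategies, and that the escape robot's matrix is its negative; the function names are illustrative.

```python
def best_responses_pursuer(i_p):
    """T_P: for each escape strategy (column), the team strategies (rows) with the highest team payoff."""
    pairs = set()
    for c in range(len(i_p[0])):
        column = [row[c] for row in i_p]
        best = max(column)
        pairs.update((r, c) for r, value in enumerate(column) if value == best)
    return pairs

def best_responses_evader(i_e):
    """T_E: for each team strategy (row), the escape strategies (columns) with the highest escape payoff."""
    pairs = set()
    for r, row in enumerate(i_e):
        best = max(row)
        pairs.update((r, c) for c, value in enumerate(row) if value == best)
    return pairs

def pure_nash(i_p):
    """Pure strategy Nash equilibria as (row, column) index pairs: the intersection of T_P and T_E."""
    i_e = [[-value for value in row] for row in i_p]  # zero-sum game
    return sorted(best_responses_pursuer(i_p) & best_responses_evader(i_e))

print(pure_nash([[3, 1], [4, 2]]))    # [(1, 1)]: a saddle point exists
print(pure_nash([[1, -1], [-1, 1]]))  # []: no pure strategy equilibrium
```

The second example shows why the no-solution case needs separate handling: a finite zero-sum game is not guaranteed to have a pure strategy equilibrium.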
Since the game may have no solution or multiple solutions, the following processing is conducted to ensure the existence and uniqueness of the movement strategy of the pursuit robot team and the escape robot: In the case of $T_P \cap T_E = \emptyset$, the pursuit robot team and the escape robot cannot obtain the optimal movement strategy through the pure strategy Nash equilibrium solution in this round of the game. Taking into account the confrontational relationship between the pursuit robot team and the escape robot, the two sides should determine their respective movement strategies according to the principle of avoiding the strategy that would have the most adverse effect on them. By determining the minimum element $e = \min(I_P)$ of the payoff matrix $I_P$ and removing the row in which $e$ is located, the most unfavorable situation that the escape robot's movement strategy could cause is avoided. If $e$ is located in multiple rows, the second-smallest elements of these rows are compared, and so on, until only one row is determined. This operation is repeated until there is only one row left in the payoff matrix, and the corresponding strategy is the movement strategy $S_P^*$ of the pursuit robot team under the aforementioned principle. The movement strategy $S_E^*$ of the escape robot can be determined by applying a similar method to the columns of the payoff matrix $I_E$ in the current round.
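The row-elimination rule can be sketched as below. This reflects one interpretation of the text: the row to remove is the one whose sorted entries are lexicographically smallest, so that ties on the minimum element are broken by the second-smallest element, then the third, and so on. If two rows are identical after sorting, the earlier one is removed first, which is an arbitrary choice of this sketch. The function names are hypothetical.

```python
def eliminate_rows(matrix):
    """Repeatedly drop the row holding the smallest element; return the index of the surviving row."""
    remaining = list(range(len(matrix)))
    while len(remaining) > 1:
        # Sorting each row ascending lets ties on the minimum fall through
        # to the second-smallest element, then the third, and so on.
        worst = min(remaining, key=lambda r: sorted(matrix[r]))
        remaining.remove(worst)
    return remaining[0]

def fallback_pursuit_strategy(i_p):
    """Row index of the team strategy kept after elimination on I_P."""
    return eliminate_rows(i_p)

def fallback_escape_strategy(i_e):
    """Column index of the escape strategy, found by running the same rule on the columns of I_E."""
    columns = [list(column) for column in zip(*i_e)]
    return eliminate_rows(columns)

i_p = [[5, 0, 3], [2, 1, 4], [0, 6, 6]]
i_e = [[-value for value in row] for row in i_p]
print(fallback_pursuit_strategy(i_p), fallback_escape_strategy(i_e))  # 1 0
```

Under this interpretation, the surviving row is the one with the largest worst-case payoff, so the rule behaves like a maximin choice with a deterministic tie-break.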
In the case that T P ∩ T E = NE and NE has multiple solutions, the optimal pursuit strategy is not unique. Taking into account the stability of the pursuit task, the shortest distance D 1 min (P, E) between the pursuit robot team and the escape robot after executing the strategy pair (S i * P (t), S i * E (t)) is introduced as an additional evaluation criterion. The strategy with the smallest D 1 min (P, E) is selected as the optimal movement strategy.
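The tie-breaking rule for multiple equilibria can be illustrated with a short Python sketch. It assumes that, for each candidate equilibrium, the positions reached by the pursuit robots and the escape robot after executing the strategy pair are already available; the dictionary layout, the labels, and the coordinates are hypothetical.

```python
import math


def shortest_distance(pursuers, escaper):
    """D_1min(P, E): distance from the escape robot to its nearest pursuer."""
    return min(math.dist(p, escaper) for p in pursuers)


def select_equilibrium(candidates):
    """Pick the candidate whose resulting positions give the smallest D_1min."""
    return min(
        candidates,
        key=lambda c: shortest_distance(c["pursuers"], c["escaper"]),
    )


# Two hypothetical equilibria with the positions reached after one round.
candidates = [
    {"label": "NE_1", "pursuers": [(0, 0), (10, 0)], "escaper": (3, 4)},
    {"label": "NE_2", "pursuers": [(0, 0), (6, 4)], "escaper": (3, 4)},
]
print(select_equilibrium(candidates)["label"])   # NE_2
```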
The pursuit robot and escape robot movement strategies can be obtained by the above method for each round.

Design of Multi-Robot Pursuit Strategy
Based on the establishment of the zero-sum game model and the design of the game model solution method, it is necessary to conduct research on the multi-robot pursuit strategy to improve the success rate of the pursuit task. Additionally, the success rate of the pursuit task can be improved by pre-forming the surrounding configuration of the escape robot.
It can be seen from the literature [15,16] that when the speed ratio and the number n of pursuit robots meet the condition V P /V E = λ ≥ sin(π/n) and the pursuit robots are evenly distributed on a circle centered on the escape robot, pursuit robots with a lower velocity can adopt appropriate strategies to pursue an escape robot with a higher velocity, which increases the success rate of the pursuit task. In this case, the positional relationship between the pursuit robots and the escape robot is shown in Figure 4.
In order to control each pursuit robot to move from the initial position to the target position at the same time and to improve the position adjustment efficiency, this paper adopts the Hungarian algorithm [23], which is widely used to solve small- and medium-scale assignment problems, to complete the surround points allocation of the pursuit robots.
In the process of surround points allocation, the task cost matrix C n×n is constructed using the distance f (P 1 i , P 2 j ) between the initial position of each pursuit robot and each target position, where P 1 i = [x i , y i ] and P 2 j = [x j , y j ] (i = 1, 2, · · · , n, j = 1, 2, · · · , n) represent the initial position of P i and the target position after the escape robot is surrounded. By solving the task cost matrix, the target position of each pursuit robot can be determined. Then, each pursuit robot can be controlled to complete the position adjustment before the start of the pursuit game and to form a surrounding configuration around the escape robot.
After the surround points allocation and the position adjustment of the pursuit robots, the multi-robot pursuit strategy can be constructed by combining them with the zero-sum game model and its solution method described above, so that the pursuit robot team captures the escape robot. The detailed process is as follows:
Step 1: Randomly generate the initial positions of the pursuit robots and the escape robot;
Step 2: Select the target positions of the pursuit robots around the escape robot according to the initial position distribution constraints of the pursuit task, and use the Hungarian algorithm to allocate a target position to each pursuit robot;
Step 3: Each pursuit robot moves according to the result of the target position allocation and forms a surrounding configuration around the escape robot;
Step 4: A zero-sum game model of the multi-robot pursuit problem is established;
Step 5: The zero-sum game model is solved to obtain the movement strategies of the pursuit robots and the escape robot at each decision-making stage and to complete the pursuit of the escape robot.
In summary, the overall process of the multi-robot cooperative pursuit strategy designed in this chapter is shown in Figure 5.
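The surround points allocation in Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this paper: it uses the well-known O(n^3) potentials formulation of the Hungarian algorithm, takes the Euclidean distance as the cost f (P 1 i , P 2 j ), and places the first surround point at angle zero on the circle. The function names, the phase parameter, and the example coordinates are hypothetical.

```python
import math


def surround_points(center, radius, n, phase=0.0):
    """n points evenly distributed on a circle around the escape robot."""
    cx, cy = center
    return [(cx + radius * math.cos(phase + 2 * math.pi * k / n),
             cy + radius * math.sin(phase + 2 * math.pi * k / n))
            for k in range(n)]


def cost_matrix(starts, targets):
    """C[i][j]: Euclidean distance from initial position i to target j."""
    return [[math.dist(s, t) for t in targets] for s in starts]


def hungarian(cost):
    """Minimum-cost assignment for a square matrix; returns target per row."""
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-indexed)
    v = [0.0] * (n + 1)          # column potentials (1-indexed)
    match = [0] * (n + 1)        # match[j]: row assigned to column j, 0 if free
    way = [0] * (n + 1)          # predecessor column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    return assignment


# Hypothetical example: four pursuit robots, escape robot at the origin.
targets = surround_points((0.0, 0.0), 20.0, 4)
starts = [(1.0, 30.0), (30.0, 1.0), (-1.0, -30.0), (-30.0, -1.0)]
print(hungarian(cost_matrix(starts, targets)))   # [1, 0, 3, 2]
```

The returned list gives, for each pursuit robot, the index of its allocated surround point. If a path length from a planner is preferred over the straight-line distance, only cost_matrix needs to change.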
So far, this paper has completed the research on the multi-robot cooperative pursuit strategy. This strategy divides the process of multi-robot pursuit into two stages: pursuit robot position adjustment and the pursuit of the escape robot based on a zero-sum game. In the first stage, this paper realized the position transition of the pursuit robots from the randomly generated initial configuration to the surround configuration that satisfies the pursuit constraints. In the second stage, this paper designed a pursuit strategy to realize the capture of the escape robot based on a zero-sum game.

Results and Discussion
In this chapter, the correctness and effectiveness of the multi-robot cooperative pursuit strategy proposed in this paper are verified by simulation. Three or more pursuit robots are sufficient to form the pursuit conditions, and a further increase in the number has little effect on the success rate of the pursuit task. Without loss of generality, four pursuit robots and one escape robot are used as experimental objects. With the speeds of the robots meeting the condition V P /V E = λ ≥ sin(π/4), the simulation experiments are conducted for the three cases in which the speed of the pursuit robots is greater than, equal to, and less than that of the escape robot. The simulation results verify the correctness and effectiveness of the multi-robot cooperative pursuit strategy designed in this paper.

Verification of the Multi-Robot Game Pursuit Model
In order to verify the generality of the multi-robot game pursuit model, four pursuit robots with speeds of 1.2 m/s, 1.0 m/s, and 0.8 m/s in the three respective cases are used to pursue the escape robot with a speed of 1 m/s in this section.
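As a quick check of the speed condition V P /V E = λ ≥ sin(π/n) for these settings, the following short Python sketch evaluates it for n = 4. The function name is hypothetical. Since sin(π/4) ≈ 0.707, all three pursuit speeds satisfy the condition against an escape speed of 1 m/s.

```python
import math


def satisfies_surround_condition(v_p, v_e, n):
    """True if the speed ratio v_p / v_e is at least sin(pi / n)."""
    return v_p / v_e >= math.sin(math.pi / n)


v_e = 1.0
for v_p in (1.2, 1.0, 0.8):
    ratio = v_p / v_e
    ok = satisfies_surround_condition(v_p, v_e, n=4)
    print(f"V_P = {v_p} m/s, ratio = {ratio:.2f}, condition met: {ok}")
```

With only three pursuit robots, the threshold rises to sin(π/3) ≈ 0.866, so a pursuit speed of 0.8 m/s would no longer meet the condition.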
Before the game's pursuit process, the position E of the escape robot and the positions P of the pursuit robots located on the boundary of the escape robot's safe area are randomly generated. The radius d E of the escape robot's safe area is set to 20 m. During the pursuit process, the range of the motion deflection angle of each pursuit robot is set to ϕ ∈ [−π/3, π/3] rad, and the deflection angle ϕ is discretized with an interval of π/18 rad. The radius of the robots' safe operation area d safe is set to 1 m. The capture distance D capture is set to 3 m. When the distance between each pursuit robot and the escape robot satisfies Equation (1), the pursuit task is completed successfully.
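The discretization of the deflection angle can be made concrete with a small Python sketch. The range [−π/3, π/3] with an interval of π/18 gives 13 candidate deflection angles per robot. The one-step motion model below (the deflection is added to the current heading, and the robot then moves at constant speed for a time step dt) and the value dt = 1 s are assumptions made for illustration; they are not taken from this paper, and the function names are hypothetical.

```python
import math


def candidate_deflections(phi_max=math.pi / 3, step=math.pi / 18):
    """Discrete deflection angles from -phi_max to +phi_max inclusive."""
    count = round(2 * phi_max / step)      # number of intervals (12 here)
    return [-phi_max + k * step for k in range(count + 1)]


def move(position, heading, deflection, speed, dt=1.0):
    """Assumed one-step motion: turn by the deflection, then go straight."""
    new_heading = heading + deflection
    x, y = position
    new_position = (x + speed * dt * math.cos(new_heading),
                    y + speed * dt * math.sin(new_heading))
    return new_position, new_heading


angles = candidate_deflections()
print(len(angles))   # 13

# Positions reachable in one step by a pursuit robot moving at 1.2 m/s.
reachable = [move((0.0, 0.0), 0.0, phi, 1.2)[0] for phi in angles]
```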
The game pursuit process is repeated 30 times for each of the aforementioned three speed situations; the success rate of the pursuit task is shown in Figure 6. Pursuit processes selected from the repeated game pursuit simulation experiment for the different pursuit robot speeds are shown in Figure 7.
Through the comprehensive analysis of the above simulation results, it can be seen that when the pursuit robots move faster than the escape robot, the pursuit robots can always capture the escape robot with their speed advantage. In addition, the trajectories of Pursuit Robots 2 and 3 in Figure 7a show that the initial positions of Pursuit Robots 2 and 3 are very close, which is a dangerous situation for the pursuit robots. However, under the effect of the collision avoidance term of the payoff function, the pursuit robots adopted a collision-avoidance movement strategy at the initial stage of the pursuit, which increased the distance between Pursuit Robots 2 and 3 and avoided a collision between them. From the simulation results of the successful pursuit tasks in Figure 7b,d, we know that when the escape robot is faster, the pursuit robots maintain the surrounding state of the escape robot through cooperation and gradually shorten the shortest distance between the pursuit robots and the escape robot until the capture conditions are met and the escape robot is captured. This verifies that the multi-robot game pursuit model established in this paper can fully exploit the cooperative relationship between the pursuit robots. From the simulation results of the failed pursuit tasks in Figure 7c,e, we know that an unreasonable position distribution of the pursuit robots will cause the pursuit task to fail and result in a reduction of the pursuit success rate.

Validation of the Multi-Robot Cooperative Pursuit Strategy
According to the simulation in the above section, we know that the initial position distribution of the pursuit robots will directly affect the result of the multi-robot pursuit strategy. This section will verify the correctness and effectiveness of the multi-robot cooperative pursuit strategy with the position adjustment of pursuit robots.
In the multi-robot cooperative pursuit process, the initial position E of the escape robot and the initial position P 1 of each pursuit robot are randomly generated. Suppose that the initial positions of the robots are as shown in Equation (21).
According to the multi-robot cooperative pursuit strategy designed in this paper, the Hungarian algorithm is first used to allocate the surrounding points of the escape robot, and the pursuit robots adjust their positions according to the allocation results to form the surrounding configuration of the escape robot. After the surround points allocation, this paper adopts the A* path-planning algorithm to plan the movement trajectories and to calculate the distance between the initial position and the designated surround position of each pursuit robot. The selection and allocation results of the surrounding points of the escape robot based on the Hungarian algorithm are shown in Figure 8.
In order to verify the effect of the pursuit robot position adjustment in this paper on improving the success rate of multi-robot cooperative pursuit, the pursuit robot position adjustment method proposed in [16] is used for comparison. The pursuit conditions only place constraints on the target positions of the pursuit robots, but they do not determine the target positions. In this position adjustment method, each pursuit robot moves from the initial position towards the escape robot until it reaches the boundary of the escape robot's safe area. When all of the pursuit robots have moved to the boundary of the escape robot's safe area, the robots start the game's pursuit process. The selection of the surrounding points of the escape robot based on the pursuit conditions is shown in Figure 9.
In Figures 8 and 9, the red and blue circles are the initial and target positions for the pursuit robot position adjustment. The black lines are the trajectories of the pursuit robot position adjustment. The blue dot is the escape robot, and the dotted circle indicates the escape robot's safe area. It can be seen from Figures 8 and 9 that during the pursuit robot position adjustment, the trajectory of each pursuit robot is always outside of the escape robot's safe area, so the game pursuit process does not start during the adjustment.
After the pursuit robot position adjustment with the above two methods, a multi-robot pursuit zero-sum game model is established to obtain the movement strategies of the robots to complete the multi-robot pursuit task. The pursuit robot team, with different speeds V P , pursues the escape robot based on the zero-sum game; the pursuit results are shown in Figure 10.
Figure 10 panels: (a) V p < V e with the pursuit robot position adjustment method in this paper; (b) V p < V e with the pursuit robot position adjustment method in [16]; (c) V p = V e with the pursuit robot position adjustment method in this paper; (d) V p = V e with the pursuit robot position adjustment method in [16]; (e) V p > V e with the pursuit robot position adjustment method in this paper; (f) V p > V e with the pursuit robot position adjustment method in [16].
The multi-robot cooperative pursuit comparison experiment with the different pursuit robot position adjustment methods is repeated 30 times, and the success rate statistics are shown in Figure 11.
Figure 10a,c,e are the simulation results of the multi-robot cooperative pursuit strategy with different pursuit robot speeds and the position adjustment method proposed in this paper. The figure shows that the pursuit task can be successfully completed by forming a surrounding configuration around the escape robot based on the Hungarian algorithm before the start of the pursuit game. Figure 10b,d,f are the simulation results of the multi-robot cooperative pursuit strategy that adjusts the pursuit robot positions using the method in [16]. The pursuit tasks in Figure 10b,d failed because the pursuit robots did not form a surrounding configuration that satisfied the position distribution constraints for the escape robot. This pursuit robot position adjustment method leads to a reduction in the pursuit success rate. In Figure 10f, the pursuit duration is longer than in Figure 10e. This shows that the position adjustment method in [16] yields a lower pursuit efficiency than the method in this paper. The repeated experiment results in Figure 11 show that the success rate of the pursuit tasks using the position adjustment method in this paper reaches 100%. This is because when the pursuit robots adjust their positions based on the Hungarian algorithm, the escape robot is tightly surrounded by the pursuit robots. Even if the escape robot is faster than the pursuit robots, the pursuit robots can still capture it. This further shows that the multi-robot cooperative pursuit strategy in this paper can guarantee the capture of the escape robot and can effectively improve the success rate of multi-robot cooperative pursuit tasks.

Figure 11. Statistical table of the pursuit results of the different pursuit strategies. Groups: pursuit speed is greater than escape speed, pursuit speed equals escape speed, and pursuit speed is less than escape speed. Series: with the method in [16] and with the Hungarian algorithm.

Conclusions
Aiming to solve the problem of multi-robot pursuit, this paper proposed a multi-robot cooperative pursuit strategy based on a zero-sum game and surrounding points adjustment. First, this paper describes the problem of multi-robot pursuit mathematically and abstracts the actual problem into a theoretical model. Second, this paper discretizes the multi-robot cooperative pursuit process and establishes a zero-sum game model for each decision-making stage to obtain the movement strategies of the pursuit robots and the escape robot. Third, this paper controls the pursuit robots to form a surround configuration that satisfies the initial position distribution constraints of the pursuit task based on the Hungarian algorithm and designs a multi-robot cooperative pursuit strategy that divides the multi-robot pursuit task into two stages: pursuit robot position adjustment and game pursuit. Finally, the simulation shows that the multi-robot cooperative pursuit strategy proposed in this paper can realize the capture of the escape robot by means of the cooperation of the pursuit robots and can effectively improve the pursuit success rate. The multi-robot cooperative pursuit strategy proposed in this paper mainly has the following innovations:

1. A multi-robot cooperative pursuit strategy was designed, and the pursuit task was divided into two stages: pursuit robot position adjustment and game pursuit, which improved the success rate of multi-robot cooperative pursuit tasks;
2. The game model of the multi-robot cooperative pursuit tasks was optimized based on a zero-sum game, which comprehensively considered the cooperative relationship between the pursuit robots and the confrontation relationship between the pursuit robot team and the escape robot in the multi-robot cooperative pursuit process. Three payoff function items, namely a pursuit robot team distribution maintenance item, a pursuit distance item, and a robot collision avoidance item, were constructed to give full play to the advantages of multi-robot cooperation and to ensure that the multi-robot cooperative pursuit task was completed based on the safe operation of the robots.
The multi-robot cooperative pursuit strategy proposed in this paper is versatile. In follow-up research, the many-to-one confrontation problem of multiple aircraft pursuit can be considered based on the strategy proposed in this paper. In addition, in the position adjustment stage for the pursuit robots, the movement state of the escape robot is assumed to be stationary and was not considered adequately in this paper. This is not in line with a realistic scenario, and we will pay attention to this problem in follow-up research.