Optimal Control Problem Solution with Phase Constraints for Group of Robots by Pontryagin Maximum Principle and Evolutionary Algorithm

Abstract: A numerical method based on the Pontryagin maximum principle for solving an optimal control problem with static and dynamic phase constraints for a group of objects is considered. Dynamic phase constraints are introduced to avoid collisions between objects. Phase constraints are included in the functional in the form of smooth penalty functions. Additional parameters for special control modes and for the terminal time of the control process are introduced. The search for the additional parameters and for the initial conditions of the conjugate variables is performed by the modified self-organizing migrating algorithm (SOMA). An example of using this approach to solve the optimal control problem for the oncoming movement of two mobile robots is given. Simulation and comparison with the direct approach show that the problem is multimodal, which justifies the application of an evolutionary algorithm for its solution.


Introduction
Optimal control belongs to the class of complex computational problems for which there are no universal solution algorithms. The most well-known result in this area [1] transforms the optimization problem into a boundary-value problem, doubling the dimension of the problem. The goal of solving the boundary-value problem is to find initial conditions for the conjugate variables such that the vector of state variables satisfies a given terminal condition. In general, there is no guarantee that the functional of the boundary-value problem is unimodal and convex on the space of initial conditions of the conjugate variables.
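The reduction to a boundary-value problem can be illustrated by a shooting scheme: guess the initial conjugate variables, integrate the state and conjugate equations forward, and measure how far the state misses the terminal condition. A minimal sketch on a hypothetical double-integrator plant (the plant, bounds and step count are illustrative assumptions, not the model considered later in the paper):

```python
import numpy as np

def shooting_residual(psi0, x0=np.array([0.0, 0.0]), xf=np.array([1.0, 0.0]),
                      tf=2.0, steps=2000):
    """Integrate a double integrator x1' = x2, x2' = u with costates
    psi1' = 0, psi2' = -psi1 and the H-maximizing control u = sign(psi2),
    then return the miss distance to the terminal state xf."""
    dt = tf / steps
    x = x0.astype(float).copy()
    psi = np.asarray(psi0, dtype=float).copy()
    for _ in range(steps):
        u = np.sign(psi[1]) if psi[1] != 0 else 0.0  # argmax of psi2*u, |u| <= 1
        x = x + dt * np.array([x[1], u])             # state equations (Euler step)
        psi = psi + dt * np.array([0.0, -psi[0]])    # conjugate equations
    return float(np.linalg.norm(x - xf))             # boundary-value error

# The boundary-value problem: find psi0 that drives this residual to zero.
```

The search over `psi0` is exactly the finite-dimensional optimization that the evolutionary algorithm performs later in the paper.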
The optimal control problem with phase constraints is considered. The phase constraints are included in the functional, so they also enter the system of equations for the conjugate variables. This greatly complicates the analysis of the convexity and unimodality of the target functional. An accurate solution of the optimal control problem requires additional functions and regularization of the equations in the search for control [2,3]. A further difficulty in solving the boundary-value problem is determining the time at which the fulfillment of the boundary conditions should be checked. In this paper, for the numerical solution of the problem, it is proposed to use evolutionary algorithms, which have shown efficiency in solving optimal control problems [4]. SOMA is a universal algorithm for various difficult optimization problems [5,6]. However, our attempt to apply SOMA to the optimal control problem of four robots with constraints failed to find a good solution for any values of the algorithm parameters. We supposed that modifying each possible solution in the population during evolution using only the best current possible solution is not enough [7]. We therefore extended SOMA by introducing, for each possible solution in the population, the best historical solution among randomly selected ones.
The article consists of an introduction and eight sections. The statement of the optimal control problem with phase constraints is presented in Section 2. Section 3 describes the Pontryagin maximum principle as one of the main approaches to its numerical solution. Section 4 contains a description of one of the evolutionary algorithms, the modified SOMA. An example is given in Section 5. The computational experiment and its results are presented in Section 6. Section 7 describes the search for optimal control by the direct method. Alternative non-deterministic control methods are reviewed in Section 8. Results and future research directions are discussed in Section 9.

Optimal Control Problem with Phase Constraints for Group of Robots
Consider the problem of optimal control for a group of robots with phase constraints. A mathematical model of the control objects is given in the form of the system of ordinary differential equations

ẋ_j = f_j(x_j, u_j),   (1)

where x_j is a state space vector of control object j, x_j ∈ R^{n_j}, u_j is a control vector of object j, u_j ∈ U_j ⊆ R^{m_j}, U_j is a compact bounded set, m_j ≤ n_j, j = 1, …, M, and M is the number of objects. For the system (1), the initial conditions are given:

x_j(0) = x_{j,0} ∈ R^{n_j}, j = 1, …, M.
The terminal conditions are given as

‖x_j(t_{f,j}) − x_j^f‖ ≤ ε_1, j = 1, …, M,   (3)

where t_{f,j} is an unknown bounded positive value that corresponds to the time when object j reaches its terminal position x_j^f, t+ is a given time limit for the achievement of the terminal conditions (3), and ε_1 is a small positive value. The phase constraints are given as

φ_i(x_1, …, x_M) ≤ 0, i = 1, …, r.   (6)

The conditions of collision avoidance are described as

‖x_j(t) − x_k(t)‖ ≥ d, j ≠ k,   (7)

where d is a given minimal admissible distance between objects. The quality functional is given in general integral form:

J = ∫_0^{t_f} f_0(x, u) dt → min.   (8)

It is necessary to find the control as a function of time that provides the terminal conditions (3) with an optimal value of the functional (8), without violation of the constraints (6) and with collision avoidance (7). For a numerical solution of the problem, let us insert the phase constraints and terminal conditions into the quality functional (8) as penalty terms with the cutoff function µ(A) = max{0, A}:

J̃ = J + a Σ_{j=1}^{M} ‖x_j(t_{f,j}) − x_j^f‖ + b ∫_0^{t_f} Σ_{i=1}^{r} µ(φ_i(x)) dt + c ∫_0^{t_f} Σ_{j≠k} µ(d − ‖x_j − x_k‖) dt,   (9)

where a, b, c are given positive weight coefficients.
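The penalty construction can be sketched in code: constraint violations are clipped by µ(A) = max{0, A} and added to the base functional with the weights a, b, c (the function names and argument layout below are assumptions for illustration, not the paper's notation):

```python
import numpy as np

def mu(A):
    """Penalty cutoff from the text: mu(A) = max(0, A)."""
    return np.maximum(0.0, A)

def penalized_cost(base_cost, terminal_miss, phase_violations, pair_gaps,
                   a=1.0, b=1.0, c=1.0):
    """Hypothetical assembly of the augmented functional: the base integral
    cost plus weighted penalties for the terminal miss, the static phase
    constraints (positive value = violation) and the pairwise distance
    deficits d - dist between objects."""
    return (base_cost
            + a * terminal_miss
            + b * float(np.sum(mu(np.asarray(phase_violations))))
            + c * float(np.sum(mu(np.asarray(pair_gaps)))))
```

Negative entries (satisfied constraints) contribute nothing, so only actual violations are penalized.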
To solve the problem stated above, we use the Pontryagin maximum principle.

The Pontryagin Maximum Principle
The Pontryagin maximum principle allows one to transform the problem of optimization on an infinite-dimensional space into a boundary-value problem for the system of differential Equations (1). Let us construct the Hamilton function for this problem on the basis of the system (1) and the penalized quality functional without terminal conditions:

H(ψ, x, u) = −f_0(x, u) + ψ^T f(x, u),   (12)

where ψ = [ψ_1 … ψ_n]^T is a vector of conjugate variables and n = n_1 + … + n_M. According to the Pontryagin maximum principle, a necessary condition for optimal control is the maximum of the Hamilton function (12) over the admissible controls u ∈ U. The Pontryagin maximum principle thus transforms the optimal control problem into a boundary-value problem: it is necessary to find the initial values of the conjugate variables such that the state vector reaches the terminal conditions (3). To solve the boundary-value problem, we have to solve a finite-dimensional nonlinear programming problem with the functional (15), where q = [q_1 … q_n]^T is the vector of initial values of the conjugate variables and Q is a bounded compact set of admissible parameters. In a boundary-value problem, it is not known exactly when it is necessary to check the boundary conditions in (15). The maximum principle does not provide equations for the terminal time t_{f,j} of the control process, and a numerical search may produce candidate solutions that never reach the terminal condition. To avoid this problem, let us add a parameter q_{n+1} that limits the time of reaching the terminal state. As a result, the terminal conditions in the goal functional for the boundary-value problem are checked no later than t+ + q_{n+1}, where q−_{n+1} and q+_{n+1} are the lower and upper bounds of the parameter q_{n+1}. During the search process, we can decrease the time t+ according to the sign of the parameter q_{n+1}: if the found parameter q_{n+1} is less than zero, then t+ is decreased and the interval [q−_{n+1}; q+_{n+1}] of admissible values is also narrowed.
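The adaptation rule for the terminal-time parameter q_{n+1} can be sketched as follows; the shrink factor for the interval is an assumed value, since the text only states that the interval is narrowed:

```python
def narrow_time_interval(q_extra, t_plus, q_lo, q_hi, shrink=0.5):
    """Sketch of the t+ adaptation rule: if the found q_{n+1} is negative,
    decrease t+ by that amount and narrow the interval [q_lo, q_hi] of its
    admissible values (the shrink factor 0.5 is an assumption)."""
    if q_extra < 0.0:
        t_plus = t_plus + q_extra          # terminal check happens earlier
        q_lo, q_hi = shrink * q_lo, shrink * q_hi
    return t_plus, q_lo, q_hi
```

Applied between restarts of the search, this progressively tightens the time window in which the terminal conditions are checked.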

Evolutionary Algorithm
The boundary-value problem may have a nonconvex and multimodal objective functional (15) on the parameter space q; therefore, to solve this problem, it is advisable to use an evolutionary algorithm.
Evolutionary algorithms differ in how they modify possible solutions. The first evolutionary algorithms appeared at the end of the 20th century, and new ones continue to appear; currently, hundreds of evolutionary algorithms are known. Most of them are named after animals, although the connection between animal behavior and the computational algorithm is not strictly proven anywhere and is stated only by the author of the algorithm. The common steps of evolutionary algorithms are: generation of a set of possible solutions; assessment of the solutions by the objective function to find one or more best solutions; and modification of each solution by evolutionary operators in accordance with the value of its objective function and with information about the objective-function values of other solutions.
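The common steps above can be condensed into a generic skeleton; the Gaussian move toward the current leader is an illustrative operator for this sketch, not any particular published algorithm:

```python
import random

def evolve(objective, dim, bounds, pop_size=20, generations=50, sigma=0.1, seed=0):
    """Generic evolutionary loop following the common steps in the text:
    generate a population, assess it by the objective function, and modify
    each solution (here: move halfway toward the leader plus Gaussian noise)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=objective)                    # assessment: leader first
        best = pop[0]                              # elitism: leader untouched
        for sol in pop[1:]:                        # modification step
            for i in range(dim):
                sol[i] += 0.5 * (best[i] - sol[i]) + rng.gauss(0.0, sigma)
                sol[i] = min(max(sol[i], lo), hi)  # respect the bounds
    return min(pop, key=objective)
```

Concrete algorithms such as SOMA differ from this skeleton only in the modification operator.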
In this work, we investigate the application of the Pontryagin maximum principle to the optimal control problem for a group of robots with phase constraints and do not compare evolutionary algorithms. We applied one of the effective evolutionary algorithms, the self-organizing migrating algorithm (SOMA) [5,6], with the modification [7], to find the parameters, i.e., the initial conditions of the conjugate variables and the additional parameter q_{n+1} for the terminal time. In the optimization problem, we have to find the vector of optimal parameters q = [q_1 … q_{n+1}]^T that delivers the minimal value of the functional. The modified SOMA includes the following steps.
1. Generate a population of H possible solutions, where H is the cardinality of the population set and ξ is a random number from 0 to 1. Normalize the first n components of each possible solution according to (16).
2. For each vector of parameters, set a historical vector; initially, the historical vectors contain zero elements.
3. Calculate the value of the functional for each possible solution and find the best possible solution q^{j_0} at the current stage of evolution.
4. For each historical vector, find the best vector q̃^j among K randomly selected ones in the current population, where j_i ∈ {1, …, H}, i = 1, …, K.
5. Transform each historical vector, where i = 1, …, n + 1, j = 1, …, H, and α and β are parameters of the algorithm, positive numbers less than one.
6. Set a step t = δ and calculate new values for each possible solution, where Prt is a parameter of the algorithm.
7. Check each component of the new vector against the restrictions and calculate the functional for the new vector. If f^j(t) ≤ f^j, then replace the possible solution q^j with the new vector.
8. If t < Plength, then increase t by δ and repeat calculations (25)–(32), where Plength is a parameter of the algorithm.
9. Repeat calculations (22)–(32) for all possible solutions in the population. Then again find the best solution (23) and update the historical vectors (25). Repeat all stages R times.
The last best vector is the solution of the optimization problem.
The applied algorithm with the historical vector is called the modified SOMA. The value of parameter β = 0 transforms the modified SOMA back into the classical SOMA. Pseudo code of the modified SOMA is presented in Algorithm 1. In the pseudo code, the subroutine Random generates a random real number from 0 to 1, and the subroutine Random(A) generates a random integer from 0 to A − 1. The comments marked with * highlight the modification of the modified SOMA in comparison to the original SOMA.
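In addition to Algorithm 1, a compact sketch of the modified SOMA can be written as follows; the exact blending rule for the historical vectors and the default parameter values are our reading of the description above, not a verbatim transcription of the pseudo code:

```python
import random

def modified_soma(f, dim, lo, hi, H=20, R=40, K=3, alpha=0.5, beta=0.3,
                  prt=0.3, step=0.31, path_len=2.0, seed=1):
    """Sketch of the modified SOMA: individuals migrate toward the leader,
    biased by a 'historical' vector built from the best of K random picks.
    With the historical vectors at zero (beta = 0) this reduces to a
    classical SOMA all-to-one migration."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(H)]
    hist = [[0.0] * dim for _ in range(H)]          # zero-initialized history
    for _ in range(R):
        leader = min(pop, key=f)                    # best current solution
        for j in range(H):
            picks = [pop[rng.randrange(H)] for _ in range(K)]
            q_tilde = min(picks, key=f)             # best of K random picks
            hist[j] = [alpha * h + beta * q for h, q in zip(hist[j], q_tilde)]
            t = step
            while t < path_len:                     # migration along the path
                trial = [
                    min(max(x + t * (l + h - x) * (rng.random() < prt), lo), hi)
                    for x, l, h in zip(pop[j], leader, hist[j])
                ]
                if f(trial) <= f(pop[j]):           # greedy acceptance
                    pop[j] = trial
                t += step
    return min(pop, key=f)
```

The Boolean factor `(rng.random() < prt)` plays the role of the PRT perturbation mask: components where it evaluates to 0 are left unchanged during the migration step.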
The effectiveness of the modified SOMA, as of all evolutionary algorithms, depends on the parameters that determine the number of computational operations: the number of elements in the initial population (H), the number of generations (R), and the number of evolutions (P). To evaluate a single solution, we need to simulate the whole system; thus, for the problem, we have to compute the functional a minimum of H + nRP times, where n depends on the algorithm parameter Plength.
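The evaluation budget H + nRP can be computed directly, assuming n is the number of path points per migration, ⌊Plength/step⌋ (the function name is ours):

```python
def soma_budget(H, R, P, p_length, step):
    """Function-evaluation budget H + n*R*P from the text, with
    n = floor(p_length / step) path points per migration run."""
    n = int(p_length / step)
    return H + n * R * P
```

For example, with the parameters used in the direct-method experiment later in the paper (H = 32, R = 32, P = 1024, Plength = 8, step = 0.22), this gives 1,179,680 evaluations.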
As for all evolutionary algorithms, the convergence of the modified SOMA is probabilistic: the more solutions are examined, the higher the probability of finding the optimal one. In evolutionary algorithms, the value of the goal function typically decreases with the number of generations as a descending exponential. If the solution is not improved for several generations, the search is stopped, and the best current solution is taken as the solution to the problem. The optimal control problem with phase constraints is not unimodal, and the search algorithm is not deterministic; thus, to find the solution, the algorithm was run multiple times.
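The multiple runs can be organized with a simple restart wrapper; the names and structure below are illustrative, not from the paper:

```python
def multi_start(search, objective, runs=10):
    """Run a non-deterministic search several times with different seeds
    and keep the best result found across all runs."""
    best = None
    for r in range(runs):
        candidate = search(seed=r)           # each run gets its own seed
        if best is None or objective(candidate) < objective(best):
            best = candidate
    return best
```

Because the underlying search is stochastic, only the best result over all restarts is reported as the solution.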

An Example
Consider a control problem for two identical mobile robots. The mathematical model of the control objects has the following form:

ẋ_j = 0.5(u_{j,1} + u_{j,2}) cos θ_j,
ẏ_j = 0.5(u_{j,1} + u_{j,2}) sin θ_j,   (33)
θ̇_j = 0.5(u_{j,1} − u_{j,2}),

where j = 1, 2.
The control is limited:

u_i^− ≤ u_{j,i} ≤ u_i^+, i = 1, 2, j = 1, 2,

where u_i^−, u_i^+ are the given constraints. For the system (33), the initial conditions and the terminal conditions are given. The static phase constraints are

(x_j − x_i^*)^2 + (y_j − y_i^*)^2 ≥ r_i^2, j = 1, 2, i = 1, …, r,

where r_i, x_i^*, y_i^* are the given parameters of the constraints and r is the number of static phase constraints. For two robots, we have only one dynamic phase constraint:

(x_1 − x_2)^2 + (y_1 − y_2)^2 ≥ d^2,

where d is a given minimal distance between the robots. A quality functional in integral form is given. To obtain the equations for the conjugate variables, all constraints are included in the quality criterion, and the terminal conditions are excluded. Assuming that the problem is not abnormal, we write down the Hamilton function on the basis of the system (33) and the penalized functional. As a result, we obtain the differential equations for the conjugate variables, j = 1, 2, and the optimal control is calculated from the condition of the maximum of the Hamilton function subject to the control constraints. The nonlinear programming problem consists of finding the initial conditions for the conjugate variables, which have to be allocated on a sphere with a unit radius, as well as the terminal time and the parameters of the special control modes, where ε_0 is a small positive value. A goal function for the nonlinear programming problem is then constructed from the terminal conditions and the penalized functional.
The trajectories of the robots are shown in Figure 1, where red circles present the static constraints. Plots of the obtained control are presented in Figures 2–5, which show that the optimal control includes sectors of special control modes [8]. The controls u_2^1 and u_1^2 have sliding modes.

Search of Optimal Control by Direct Method
The same problem was solved by a direct numerical method. The control for each robot was sought as a piecewise-linear function on the interval, i = 1, …, K, t ∈ [i∆t; (i + 1)∆t), ∆t = 0.25. Eleven time intervals were used; hence, for each control it was necessary to find 11 values of the piecewise-linear function at the boundaries of the intervals. In total, we searched for forty-five parameters q = [q_1 … q_45]^T: forty-four parameters described the controls of the two robots, and q_45 defined the terminal time t+ + q_45. The values of the parameters were constrained. For the direct approach, when we searched for the 45 parameters by the modified SOMA, the complexity of the algorithm was the following: H = 32, P = 1024, R = H, Plength = 8, step = 0.22, n = 8/0.22 ≈ 36, H + nRP = 32 + 36 × 32 × 1024 = 1,179,680. The simulation was performed on a PC with an Intel Core i7, 2.8 GHz. A series of 10 runs was implemented; the CPU time for the 10 runs was approximately 3 h 10 min, i.e., approximately 19 min per run. We used P = 1024 for the direct approach because the number of searched parameters was 45, several times larger than the 11 parameters in the first experiment.
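The piecewise-linear control parameterization used by the direct method can be sketched as follows; clamping beyond the last node is an assumption, while ∆t = 0.25 matches the text:

```python
def piecewise_linear_control(params, t, dt=0.25):
    """Evaluate a piecewise-linear control from its nodal values params[i]
    at the interval boundaries t_i = i*dt: linear interpolation inside each
    interval, holding the last nodal value beyond the final boundary."""
    i = int(t / dt)
    if i >= len(params) - 1:
        return params[-1]                  # clamp past the last node
    frac = (t - i * dt) / dt               # position inside interval i
    return (1.0 - frac) * params[i] + frac * params[i + 1]
```

With 11 nodal values per control and four controls for the two robots, this yields the forty-four control parameters sought by the algorithm.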
The obtained solution is presented in Figure 6. The plots of the direct controls are presented in Figures 7–10.

An Alternative Non-Deterministic Control
One of the most important issues for swarm robotics applications is catching up with moving targets while avoiding multiple dynamic obstacles. The task is complicated because it requires an algorithm that works in real time to avoid obstacles, static or moving, in an unknown environment, where the robot does not know their positions until detecting them with the sensors arranged on the robot. As an alternative to the method presented above, the use of swarm intelligence algorithms, as reported in [9], can be discussed. The paper [9] presents a method for a swarm robot to catch a moving target and to avoid multiple dynamic obstacles in an unknown environment. An imaginary map is built, representing N targets, M obstacles and N robots, and a swarm intelligence algorithm is used to control them so that the targets are captured correctly and in the shortest time. The robot dynamics can be viewed as a flow of water moving from high to low ground; the flow of water is the robot trajectory, which is divided into a set of points created by the SOMA algorithm [5,6]. Simulation results are also presented to show that obstacle avoidance and target catching can be achieved using this method. All details about those experiments are discussed in [9]. The results are also visualized in selected videos (https://zelinkaivan65.wixsite.com/ivanzelinka/videa). A typical example is shown in Figure 11. Besides long-standing methods such as the potential field method [10,11] and the vector field histogram [12], several newer methods such as the "follow the gap" method [13] and barrier functions [14], as well as artificial intelligence methods such as genetic algorithms [15], neural networks [16], and fuzzy logic [17,18], have also demonstrated their effectiveness. Among the artificial intelligence methods used to solve the task as a function optimization problem, the self-organizing migrating algorithm (SOMA) stands out as a fast, powerful and efficient algorithm [5,6].

Discussion
The optimal control problem for two mobile robots with phase constraints was considered. To solve the problem, an approach based on the Pontryagin maximum principle was used. The mathematical model of the robots includes the controls linearly in the right-hand sides of the differential equations; that is why the optimal control contains sectors of special control modes. It should be noted that we used two robots to test the proposed technology and partially to test the methodology. A larger group is required to test the proposed methodology fully, and this will be our future research; however, in the case of many robots, the optimization problem will recede into the background, and collision avoidance will become the real problem.
To solve the boundary-value problem and search for the initial conditions of the conjugate variables, the modified SOMA was used. Additional parameters for the terminal condition check and for the control in special modes were introduced.
The optimal control problem was also solved by the direct approach. The control time was divided into intervals, and the control on each interval was a piecewise-linear function. An additional parameter again defined the time of the terminal condition check. The direct approach showed a different character of object movement.
The aim of this paper was to show for the first time how modern evolutionary algorithms can be applied to the solution of boundary-value problems that arise when the optimal control problem is solved by an indirect method based on the Pontryagin maximum principle. Other known applications of evolutionary algorithms have mainly used the direct approach [4].
Thus, one can conclude that the considered problem is multimodal, and the application of evolutionary algorithms to both the direct and indirect approaches is expedient and promising. The next research will be focused on an extensive comparative study of classical and swarm-control-based methods.

Conflicts of Interest:
The authors declare no conflict of interest.