Particle Swarm Optimization—An Adaptation for the Control of Robotic Swarms

: Particle Swarm Optimization (PSO) is a numerical optimization technique based on the motion of virtual particles within a multidimensional space. The particles explore the space in an attempt to ﬁnd minima or maxima to the optimization problem. The motion of the particles is linked, and the overall behavior of the particle swarm is controlled by several parameters. PSO has been proposed as a control strategy for physical swarms of robots that are localizing a source; the robots are analogous to the virtual particles. However, previous attempts to achieve this have shown that there are inherent problems. This paper addresses these problems by introducing a modiﬁed version of PSO, as well as introducing new guidelines for parameter selection. The proposed algorithm links the parameters to the velocity and acceleration of each robot, and demonstrates obstacle avoidance. Simulation results from both MATLAB and Gazebo show close agreement and demonstrate that the proposed algorithm is capable of effective control of a robotic swarm and obstacle avoidance.


Introduction
Particle Swarm Optimization (PSO) [1] is a popular swarm intelligence algorithm that is used to minimize a cost function (or maximize a fitness function) in a multidimensional space. PSO uses multiple particles, with the velocity of each particle updated based on costs evaluated and shared by the entire swarm. PSO has been used successfully in many different tasks, including artificial neural network training, scheduling problems, and calibration problems [2][3][4][5]. Optimization commonly takes place in a synthetic environment, where virtual particles are allowed to roam without any physical constraints.
The PSO algorithm itself has multiple parameters, and many studies have provided design guidelines for selecting these parameters to ensure both stability and rapid convergence [6][7][8][9][10]. These studies have made several assumptions. The non-stagnant distribution assumption model is so far the closest to completely describing PSO, and the study employing it was able to prove order-1 and order-2 stability of the algorithm [10]. These orders of stability show that over time, both the expected position of a particle and its variance converge to a constant. All the existing analyses study the evolution of the swarm as a whole, and there are no studies of the short-term behavior of each particle.

PSO in Swarm Robotics
Swarm Robotics is the study of how a swarm of small, simple, and usually identical robots can be used to perform tasks that are not readily performed by a single robot. A swarm robotic system can be characterized by its robustness, flexibility and scalability [11,12]. Swarm robotic systems can be used for many different tasks (e.g., spatial organization, collective motion and decision making [12]). The particle swarms in PSO are analogous to a physical robotic swarm. Therefore, it comes as no surprise that PSO has been proposed as a control strategy for swarm robotic systems [13]. Indeed, several modified These problems have prevented PSO from being more widely used in swarm robotics. The first problems that need to be addressed are Problems 1 and 2 since they prevent PSO from being used in swarm robotics in general (i.e., they are not limited to source localization). In this paper necessary changes are introduced to the original PSO algorithm to solve these problems. Furthermore, a formalized parameter selection technique is presented that ensures order-1 and order-2 stability of the system, while accommodating the physical constraints (velocity and acceleration limits) of the individual robots. This leads to a new form of PSO with dynamic velocity control and obstacle avoidance that outperforms existing methods. Section 2 provides an introduction to PSO. Section 3 modifies the PSO algorithm for use in swarm robotic systems and develops formalized parameter selection methods. A simulation environment and working example is described in Section 4 and results are given in Section 5.

Particle Swarm Optimization Theory
The fundamental aim of PSO is to identify the location inside a multidimensional space that minimizes a cost function by using a swarm of particles. It is also possible to maximize the function, but from this point onwards the assumption will be minimization. At each timestep, every particle calculates the cost of its current location. Each particle stores the location of its lowest cost (personal best location) and communicates it to the rest of the swarm. Therefore, every particle also knows the location with lowest cost of the whole swarm (global best location). After computing and communicating the personal and global best locations each particle updates its velocity based on these values.
Let f : R d → R be the cost function that needs to be minimized, where d is the number of dimensions. Let the particle swarm Ω[k] be a set of N particles located in R d , at timestep [k]. Each particle computes the cost of its current location using f . The movement of the particles is then determined by the displacement update equation and the velocity update equation where u i [k] and x i [k] are the velocity and displacement of each particle i at timestep [k], respectively. The • operator represents element-wise multiplication (i.e., the Schur product). Each element of the vectors r 1 and r 2 is drawn from the uniform distribution, The location y i is the personal best location for particle i, such that for any timestep Similarly, y g is the global best location, such that for any particle i ∈ [1 : N] The parameters ω, c 1 and c 2 are the inertia weight, the cognitive coefficient and the social coefficient respectively [10]. The inertia weight ω prevents the particles of the swarm from diverging uncontrollably from either the personal or global best locations. Larger values of ω are perceived as allowing more exploration and expansion of the swarm in the problem space. Lower values of ω are instead used to enable rapid convergence. The cognitive coefficient c 1 and social coefficient c 2 control the rate of convergence towards a personal best (y) or global best (y g ) location. When c 1 > c 2 , each particle will favor moving towards its personal best location, and when c 1 < c 2 , the global best location [18][19][20].
To simplify further analysis, and in line with assumptions made in previous analyses, y g will be drawn from a distribution with well-defined mean µ and variance σ [9,10]. To follow these previous analyses further, only the behavior of a single particle will be considered, and so the index i will be omitted.

Parameter Tuning
PSO stability analyses typically aim to guarantee order-1 and order-2 stability [10], although higher orders of stability can be also studied [21]. Order-1 is guaranteed by whereĉ is the behavior coefficient and is given bŷ Order-2 is guaranteed by These safe operating regions are visualized in Figure 1. The limits ensure convergence towards either of the currently known personal and global best locations [10].

Introduction to RPSO
Robotic Particle Swarm Optimization (RPSO) is a popular approach that incorporates obstacle avoidance into PSO using an additional acceleration term [14]. Therefore, the velocity update equation becomes where c 3 is the obstacle susceptibility coefficient and each element of r 3 is drawn from the uniform distribution U(0, 1). The vector F t is a virtual force used to push the robot away from surrounding obstacles and it is given by where P is the total number of obstacles around the robot and F p is a virtual repulsive force exerted by obstacle p ∈ [1 : P]. Virtual forces are a well-studied feature in swarm robotics, used to control the position of a robot relative to its direct surroundings (e.g., obstacles or other robots). They are part of the more general Virtual Physics-based Design concept. A number of papers discuss virtual forces and offer different ways of how they can be calculated, ranging from simple position-dependent functions to more complex functions that are typically inspired from real-world physical forces (e.g., spring-dumper systems) [22][23][24][25]. In this paper we make use of simple position-dependent virtual forces to achieve aggregation, but more complex functions should be equally applicable. To define F p , let s be a vector of distances between the robot and surrounding obstacles, such that at s p = 0 the robot has collided with obstacle p. Therefore, a virtual repulsive force F p is given by where x r is the position of the center of the robot, x p is the position of the center of obstacle p. The parameter γ > 0 is the sensitivity parameter and adjusts the intensity of the repulsive force. The value of c 3 relative to c 1 and c 2 greatly affects the performance of the algorithm [14]. When c 3 max(c 1 , c 2 ), the effect of the last term of (9) may not be sufficient to force the robot to avoid an obstacle. On the other hand, when c 3 max(c 1 , c 2 ), the robot becomes overly sensitive to obstacles. Furthermore, all three acceleration terms are unbounded and can increase arbitrarily, leading to further collisions.

Dynamic Velocity Control
Traditionally PSO parameters are tuned before running the algorithm and they remain constant throughout the operation of the algorithm. Couceiro et al. proposed that the value of c 3 could be dynamically recalibrated [14]-a form of Dynamic Velocity Control (DVC). When no obstacles are present, c 3 tends to 0, but it grows the closer the robot is to an obstacle. However, even if this strategy is employed the unbounded nature of the acceleration terms in (9) can still result in collisions.
For the strategy to work, it must be possible to calculate the maximum velocity that a robot can reach without colliding with an obstacle, and then ensure that this velocity is never exceeded by the RPSO controller. Fortunately, this requirement aligns with the more general problems given in Section 1.1, and they can be solved together by relating the RPSO parameters to the maximum velocity of the robot.

Adaptation of PSO for Swarm Robotics
To formalize the relationship between the PSO parameters and the velocity (and acceleration) of the robot it is necessary to re-state the governing equations. First, a modified form of the update Equation (1) is required, where ∆t represents the discrete timestep (and thus 1/∆t the update rate) of the PSO controller, which represents the time taken to evaluate a new displacement and velocity vector. Delays caused by inter-robot communication and processing of sensor input will lead to larger values of ∆t. The robot may employ other low-level local controllers for tasks that require a higher refresh rate (e.g., motor controllers, data collection and data fusion controllers etc.
The sgn function, which is given by is also taken from the work of Kennedy (where its use is implied), and limits the maximum acceleration of a single robot by constraining each term of the velocity update equation. Please note that the sgn function is not a smooth function and may result in chattering [26] under certain conditions. A smooth function can be used instead (e.g., tanh or some type of a logistic function with outputs in the range (−1,1)) to avoid this. This paper considers the sgn function for simplicity, but all the results of the following analysis can be applied to the aforementioned smooth functions as well.

Updated Parameter Tuning Stability Criteria
The stability criteria of (6) and (8) must now be redefined for (13) to ensure stability. Using any of the stability analysis methods mentioned (i.e., [9,10]), the criteria for both order-1 and order-2 stability now become However, using negative values of ω would result in the velocity of the robot changing direction at every timestep. Similarly, using negative values forĉ would drive the robots away from the personal and global best locations. Therefore, the criteria of (15) become

Control of Velocity and Acceleration
An analysis of how the inertia weight ω and the cognitive coefficientĉ impact on the velocity and acceleration of the robot is now possible. This analysis will consist of three stages: First a state model will be derived in matrix form that describes a robot at each timestep, secondly the state model will be decomposed to understand how the state changes from one timestep to another, and finally expressions for the maximum velocity and acceleration of the robot will be derived in terms of ω andĉ.
Equation (13) can be simplified based on the stability study of Trelea [7] so that whereŷ[k] is the weighted average of y[k] and y g [k] as shown beloŵ The vectorr is a random vector of which each component r j is drawn from the uniform distribution such thatr

State Model
In control theory, the state of a robot may be described by its position and velocity at a specific timestep. The state vector z[k], which describes the state of motion in each of the d dimensions for a single robot, is given by where z j [k] describes the state of motion in only a single dimension j such that where x j [k] and u j [k] are the j th components of position (x) and velocity (u). PSO algorithms have no interdependency between dimensions, therefore it is possible to use the singledimension state vector z j [k] as a general description of every dimension of z[k]. Therefore, Equations (12) and (17) can be written in matrix form for each dimension j as where the right-hand-side consists of a deterministic term Mz j [k] and a stochastic term b j [k], such that This is a second order system in terms of position and velocity. However, as the objective is to understand how ω andĉ will affect velocity and acceleration, acceleration will be included as a state variable. By differentiating (17), the acceleration is given by The positional stability of PSO has been proven previously, and both the velocity and the acceleration are linearly independent of the position [9,10]. Therefore, for the sake of simplification position will be removed from the state. Thus, a new single-dimension state vectorẑ is defined asẑ Using (17) and (21), the new state model is given bŷ where the right-hand-side consists of a deterministic termMẑ j [k] and a stochastic term

State Space Analysis
As explained, (22), describes only a single dimension j. Similarly, the following analysis will initially be performed on a single dimension j and at the end all dimensions will be combined to describe the behavior of the full velocity and acceleration vectors. Figure 2a-c are phase-space plots that describe the effect of the linear dependencies of the system (i.e., the deterministic term) for ω = 0, 0 ≤ ω < 1 and ω = 1. In these plots, each state vector (e.g.,ẑ j [k],Mẑ j [k]) describes a single point. The arrows in the phase-space plots show the direction of change fromẑ j [k] toMẑ j [k] and the length of the arrows is proportional to the magnitude |Mẑ j [k] −ẑ j [k]|. The arrows in the phase-space plots show that the linear dependencies described byM always cause the state vectorẑ to asymptotically converge towards the origin for 0 ≤ ω < 1.
The phase-space plots do not show the exact location ofMẑ j [k] on the plots. It can be shown thatMẑ j [k] always lies on the line For proof, see Appendix A.
1. An example of this can be also seen in Figure 3b. In all three cases the system is stable. Also, in (a) (extreme case) and (b) (normal case), the system is asymptotically convergent towards the origin. , the pointMẑ j [k] will always be located closer to the origin, lying on a 1 . The pointẑ j [k + 1] will always be located in between the lines a 2 and a 3 . The vectorb j [k] is always parallel to the hatching lines of the shaded-hatched region, which have gradient 1 ∆t . The shaded-hatched region represents all possible statesẑ j [k + 1]. For this system, A + j = 4 m/s 2 , A − j = 8 m/s 2 and U j = 10 m/s.

Derivation of Extreme Cases
Lastly, the purpose of this analysis was to identify the relationship between maximum velocity and acceleration and the values of ω andĉ. It can be shown that there will always exist an asymptotically maximum velocity U j ≥ 0 given by The stars (#) represent the locations with velocity U j or −U j . Similarly, it can be shown that there will always exist a maximum acceleration A + j ≥ 0 given by and an asymptotically maximum deceleration A − j ≤ 0 given by For proof see Appendix A.3. The squares (I) in Figure 3b represent the points of maximum deceleration A − j and the diamonds (N) the points of maximum acceleration A + j . To find the maximum magnitude U of the velocity vector u it is necessary to equate each of its components to U j , such that Similarly, Finally, it is now possible to find expressions for the behavior coefficientĉ and inertia weight ω in terms of A + , A − , U, d and ∆t that consider the physical capabilities of the robotŝ Equation (32) can be difficult to use in its current form. To simplify it, a new parameter β ∈ (0, 1] is introduced. In (32), it must be the case that A + (∆t) ≤ U and A − (∆t) ≤ 2U, in order for 0 ≤ ω < 1 to be satisfied. That means that the larger the maximum acceleration is compared to the maximum velocity, the smaller ∆t must be to ensure stability of the system. Therefore, A + and A − can be also expressed using Following this, it can be made sure that ω satisfies the conditions of (16) using

Generalized Adapted PSO
The analysis presented can now be applied to a more general velocity update equation that has an arbitrary number of n acceleration terms. Let us define the velocity update equation of the Generalized Adapted PSO as where y 1 , y 2 , . . . , y n are locations in the real world and Please note that the coefficients c 4 , c 5 ... c n do not have a name at this stage. Equation (35) is algebraically equivalent to (17) Therefore, it is now possible to apply the analysis presented in this paper, starting from Section 3.3, to the Generalized Adapted PSO algorithm. The rest of this section will outline design guidelines explaining how the Generalized Adapted PSO can be tuned to ensure that it outputs the desired maximum velocity and acceleration. Afterwards, the next section will show how RPSO can be adapted so that it is described by Generalized Adapted RPSO and how the design guidelines can be used to properly implement DVC (see Section 2.2.1) to avoid collisions with obstacles.

Guidelines
It is now possible to provide a set of design guidelines for the application of the Generalized Adapted PSO to swarm robotic tasks. The parameter selection steps are as follows: 1.
Identify the controller loop delay: ∆t needs to be large enough to accommodate the time delay introduced by computationally expensive tasks and communications between robots.

2.
Identify U: The desired maximum speed of the robot. It must be made sure that this does not exceed the actual maximum speed that the robot can achieve.

3.
Calculate either A + or A − : The desired maximum acceleration or deceleration using (33). It must be made sure that they do not exceed the actual maximum acceleration or deceleration that the robot can achieve.
Traditional guidelines for PSO tuning in parameter optimization tasks aim to control the convergence properties of the swarm (e.g., faster convergence, exploration/exploitation tendencies etc.). This is because in original PSO, the PSO parameters are directly linked to the convergence behavior of the swarm. In Generalized Adapted PSO though, the PSO parameters are primarily used to provide optimal control of the robots. The guidelines provided ensure that the values of ω, c 1 , c 2 , . . . , c n are properly tuned to provide the desired maximum velocity and acceleration, which will often be the physical maximum velocity and acceleration of the robot. Increasing the values of these parameters further will not result in faster convergence, and it can cause desynchronization between the PSO controller and the robot, resulting in poor control. Therefore, if faster convergence is required, robots with higher maximum velocity and acceleration need to be used. The practitioner can still control the exploration/exploitation tendencies of the swarm, by adjusting the values of c 1 , c 2 , . . . , c n , like in the original PSO, but the limitations and guidelines provided in this paper should also be followed.
Generalized Adapted PSO can become more computationally intensive, the more terms are added to the velocity update equation. Nevertheless, due to the simplicity of the algorithm, it is expected to be computationally simple compared to other tasks that robots usually need to perform (e.g., distance measurement using LiDAR, communication, vision etc.). On the other hand, other computationally intensive tasks may interfere with the operation of Generalized Adapted PSO. To avoid this, the value of ∆t needs to be carefully selected. The timestep size ∆t is a powerful feature of Generalized Adapted PSO that allows the robot to predict its state in the next timestep. Therefore, as long as ∆t accommodates all possible delays, it can ensure optimal control of the swarm.

Application to a Real-World System
In contrast to previous work, the analysis presented in this paper focuses on the timestep-to-timestep behavior of the robots. Therefore, (31) and (32) can be used to dynamically adjust the Generalized Adapted PSO parameters independently for each robot, based on the maximum velocity and acceleration that are desired at any time. This can be achieved by using the guidelines of Section 3.5. This is a novel and important capability, and an absolute necessity for the implementation of DVC. To showcase the importance of this capability, and the power of DVC, several simulations were performed.

Implementation of Dynamic Velocity Control
The following algorithm aims to dynamically adjust the RPSO parameters at each timestep, to control the maximum velocity of each robot. The algorithm forces the robot to move at lower speeds, the closer it is to obstacles or other robots. When implemented correctly, this can prevent collisions with obstacles and other agents.
First, it is important to bound each accelerating term of (9), using the sgn function. Therefore, the RPSO velocity update equation becomes Now (31) and (32) can be used to dynamically re-calibrate the RPSO parameters at every timestep (given thatĉ = c 1 + c 2 + c 3 ). Thus, the robot can slow down in the presence of an obstacle so that it is easier to avoid. In order to perform this dynamic re-calibration s (the list of distances to the nearest obstacles) is sorted in order from smallest to largest, such that s 1 is the distance to the closest obstacle. Similarly, let be a vector of speeds, such that ν 1 is the minimum speed required for the robot to collide with the closest obstacle in the next timestep. The value ν p can be used to prevent collision with obstacle p by selecting a desired maximum speed U using where 0 < α < 1. The value of α can be reduced to accommodate for a larger error in odometry and distance measurements. From here, the RPSO parameters are calibrated using the guidelines of Section 3.5. The calibration strategy follows three main steps.
Step 1: The sum c 1 + c 2 must be small enough to ensure that at least for the next timestep, it will be impossible for the robot to collide with the closest obstacle. In this case, c 3 is assumed to be 0 andĉ = c 1 + c 2 . This addresses the case where the robot is not repelled by the closest obstacle at a specific timestep, because r 3 happens to be small.
• Usingĉ = c 1 + c 2 , calculate c 1 and c 2 based on the desired ratio c 1 c 2 .
Step 2: The sum c 1 + c 2 + c 3 must be small enough to ensure that at least for the next timestep, it will be impossible for the robot to collide with the second closest obstacle. This prevents the problematic case where while being repelled by the closest obstacle, the robot ends up colliding with another obstacle. In this case,ĉ = c 1 + c 2 + c 3 . As before, • Set the desired maximum speed U = α × s 2 . • Using (33), calculate the desired maximum acceleration A + . Please note that β needs to be the same as in the previous step, in order to result in the same value for ω as described in (34). • Using (31),ĉ = A + (∆t) Step 3: The inertia weight ω can be calculated using (34). One important characteristic of this calibration strategy is that when s 1 = s 2 , then c 3 = 0. This means that when the robot is at an equal distance from two obstacles, there is no repulsive effect on the robot, allowing it to pass through the obstacles. This will happen no matter how big the opening is between the obstacles, as long as the robot can fit through it.

Results
To demonstrate the performance of the DVC algorithm proposed in Section 4.1, a number of simulations were performed in MATLAB and Gazebo [27]. The MATLAB simulations were idealized, whereas the Gazebo simulations included a more detailed real-time physics model where the inertia of the robots is applied. The algorithms compared in the simulations are: • The original RPSO algorithm described by (9) with constant parameters. • The adapted RPSO algorithm described by (38) with constant parameters. • The adapted RPSO algorithm described by (38) with DVC.

World and Robot Description
The world used in all simulations is shown in Figure 4, the blue square is the area where the robots are initialized, the red square is the target (global minimum of the cost function, the cost is equal to the distance between the robot and the source) and the circles are obstacles. The obstacles become denser the closer to the target.
Swarm intelligence algorithms such as PSO are inherently scalable [13]. That means that the same algorithm should be applicable to large swarms (>100 robots) and small swarms (<20 robots) without additional tuning. That said, there is a minimum number of robots required for a swarm to be effective. In this paper, it was chosen to demonstrate the new PSO algorithms on a small swarm of 6 robots. The robots of the swarm are based on the Robotnik Summit XL Steel platform [28], a popular robotic platform with available specifications and simulation models (e.g., Gazebo models). The robots can move within the two-dimensional world, and collisions with obstacles or other robots can occur. If a collision occurs the robot is considered to become disabled and cannot move any further. The robots are assumed to have unlimited communication range and bandwidth, and each robot can communicate with every other robot in the swarm at all times. A low-cost obstacle-detection short-range LiDAR sensor can have maximum range starting from 4 m [29], so the obstacle-detection range was set to 3 m, to avoid operating at the sensor's maximum range. Due to the way that the Gazebo simulations operate, a robot can detect obstacles in its detection range even if they are "hidden" behind other obstacles. This contradicts how a LiDAR would detect obstacles and it can result in multiple repulsive forces being exerted from the same direction. To avoid this, each robot separates its surroundings into six equally sized radially separated regions. Only the repulsive forces exerted by the closest body in each region are accounted for the calculation of the total repulsive force.
The robots are limited to 3 m/s maximum speed (the actual maximum speed of the Summit XL Steel). The actual maximum acceleration of Summit XL Steel is not available, but it is assumed to be very high, since it uses electric motors which are characterized by high acceleration. To achieve a large A + , the weight β must be also large and therefore ω needs to be small (see (33) and (34)).
The cognitive coefficient c 1 allows the robot to explore the environment around it and overcome obstacles, while the social coefficient c 2 encourages the robot to move towards the global best location y g . As both coefficients are of importance in source localization tasks it is assumed that The values for c 1 , c 2 and c 3 can be either constant or dynamic at every timestep. Both scenarios will be studied in the following simulations.

Original RPSO with Constant Values
For the original RPSO algorithm, the magnitudes of the single values c 1 , c 2 and c 3 rarely matter. This is because it is very easy for the resulting velocity of the algorithm to be higher than the maximum velocity of the robot. Instead, what matters is the ratio c 3 c 1 +c 2 , as it is also suggested by the original RPSO work, since this will control the direction of the requested velocity. In order to allow direct comparison between the algorithms, the same cases will be used for the original RPSO, as for the adapted RPSO with constant values.

Adapted RPSO with Constant Values
In the case of the adapted RPSO, all the terms of the velocity update equation are bounded using the sgn function. Therefore, it is possible to tune the parameters using (31) and (32). The parameters are tuned so that the desired maximum velocity U is always equal to the physical maximum velocity of the robot. The desired maximum acceleration A + can be calculated using (33). From here, it is now possible to calculate the values of c 1 , c 2 and c 3 , based on the desired value of the ratio c 3 c 1 +c 2 . For the following simulations, three cases will be tested: c 3 ≈ c 1 + c 2 , c 3 c 1 + c 2 , and c 3 c 1 + c 2 .

Adapted RPSO with DVC
For adapted RPSO with DVC, c 1 , c 2 and c 3 will be recalibrated at every timestep for each robot depending on its current state, as explained in Section 4.1. Lastly, the simulations were created to resemble typical real-world robotic applications (e.g., a swarm of drones navigating through a forest or a city). The values of some of the parameters used (α, β, γ, ∆t and ω) were selected heuristically to match such applications. Changing the value of the parameters β, γ, ∆t and ω is expected to affect all cases in the same way. In the case of α, it is only used by one of the tested cases (adapted PSO with DVC) and optimization of this parameter is beyond the scope of this paper. Table 1 shows the values assigned to these parameters throughout all the simulations, along with the values of the parameters c 1 , c 2 and c 3 for each tested case.
To assess the performance of each swarm, the fitness of the swarm is calculated using where the right-hand side of the equation is the horizontal distance from the origin to the Center of Mass (CoM) of the swarm. As previously mentioned, robots that have collided with obstacles or other robots are considered to be "collided". The percentage of robots that have collided by the end of the simulation is also used as a secondary metric.

MATLAB Simulations
MATLAB was used to simulate the swarm over 100 repeats, with no physics engine. This number of simulations was selected because it could produce a clear behavioral trend for each algorithm. Figure 5a shows the median CoM fitness over time for each algorithm and Figure 6a shows the median number of collided vs operational robots at the end of the simulation.
As it can be seen from the results of Figure 5a, with adapted RPSO with DVC, the CoM of the average swarm manages to pass through the fifth layer of obstacles (fifth dotted line) before the end of the simulation. This contrasts with the other algorithms that do not manage to pass through the third layer. All cases of the original RPSO appear to progress quickly at the beginning, this is in fitting with the fact that the original RPSO almost always operates at the maximum velocity permitted by the physical constraints of the robot. On the other hand, all cases of the adapted RPSO (including DVC) progress more slowly.
In Figure 6a, it can be seen that all cases of both the original RPSO and the adapted RPSO follow the predicted behavior, i.e., as c 3 gets larger compared to c 1 + c 2 , the number of collisions decreases. For small c 3 , adapted RPSO results in only collisions, while for large c 3 , it results in no collisions. All the cases of original RPSO however have very low survivability. Lastly, the only cases that end up with absolutely no collisions are the adapted RPSO with large c 3 and the adapted RPSO with DVC. Comparing the two cases in Figure 5a, it can be seen that the adapted RPSO with large c 3 has the lowest overall fitness out of all cases. In contrast, the adapted RPSO with DVC has the highest overall fitness. This shows that the adapted RPSO with DVC completely overshadows all other cases, both in terms of fitness and robot survivability. This is attributed to the DVC strategy used. The strategy makes use of the velocity boundaries introduced by adapted RPSO, to slow down a robot in the presence of an obstacle, making it practically impossible to collide with any obstacles or other agents. At the same time, the robot is still capable of navigating through small openings; a capability that is not shared by the other two algorithms.

Gazebo Simulations
To expand on the results of the MATLAB simulations, a number of more realistic simulations were run in Gazebo (benefiting from a detailed Physics Engine). The robots are equipped with contact sensors to detect collisions and with mecanum wheels for holonomic motion. The motion of the robots is governed by the forward and inverse kinematic equations of the mecanum wheels [30]. The robots move by forces being applied on them and they have inertia. Please note that for the Gazebo simulations, apart from the high-level swarm behavior controller described by each studied RPSO case, each robot also employs underlying low-level motor controllers and data collection nodes that operate at a higher refresh rate. The timestep used for these controllers is ∆t = 0.1 s. Figures 5b and 6b show the median CoM fitness over time and the median number of collided vs operational robots for each tested case, respectively. Comparing Figure 5a,b, it can be seen that there are small differences. Specifically, there is a small reduction in the overall performance of the adapted RPSO cases, which can be probably attributed to the imperfect motion of mecanum wheels (i.e., the maximum speed of the robot is limited when it is moving towards certain directions). However, the adapted RPSO with DVC performs better than the other algorithms, reaching an average fitness of 47 by the end of the simulation. When it comes to Figure 6a,b, the results look almost identical for all cases. The original RPSO with c 3 c 1 + c 2 appears to have limited survivability that is not observed in the MATLAB simulations.

Discussion
This paper has introduced a modified version of the Particle Swarm Optimization algorithm, called Adapted PSO, for use as a robot controller in robotic swarms. This was achieved by bounding the terms of the PSO velocity update equation using the sgn function and by including the timestep size ∆t. A PSO parameter selection process was also formalized, by analysing the timestep-to-timestep behavior of a PSO particle. The new parameter tuning equations offer direct control of the desired maximum velocity and maximum acceleration of each robot.
To validate the proposed changes, another modified PSO algorithm called Robotic-PSO (RPSO) that includes obstacle avoidance, was adapted according to the proposed guidelines. The parameter tuning equations were also used to dynamically retune the parameters of Adapted RPSO in real time; a process called Dynamic Velocity Control (DVC). Adapted RPSO was compared to original RPSO in simulations, and it was shown to offer significantly better control over the swarm. Adapted RPSO with DVC was able to navigate inside a difficult environment of obstacles without resulting in any collisions. This contrasts with the original RPSO which was not able to navigate far into the environment and almost always resulted in collisions.
The inclusion of the sgn function effectively bounds the terms of the velocity update equation, addressing Problem 1. The inclusion of the parameter ∆t in the parameter tuning equations solves the synchronisation problem between the PSO position and velocity update equations, addressing Problem 2. The effect of this can be directly seen in the results of Sections 5.2 and 5.3, where ∆t = 1 s. Such a large loop delay is typically unsuitable for most applications of this scale (1-100 m) as it can result in collisions and poor control of the robot. That being said, adapted RPSO with DVC controls the swarm such that robots of diameter 1 m can pass through openings of size 1.2 m without any risk of colliding, even when such a large value of ∆t is used.
In its current form, the PSO controller presented in this paper can be used for tasks that require the swarm to move to a certain location by setting the global best location y g as that location. It is furthermore possible to use the algorithm in several source localization and tracking tasks (i.e., source localization using olfaction). That being said, for PSO to be used as a generalized source localization algorithm, Problems 3 and 4 still need to be overcome. Nevertheless, there already exist candidate solutions. For example it might be possible to overcome the limitation that PSO needs to operate in an immutable environment (Problem 3), by slowly increasing the cost of the current personal best and global best locations exponentially, forcing the robot to re-update them. The performance of these candidate solutions can now be properly implemented on simulated and physical swarms using the methods proposed in this paper, ensuring that the operation of the swarms will not be affected by Problems 1 and 2.
Finally, the adaptation of RPSO in this paper provides a direct application of the adapted PSO algorithm presented. The performance of Adapted RPSO with DVC in Sections 5.2 and 5.3 shows that it is possible to incorporate different tasks into the Generalized Adapted PSO velocity update equation as individual terms and control them by re-calibrating the PSO parameters at each timestep. Similarly, it might be possible to incorporate other tasks (e.g., flocking, target trapping and pattern formation) into the Generalized Adapted PSO equation as additional PSO terms. Each task is run separately from the rest and they are all merged by the Generalized Adapted PSO velocity update equation. The practitioner needs only to identify a simple strategy for the re-calibration of the PSO parameters that will control which tasks are given more importance in different situations. In this way, the Generalized Adapted PSO velocity update equation takes the form of a general swarm control framework for robotic swarms. Such a framework could eventually lead to the standardization of swarm intelligence algorithms.
The work presented is generalized, and there exist many parameters that can affect the behavior of the robot that could be further included into the tuning process. Some of the most important are minimum turning radius, maximum rotational speed and maximum rotational acceleration. These characteristics represent significant obstacles in the use of PSO as a controller for the traditional non-holonomic robots (e.g., differential drive, traditional steering, forward flight etc.).

Conclusions
This paper introduced a new PSO algorithm called Adapted PSO and formalized parameter selection guidelines that specifically enable the application of PSO as a controller in robotic swarms. This has been achieved by considering the physical properties of the robots, including the desired maximum velocity and acceleration, and relating these to the inertia weight and the cognitive and social coefficients via a state model. Coupled with the introduction of the controller loop delay, these new guidelines also guarantee both order-1 and order-2 stability. Thus, solving the two key problems (lack of constraints and asynchronous control) that have so far limited the formal application of PSO to the control of robotic swarms. The new algorithm is compared to original PSO in simulations, and it is shown to excel both in terms of navigation through a difficult environment and robot survivability.
Future work should include further physical limitations, such as angular acceleration, and the application of formalized parameter selection techniques for the tuning of other variations of PSO (e.g., [14][15][16]).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. Appendix A. Lemma 1 Lemma 1. For allẑ j [k] ∈ R 2 ,Mẑ j [k] will always lie on the line given by:

Proof. The matrixM has eigenvectors
, a, b ∈ R and respective eigenvalues λ − = 0, λ + = ω MatrixM is diagonalizable, since it is of size 2 × 2 and has 2 distinct eigenvalues. Therefore, the column space ofM is fully described by the span of the eigenvectors that are associated with non-zero eigenvalues as shown below C(M) = span({v + }) for 0 ≤ ω < 1 This implies that C(M) is a line and its characteristic equation is given by and the vectorMẑ j [k] will always lie on it.