1. Introduction
The Particle Swarm Optimisation (PSO) technique was proposed and initially developed by the electrical engineer Russell C. Eberhart and the social psychologist James Kennedy. The method was described in two papers [1,2] co-authored by them and published in 1995, one of which bears the exact name of the technique as its title.
This technique had (and still has) a deep connection with social relations, concepts and behaviours that emerged from a computational study, conducted by those authors, of a simplified social model of a bird flock seeking food. It belongs to so-called swarm intelligence, an important and extensive research area within natural computing.
The PSO method is based on the premise that knowledge arises not only from the social sharing of information across generations but also from sharing among elements of the same generation. Although PSO has some characteristics that, to a certain extent, resemble those found in other population-based computational models, such as Genetic Algorithms (GA) and other evolutionary computing techniques, it has the benefit of being relatively simple, and its algorithm is comparatively easy to describe and implement.
In fact, its simplicity and apparent competence in finding optimal solutions in complex search spaces led the PSO algorithm to become well known among the scientific community, which contributed to its study and improvement. Thus, many approaches were suggested and different applications were tested with it, especially over the past decade. This review is intended to summarise all the main developments related to the PSO algorithm, from its original formulation up to current developments.
This review is organised as follows: Section 2 introduces the original PSO approach suggested by Eberhart and Kennedy [1,2]. Section 3 presents the most important parameter modifications and the main topological neighbourhood structures used with PSO. In Section 4, several PSO variants and their applications are presented. Subsequently, Section 5 introduces a number of hybrid algorithms resulting from combinations of PSO with other artificial intelligence tools. Finally, the last section presents some concluding remarks.
2. Particle Swarm Optimisation
The PSO computational method aims to optimise a problem iteratively, starting with a set, or population, of candidate solutions, called in this context a swarm of particles. Each particle knows the global best position within the swarm (and its corresponding value in the context of the problem), along with the individual best position (and its fitness value) that it has found so far during the search process in the problem’s solution space.
At each iteration, the velocity and the position of each particle in the swarm, represented by d-dimensional vectors, are influenced by the individual and the collective knowledge, which direct the repeated flights of the particles over the space of possible solutions to the problem in search of the optimum, until a suitable stopping criterion is satisfied.
The velocity of each particle $i$ in the swarm, at every iteration $t$, is updated according to the following equation [3]:

$$v_i(t+1) = v_i(t) + c_1\, r_1(t) \odot \big(p_i(t) - x_i(t)\big) + c_2\, r_2(t) \odot \big(g(t) - x_i(t)\big) \quad (1)$$

where $c_1$ and $c_2$ are real acceleration coefficients known respectively as the cognitive and social weights, which control how much the individual and global best positions should influence the particle’s velocity and trajectory.
In the original PSO algorithm [2], both $c_1$ and $c_2$ are equal to 2, making the weights for the social and cognition parts, on average, equal to 1.
In multimodal problems, where multiple areas of the search space are promising regions, the fine-tuning of these parameters is even more critical to avoid premature convergence.
$r_1(t)$ and $r_2(t)$ are uniformly distributed $d$-dimensional random vectors in $[0,1]^d$, which are used to maintain an adequate level of diversity in the swarm population. Finally, $p_i(t)$ and $g(t)$ are, respectively, the personal or individual best position of particle $i$ at iteration $t$, and the current global best position of the swarm.
In turn, the position of each particle $i$, at every iteration $t$, varies according to the following equation [3]:

$$x_i(t+1) = x_i(t) + v_i(t+1) \quad (2)$$

Note that $x_i(0)$ and $v_i(0)$ can be generated using a uniformly distributed random vector, whereas the particle’s best personal position should be initialised with its initial position; i.e., $p_i(0) = x_i(0)$.
The information about the best personal position (and its fitness value) then flows through the imaginary connections among the swarm particles, making them move around in the d-dimensional search space until they find the best position that fulfils all the problem’s constraints.
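The update rules above can be condensed into a short sketch. The following is an illustrative minimal gbest PSO; the sphere objective, the initialisation bounds and the velocity clamp are assumptions for the example, not part of the original formulation:

```python
import random

def pso(objective, dim, swarm_size=20, iters=100, c1=2.0, c2=2.0,
        vmax=4.0, seed=42):
    # Minimal gbest PSO: velocity and position updates as described above,
    # with a simple velocity clamp (Vmax) as used in early PSO implementations.
    rng = random.Random(seed)
    x = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [xi[:] for xi in x]                      # personal best positions
    pbest_f = [objective(xi) for xi in x]            # and their fitness values
    g = min(range(swarm_size), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]         # global best
    for _ in range(iters):
        for i in range(swarm_size):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][k] += (c1 * r1 * (pbest[i][k] - x[i][k])
                            + c2 * r2 * (gbest[k] - x[i][k]))
                v[i][k] = max(-vmax, min(vmax, v[i][k]))
                x[i][k] += v[i][k]
            f = objective(x[i])
            if f < pbest_f[i]:                       # update personal best
                pbest[i], pbest_f[i] = x[i][:], f
                if f < gbest_f:                      # update global best
                    gbest, gbest_f = x[i][:], f
    return gbest, gbest_f

# Minimising the 2-D sphere function as a toy example.
best, best_f = pso(lambda p: sum(c * c for c in p), dim=2)
```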
These stochastic changes towards the $p_i(t)$ and $g(t)$ positions are conceptually similar to the crossover (or recombination) operation, which is the main exploration operation used by GA. However, in PSO, this operation is not necessarily applied according to a random probability.
The PSO algorithm has some advantages when compared to other continuous optimisation techniques; for instance: (i) it makes no assumptions about the continuity or differentiability of the objective function to be optimised; (ii) it does not need to compute the gradient of the error function; and (iii) it does not need good initial starting points or deep a priori knowledge about the most promising areas of the search space.
Besides that, PSO is a problem-independent algorithm; i.e., it can be used in a wide range of applications, since the only information needed to run the algorithm is the fitness evaluation of each candidate solution (and possibly the set of constraints of the problem).
The PSO algorithm has become better known over time, leading to other studies that extended its original formulation. Many variants have been suggested, such as the adoption of different communication structures (such as the ring and star topologies, often referred to as lbest models) as alternatives to the original approach (the gbest model), wherein all particles are connected with each other [4,5,6].
The Gbest and Lbest Models
A gbest model swarm, with $s$ particles, is formally defined as:

$$g(t) \in \{p_1(t), \ldots, p_s(t)\} \quad \text{such that} \quad f\big(g(t)\big) = \min\big\{f(p_1(t)), \ldots, f(p_s(t))\big\}$$

where $g(t)$ denotes the position of the best particle of the entire swarm, or of its neighbourhood, in a $d$-dimensional search space, also known as the target particle.
In this model, the information about the new positions found by any particle in the swarm is shared among all the other particles; the best particle thus turns into a kind of magnet, making all the particles converge towards its position.
On the other hand, in an lbest model, the neighbourhood of size $l$ of a particle $i$ is defined as:

$$N_i = \big\{p_{i-l}(t),\, \ldots,\, p_{i-1}(t),\, p_i(t),\, p_{i+1}(t),\, \ldots,\, p_{i+l}(t)\big\}$$

with indices taken modulo $s$, so that the topology wraps around.
Although this description of the lbest model assumes essentially a linear ordering of particles in a neighbourhood, which is not sufficiently generic, it is important to note that the neighbourhood may also use a two (or higher) dimensional topology.
The lbest model is formulated as below:

$$l_i(t+1) \in N_i \quad \text{such that} \quad f\big(l_i(t+1)\big) = \min\big\{f(a)\big\},\; a \in N_i$$
This means that, instead of sharing the information among all the particles in the swarm, the lbest model restricts the knowledge to the particles that are neighbouring each other. When l is set to be equal to s, the lbest model is equivalent to the gbest model.
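For the index-based lbest topology, the neighbourhood and the local best selection can be sketched as follows; the function names are illustrative and fitness minimisation is assumed:

```python
def ring_neighbourhood(i, l, s):
    # Indices of the ring neighbourhood of particle i: l neighbours on each
    # side, wrapping around the ends of the index range (size 2l + 1).
    return [(i + k) % s for k in range(-l, l + 1)]

def lbest_index(i, l, fitness):
    # Best (minimum-fitness) particle within i's ring neighbourhood.
    return min(ring_neighbourhood(i, l, len(fitness)), key=lambda j: fitness[j])
```

With `l` covering the whole swarm, the selection reduces to the gbest model, as noted above.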
The neighbourhood of each particle can be defined by the particles’ indices; however, it can also be defined by the distances between particles. In the latter case, the neighbourhood sets can be time-varying.
4. Particle Swarm Optimisation Variants
4.1. Cooperative Particle Swarm Optimisation
Due to the similarities between GA and PSO algorithms, some researchers started to propose PSO variants that combined the PSO algorithm with the operations used in GA.
An example of this is Cooperative PSO (CPSO), a PSO variant proposed by van den Bergh and Engelbrecht [36] and later improved by the same authors [37]. The CPSO algorithm incorporates the concept of cooperation used in GA, wherein all subpopulations have to cooperate by contributing and exchanging information.
They suggested that this concept can also be applied to PSO by using one swarm for each dimension, instead of having only one for all dimensions. Thus, each subpopulation only has to optimise a 1-D vector. Although this approach seems simple, some changes to the original algorithm have to be made, especially to the evaluation of the objective function, which still requires a d-dimensional array as input.
Thus, a context vector was used to overcome the problem of the objective function evaluation. This vector is built at every iteration and takes one component from the best particle of each sub-swarm. Then, for each component, if the new value is better than the previous one, that specific component of the context vector is updated (and so is the best individual fitness value).
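A sketch of this context-vector evaluation, assuming a minimisation problem; the sphere objective and the names used here are illustrative:

```python
def evaluate_component(objective, context, j, value):
    # CPSO-style evaluation: substitute a candidate value for dimension j into
    # the context vector (one component per sub-swarm) and evaluate the full
    # d-dimensional objective.
    trial = context[:]
    trial[j] = value
    return objective(trial)

# Illustrative 3-D sphere objective and context vector.
sphere = lambda p: sum(c * c for c in p)
context = [1.0, 2.0, 3.0]        # best component contributed by each sub-swarm
f_new = evaluate_component(sphere, context, 1, 0.5)
if f_new < sphere(context):      # candidate improves -> update that component
    context[1] = 0.5
```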
The first variant, CPSO-S, splits the search space into exactly $d$ subspaces [36]. On the other hand, motivated by the fact that components may be correlated, in the CPSO-S$_K$ algorithm, proposed later by van den Bergh and Engelbrecht [37], the search space is divided into $k$ subspaces, where $k \leq d$, which makes it a generalisation of the CPSO-S algorithm.
CPSO-S$_K$ converges to the local optima of the respective subspaces, which makes it more prone to becoming trapped in local optima. However, according to those authors, it converges faster than PSO.
PSO, on the other hand, is less likely to become trapped in locally optimal positions than the CPSO-S$_K$ algorithm, because its optimisation process considers the dimensions as a whole.
Thus, CPSO-H$_K$, a hybrid approach combining CPSO-S$_K$ and PSO, was suggested by van den Bergh and Engelbrecht [37] to take advantage of the properties of both algorithms, resulting in a fast algorithm with an improved mechanism for escaping local optima.
In an overall assessment, the CPSO-S$_K$ and CPSO-H$_K$ algorithms perform better than PSO in terms of both the quality of the solutions found and performance [37], especially when the dimensionality of the problem increases.
Two Steps Forward, One Step Back
Before getting into the details of the CPSO-S$_K$ and CPSO-H$_K$ [37] algorithms, van den Bergh and Engelbrecht [36] identified a problem with PSO, which they named two steps forward, one step back. They observed that, at each iteration, PSO changes all the elements of the $d$-dimensional vector at once, so that some components may move closer to the optimal solution while others move away from it. PSO accepts the new candidate solution as long as its overall fitness value is lower than the previous one (when considering minimisation problems).
In their paper, they showed an example of this weakness of PSO using a vector of three components, wherein one component already had the optimal value, but that value changed to a poor one in the next iteration. Despite that, the other two components improved, and so did the fitness value.
In this case, two components improved while one did not, taking the algorithm two steps forward and one step back. To overcome this problem, van den Bergh suggested evaluating the fitness function as soon as a component changes, while keeping the other components constant at their values from the previous iteration.
4.2. Adaptive Particle Swarm Optimisation
In 2009, an important approach for solving both unimodal and multimodal functions effectively, as well as improving the search efficacy and convergence speed of PSO while preventing premature convergence, was proposed by Zhan et al. [38].
The Adaptive PSO (APSO) presented by those authors defines four evolutionary states for the PSO algorithm: exploration, exploitation, convergence and jumping out, according to the evaluation of the swarm’s distribution and each particle’s fitness. Thus, for each state, different strategies can be applied, such as parameter adaptation.
The swarm’s distribution can be assessed through the mean distance of each particle $i$ to all the other particles, using the Euclidean metric:

$$d_i = \frac{1}{s-1} \sum_{j=1,\, j \neq i}^{s} \sqrt{\sum_{k=1}^{d} \big(x_i^k - x_j^k\big)^2}$$

where $s$ is the size of the swarm and $d$ is the number of dimensions.
Then, an evolutionary factor, $f$, is computed by:

$$f = \frac{d_g - d_{\min}}{d_{\max} - d_{\min}} \in [0, 1]$$

where $d_{\max}$ and $d_{\min}$ are respectively the maximum and minimum of the mean distances $d_i$, and $d_g$ is the value of $d_i$ for the globally best particle in the swarm.
Based on this factor, the swarm can then be classified into one of the evolutionary states. For example, a medium to large value of $f$ indicates the exploration state, while a small value of $f$ indicates exploitation. In turn, the convergence state occurs when $f$ reaches a minimal value, and the jumping-out state when the mean distance of the best particle is significantly higher than the mean distance of the other particles.
An adaptive $f$-dependent inertia weight was also suggested by the same authors and is given by:

$$\omega(f) = \frac{1}{1 + 1.5\,e^{-2.6 f}} \in [0.4, 0.9]$$
Thus, when $f$ is large (jumping-out or exploration state), the resulting large $\omega$ makes the algorithm give more importance to the particle’s self-knowledge, thereby benefiting the global search. On the other hand, when $f$ is small (exploitation or convergence state), the swarm’s knowledge is weighted more heavily than the self-knowledge of each particle, giving priority to the local search.
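The distribution measure, the evolutionary factor and the adaptive inertia weight can be sketched as below; the guard for a degenerate swarm in which all mean distances coincide is an assumption:

```python
import math

def mean_distances(positions):
    # d_i: mean Euclidean distance from particle i to every other particle.
    s = len(positions)
    return [sum(math.dist(positions[i], positions[j])
                for j in range(s) if j != i) / (s - 1)
            for i in range(s)]

def evolutionary_factor(d, g):
    # f = (d_g - d_min) / (d_max - d_min); 0.0 is assumed for a degenerate
    # swarm in which all mean distances coincide.
    d_min, d_max = min(d), max(d)
    if d_max == d_min:
        return 0.0
    return (d[g] - d_min) / (d_max - d_min)

def adaptive_inertia(f):
    # w(f) = 1 / (1 + 1.5 * exp(-2.6 f)): maps f in [0, 1] to w in [0.4, 0.9].
    return 1.0 / (1.0 + 1.5 * math.exp(-2.6 * f))
```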
The cognitive and social weights are also changed, according to the evolutionary state, and a Gaussian mutation operation is applied to the best particle in the swarm to enable it to jump out of a local optimum or to refine the global best solution.
If the new position found is better than the best particle’s solution, the new one replaces the best particle’s position. Otherwise, the worst particle’s solution is replaced by this new position.
The velocity and the position of each particle are computed, and as usual, the PSO algorithm keeps iterating until the stopping criterion is met.
When tested with some unimodal and multimodal functions, APSO showed itself to be efficient at improving the convergence speed, and most importantly, at enhancing the accuracy of the algorithm when compared to other well-known approaches.
4.3. Constrained Optimisation Problems
Parsopoulos and Vrahatis [39] proposed a method based on a penalty function and on the constriction factor approach for constraint handling with PSO. To the best of the authors’ knowledge, this was the first paper to propose using PSO to solve constrained optimisation problems.
A Constrained Optimisation Problem (COP) can be transformed into an unconstrained problem by using a penalty function that penalises the objective function whenever the conditions on the variables are not satisfied. Therefore, a single objective function is built and optimised using a standard unconstrained optimisation algorithm.
A penalty function, $F(x)$, can be defined as:

$$F(x) = f(x) + h(t)\,H(x)$$

where $f(x)$ is the original objective function to be optimised, $h(t)$ is a dynamically modified penalty value, and $H(x)$ is the penalty factor, defined as:

$$H(x) = \sum_{i=1}^{m} \theta\big(q_i(x)\big)\, q_i(x)^{\gamma(q_i(x))}$$

where $q_i(x) = \max\{0, g_i(x)\}$ for the inequality constraints $g_i(x) \leq 0$, $i = 1, \ldots, m$, $\theta$ is a multi-stage assignment function, and $\gamma$ is the power of the penalty function. Note that although equality constraints $h_j(x) = 0$ were not considered, each can be transformed into two inequality constraints, such as $h_j(x) \leq 0$ and $-h_j(x) \leq 0$.
Although COPs can be transformed into unconstrained problems by using a penalty function, this approach requires more parameters to be fine-tuned (in this case, $h(t)$, $\theta$ and $\gamma$) in order to prevent premature convergence.
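A sketch of the penalty-function transformation, with hypothetical multi-stage choices for the assignment function θ and the power γ; the specific stage boundaries and weights are assumptions for illustration:

```python
def penalised(f, constraints, h, theta, gamma):
    # Build F(x) = f(x) + h * sum_i theta(q_i) * q_i**gamma(q_i), where
    # q_i = max(0, g_i(x)) measures the violation of constraint g_i(x) <= 0.
    def F(x):
        penalty = 0.0
        for g in constraints:
            q = max(0.0, g(x))
            if q > 0.0:
                penalty += theta(q) * q ** gamma(q)
        return f(x) + h * penalty
    return F

# Hypothetical multi-stage choices: heavier weight and power for violations >= 1.
theta = lambda q: 10.0 if q < 1.0 else 100.0
gamma = lambda q: 1.0 if q < 1.0 else 2.0
# Example COP: minimise x^2 subject to x >= 1 (i.e., 1 - x <= 0).
F = penalised(lambda x: x[0] ** 2, [lambda x: 1.0 - x[0]],
              h=1.0, theta=theta, gamma=gamma)
```

The resulting `F` can then be minimised with any unconstrained optimiser, including standard PSO.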
Hu and Eberhart [40,41] proposed a more straightforward, brute-force method to optimise COPs, known as the Preservation of Feasible Solutions Method (FSM).
In their proposal, all feasible solutions found during the search process in the whole search space are preserved. After a stopping criterion is met, the optimal solution that fulfils all the problem’s constraints may be found.
When these two methods are compared on the same problems, fine-tuning the penalty function parameters may yield better average optimal solutions than FSM, but the choice of constraint handling method may be very problem-dependent [42].
He et al. [43] introduced into PSO a different constraint handling method, called the fly-back mechanism. The idea is simple: when a particle flies into a non-feasible region of the search space, its position is reset to its previous (feasible) position.
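The fly-back mechanism can be sketched in a few lines; the `feasible` predicate stands for an assumed problem-specific feasibility test:

```python
def fly_back_step(x, v, feasible):
    # Fly-back mechanism: take the step only if the new position is feasible;
    # otherwise the particle keeps its previous (feasible) position.
    candidate = [xi + vi for xi, vi in zip(x, v)]
    return candidate if feasible(candidate) else x
```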
On the other hand, Sun et al. [44] proposed a more advanced approach, in which, once a particle enters a non-feasible region, a new feasible position is computed by scaling the particle’s displacement with a diagonal coefficient matrix whose diagonal values lie within a given range. If the adjusted position satisfies all the constraints, it is feasible; otherwise, the coefficient matrix must be adjusted further to bring the particle back to a feasible position.
Sun et al. [44] also suggest a formulation for finding this coefficient matrix. Note that the upper limit of the first product in their formulation covers both the inequality constraints and the search space’s boundaries, the latter being transformed into two inequality constraints each.
Then, the algorithm proceeds like the PSO algorithm until a stopping criterion is met.
Results show that this algorithm is suitable for solving COPs. However, it did not perform as well when the optimal values were at the boundaries of the search space.
4.4. Multi-Objective Optimisation
Initially, research on PSO considered only the optimisation of a single function. However, real-world problems rarely have a single objective; instead, multiple objectives should be optimised simultaneously.
At first glance, the different functions could be optimised by running the algorithm independently for each of them, but optimal solutions are seldom found this way, because the objectives may conflict with each other (e.g., the price–quality trade-off).
Multi-objective optimisation problems can be modelled as finding the decision vector $x^* = (x_1^*, \ldots, x_d^*)$ that minimises the vector of objective functions $F(x) = \big(f_1(x), \ldots, f_m(x)\big)$.
In most multi-objective optimisation problems, there is no single solution that simultaneously optimises every objective, but rather a set of feasible solutions called Pareto-optimal solutions. In other words, there is no feasible vector that would improve some objective value without degrading at least one other objective value.
This set of feasible solutions forms the so-called Pareto front. The user or researcher is then responsible for choosing what they consider to be the best solution to the problem at hand.
This introduces a notion of dominance, called Pareto dominance: a vector $u$ is said to dominate a vector $v$ if $u$ is not worse than $v$ in any objective and is strictly better in at least one; i.e., $f_i(u) \leq f_i(v)$ for all $i$, and $f_j(u) < f_j(v)$ for at least one $j$.
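For minimisation, the dominance test can be written directly from this definition:

```python
def dominates(u, v):
    # u dominates v (minimisation): u is no worse in every objective and
    # strictly better in at least one.
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))
```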
Hu and Eberhart [45] proposed an approach to solving multi-objective optimisation problems with a PSO algorithm based mainly on the concept of Pareto optimality.
They presented a dynamic neighbourhood version of PSO, such that, at every iteration, each particle has a different neighbourhood than it had in the previous iteration.
Each particle’s neighbourhood is chosen based on the distances from the current particle to the other particles in the fitness value space of the first objective function to be optimised.
Within its neighbourhood, each particle chooses the local best (lbest) particle, considering the fitness value of the second objective function.
The new lbest is only set when a new solution that dominates the current one is found.
Unfortunately, Hu and Eberhart only used two objective functions to describe their proposal and did not provide enough details on how the algorithm was implemented, especially regarding how to compute the distance between particles. Besides that, their proposal, in essence, only optimises one objective function, and nothing guarantees that the optimal solution for the second function is also the optimal solution for the first one.
Coello Coello and his collaborators [46,47], on the other hand, introduced the notion of an external (or secondary) repository, proposing a PSO variant called Multi-Objective PSO (MOPSO). The external repository stores non-dominated vectors of particles’ positions, which replace the global best position when computing the velocity of each particle at each iteration. This repository is dynamically updated within each iteration: for example, if none of the elements contained in the external population dominates the new solution found, then that solution is stored in the external repository. They also used a constraint handling mechanism to solve multi-objective constrained optimisation problems with PSO, and a mutation operator to ensure the diversity of the particles, to slow down the convergence speed and to prevent premature convergence to a local optimum.
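The external repository update can be sketched as follows; list-based storage is an assumption here, and the adaptive grid that MOPSO additionally maintains is omitted:

```python
def dominates(u, v):
    # Pareto dominance for minimisation.
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def update_archive(archive, candidate):
    # Keep the candidate only if no archived solution dominates it, and drop
    # any archived solutions that the candidate dominates.
    if any(dominates(a, candidate) for a in archive):
        return archive
    return [a for a in archive if not dominates(candidate, a)] + [candidate]
```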
The constraint handling mechanism can do one of two things when a particle goes beyond the boundaries: either set the particle to the corresponding boundary, or multiply its velocity by $-1$ so that it searches in the opposite direction.
According to a certain probability, a mutation operator is applied to a single, randomly chosen dimension of each particle, changing its value according to the current and total numbers of iterations while respecting that dimension’s boundaries. This was the first mutation operator proposed for solving constrained optimisation problems with PSO.
The algorithm then proceeds as the standard PSO until a stopping criterion is met. The output of the algorithm is a Pareto front, which is built upon each iteration as a grid using the values of the external repository.
The MOPSO approach showed better results than other multi-objective evolutionary algorithms and required low computational time.
These approaches were the first steps of the research on solving multi-objective parameter optimisation problems using PSO. The MOPSO algorithm was improved by Fieldsend [48] and later by Mostaghim [49].
4.5. Multimodal Function Optimisation
Simultaneously, efforts were made to extend the PSO algorithm to multimodal function optimisation; that is, to finding all the globally best positions (and possibly other locally optimal solutions) of an equation or system of equations.
This type of optimisation is especially useful for decision makers, as decisions can then be made taking into account, for example, physical and cost constraints, while still having multiple optimal solutions at hand.
Due to the existence of multiple local and global optima, such problems cannot be solved by classical non-linear programming techniques. On the other hand, when using Evolutionary Algorithms (EA) or PSO, the optimum positions can be found faster than by traditional optimisation techniques [50].
However, PSO was designed to find only one optimum of a function, and so some changes are required. Admittedly, PSO can be applied multiple times to the same function to find all the desired minima; nevertheless, there is no guarantee that all of them will be found.
In this type of optimisation, fast convergence can sometimes lead to premature convergence, because PSO (or any other evolutionary algorithm) may get trapped in local optima. Thus, it is important to maintain the population diversity until some goal is met.
At first glance, the lbest models can be thought of as potential candidates to find multiple solutions, in which each neighbourhood will represent a candidate solution. However, one particle can be in several neighbourhoods at the same time, causing all the particles in these neighbourhoods to converge to the same point in case that particle has the best fitness among all the points in the neighbourhoods it belongs to. Consequently, if that point is a local optimum, these neighbourhoods will be biased towards that position, making the algorithm converge prematurely.
Thus, many approaches to tackling this kind of problem have been suggested, and the most relevant will be described in the next subsections.
4.5.1. Objective Function Stretching
Multimodal function optimisation with PSO was first introduced by Parsopoulos et al. [50]. The first version of their algorithm, known as Stretched PSO (STPSO), had the main objective of finding a global minimum of a multimodal function while preventing the algorithm from becoming trapped in local optima.
To do so, they defined a two-stage transformation on the objective function that is applied to it as soon as a local optimum (minimum) is found, using a function stretching technique.
The function stretching technique acts like a filter, transforming the original function into a flatter surface while still highlighting possible globally and locally optimal positions.
As already said, this transformation is applied as soon as a local minimum is found, in order to repel the rest of the swarm from moving towards that position. After that, the original objective function is replaced by the stretched one, and the PSO algorithm is applied until a specific stopping criterion is met.
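A sketch of the two-stage stretching transformation, following the commonly cited form with stages G and H around a detected local minimiser; the parameter values for g1, g2 and mu are typical choices from the literature, assumed here, and the double-well function is illustrative:

```python
import math

def _sign(y):
    # Sign function: -1, 0 or +1.
    return (y > 0) - (y < 0)

def stretch(f, x_bar, g1=10000.0, g2=1.0, mu=1e-10):
    # Two-stage stretching of f around a detected local minimiser x_bar:
    # points better than f(x_bar) are left untouched, worse ones are raised.
    f_bar = f(x_bar)
    def G(x):
        s = _sign(f(x) - f_bar) + 1
        return f(x) + g1 * math.dist(x, x_bar) * s / 2.0
    G_bar = G(x_bar)
    def H(x):
        s = _sign(f(x) - f_bar) + 1
        t = math.tanh(mu * (G(x) - G_bar))
        if s == 0 or t == 0.0:
            return G(x)
        return G(x) + g2 * s / (2.0 * t)
    return H

# Double-well example: global minimum near x = -1, local minimum near x = +1.
f_dw = lambda x: (x[0] ** 2 - 1.0) ** 2 + 0.2 * x[0]
H = stretch(f_dw, [1.0])   # stretch around the (approximate) local minimiser
```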
Parsopoulos and Vrahatis [51] extended this approach to find all the globally optimal solutions and showed that the new approach could be effective and efficient.
They defined a threshold related to the requested accuracy, so that when the value of the objective function at a particle is lower than this threshold, the particle is pulled away from the swarm and function stretching is applied at that point, preventing the rest of the swarm from moving towards that position.
After this transformation, a new particle is randomly added to the swarm to replace the one that was isolated from it. Then, if the function value of the isolated particle is higher than the desired accuracy, a new sub-swarm is created (which can be considered a niching technique), and a new instance of the algorithm is executed, confined to that search area.
The algorithm stops when the number of global minimisers found reaches a known number or, when the number of global minimisers is unknown, at the maximum number of iterations.
Unfortunately, this stretching transformation (which can also be considered a convergence acceleration technique) may create local minima that were not present in the original objective function. This may require some restarts of the PSO algorithm until a global minimum is found [52].
Thus, Parsopoulos and Vrahatis [53] improved their method again by introducing deflection (a technique that incorporates knowledge from previously detected minimisers into the objective function) and a better repulsion technique (which ensures that if a particle moves towards one of the detected local optima, it will be repelled away from it).
4.5.2. Nbest Technique
In 2002, Brits et al. [52] proposed a new PSO-based technique, known as neighbourhood best or nbest PSO, and showed its successful application in solving systems of unconstrained equations.
A system of $k$ equations can be transformed into one fitness function:

$$f(x) = \sum_{j=1}^{k} \big| e_j(x) \big|$$

where each equation is algebraically rewritten as $e_j(x) = 0$. However, the formulation of the problem using this transformation fails when multiple solutions are present in the search space.
To overcome this problem, they redefined the objective function as the minimum over combinations of the equations. That is, as in the example given by Brits and his collaborators, for a system of three equations ($A$, $B$ and $C$), the objective function is defined as the minimum over the pairwise combinations of those equations:

$$f(x) = \min\big\{\,|A(x)| + |B(x)|,\; |A(x)| + |C(x)|,\; |B(x)| + |C(x)|\,\big\}$$
Thus, particles that are close to one of the solutions are rewarded and do not suffer any penalisation if they are still far from the global best particle.
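Following the three-equation example, a generic combination-based objective can be sketched as below; the choice of pairwise combinations for a 3-equation system mirrors the description above, and the equations themselves are illustrative:

```python
from itertools import combinations

def nbest_objective(equations, x, m=2):
    # Minimum over all size-m combinations of summed absolute residuals;
    # m=2 for a 3-equation system mirrors the A/B/C example above.
    return min(sum(abs(eq(x)) for eq in combo)
               for combo in combinations(equations, m))

# Illustrative system: three 1-D equations with roots at 1, 2 and -1.
A = lambda x: x[0] - 1.0
B = lambda x: x[0] - 2.0
C = lambda x: x[0] + 1.0
```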
The nbest technique uses a dynamic neighbourhood approach, based on the Euclidean distance between particles, to avoid biasing the shared information towards a single optimal solution.
It is noteworthy that the Euclidean distance is computationally expensive to calculate and, moreover, choosing the neighbourhood based on it led to undesirable convergence properties; for these reasons, the Euclidean neighbourhood was later abandoned.
After computing the Euclidean distance from each particle to each other one, the neighbourhood of each particle is defined and the centre of mass of the positions is kept as neighbourhood best, and the PSO algorithm proceeds normally until a stopping criterion is met.
The results presented by those authors showed that the nbest technique can find all the globally best solutions. However, in real-world applications, the systems of equations to optimise are usually not limited to three equations, and their number is frequently much higher. In such cases, this solution may face performance issues, as the number of combinations can grow rapidly.
4.5.3. Subpopulations and Multi-Swarm
Another strand of research on neighbourhood communication structures uses subpopulations to watch over the best local optima. That is, when a local optimum is found, the original swarm is split: one fraction of the swarm remains to exploit the local optimum, while the other continues the search in a different portion of the search space [54].
In natural ecosystems, animals live and reproduce in groups of their own species, called niches. Based on this idea, niching techniques were proposed and implemented successfully with GA, and later with PSO.
This type of technique is most commonly used in multimodal search spaces, because groups of individuals can move simultaneously into different search space regions. Note that individuals can be grouped by similar fitness values, by their distance from others or other similarity criteria.
Brits et al. [31] suggested the first PSO niching technique, named NichePSO, for locating multiple optimal solutions simultaneously in multimodal optimisation problems.
In their proposal, they used a main swarm and a number of sub-swarms, as well as two variants of the PSO algorithm, namely, GCPSO [30] and the cognition-only model proposed by Kennedy [55], where Equation (1) is changed to include only the cognitive weight; i.e.,

$$v_i(t+1) = v_i(t) + c_1\, r_1(t) \odot \big(p_i(t) - x_i(t)\big)$$

thereby allowing each particle to perform a local search and preventing the situation in which all particles get pulled towards a single solution by the influence of the best particle or particles in the neighbourhood.
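The cognition-only update simply drops the social term from the velocity equation, which can be sketched as below; the injectable random source is an implementation convenience:

```python
import random

def cognition_only_velocity(v, x, pbest, c1=2.0, rng=random):
    # Velocity update with only the cognitive term: each particle is pulled
    # solely towards its own best position, ignoring the swarm's best.
    return [vi + c1 * rng.random() * (pi - xi)
            for vi, xi, pi in zip(v, x, pbest)]
```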
The cognition-only PSO variant is run for one iteration in the main swarm. Particles are then grouped according to a given accuracy threshold (similar to the one used by Parsopoulos and Vrahatis [39] in the constriction factor PSO approach), and GCPSO is then run for each sub-swarm.
After that, sub-swarms that are too close can be merged, and they can absorb particles from the main swarm that move into them. Finally, the algorithm checks whether the main swarm needs to be split into further sub-swarms, and iterates until a stopping criterion is met (for example, when a known number of globally optimal solutions has been found).
Later, Engelbrecht [34] improved NichePSO by changing the merging and absorption strategies proposed in the original approach. Schoeman and Engelbrecht [56] proposed a PSO approach (which can be considered a sequential niching PSO) that uses an additional vector operation, namely the dot product, to decide the direction in which particles should head; viz., towards an already located niche or towards exploring and searching for a new niche. Shortly after that, the same authors [57] proposed a parallel vector-based approach wherein all particles are updated simultaneously.
Li [58] extended the FDR-PSO algorithm to multimodal optimisation problems by introducing two mechanisms into the original FDR-PSO: the memory-swarm and the explorer-swarm.
The memory-swarm stores the personal best positions found so far by the population. In turn, the explorer-swarm stores the current state of the particles and is used to explore the search space.
The best positions in the memory-swarm are used as anchors, and as the algorithm runs, niches are created around the best positions, according to the fitness-Euclidean distance ratio between a particle’s personal best and other personal bests of the particles in the population.
The fitness-Euclidean distance ratio technique is an improved version of FDR that has a scaling factor computed using the worst and best fitted particles in the swarm.
Li et al. [59] split the population into species according to the distances between the particles. Based on this idea and the ideas presented in [60,61], Parrott and Li [62] incorporated the concept of speciation into the constriction factor approach of PSO for solving multimodal optimisation problems.
It is important to note that, although different terminology is used, both niching and speciation techniques group similar particles according to a given criterion.
In the resulting species-based algorithm, the particles are dynamically and adaptively grouped into species around dominating particles called species seeds, each species being used to track an optimum point.
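The seed-selection step of such a species-based scheme can be sketched as follows. This is a minimal sketch under common assumptions: particles are scanned from fittest to least fit, and a fixed species radius `r_s` decides whether a particle founds a new species or joins an existing one.

```python
import numpy as np

def species_seeds(positions, fitness, r_s):
    """Determine species seeds: scan particles from fittest to least fit; a particle
    founds a new species unless it lies within radius r_s of an existing seed,
    in which case it joins that seed's species."""
    order = sorted(range(len(positions)), key=lambda i: fitness[i], reverse=True)
    seeds, species = [], {}
    for i in order:
        for s in seeds:
            if np.linalg.norm(np.asarray(positions[i]) - np.asarray(positions[s])) < r_s:
                species[s].append(i)   # dominated by an existing seed: join it
                break
        else:
            seeds.append(i)            # no dominating seed nearby: become a seed
            species[i] = [i]
    return seeds, species
```

Each species then runs its own lbest-style update, with the seed acting as the neighbourhood best, so that each species tracks one optimum.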
Li [
63] also presented a niching, parameter-free algorithm with ring topology for multimodal optimisation, which is able to form stable niches across different local neighbourhoods.
Four variants of this lbest PSO niching algorithm with ring topology were also suggested by Li [
63], two of them (r2pso and r3pso) with an overlapping ring topology, while the other two variants, namely, r2pso-lhc and r3pso-lhc, are lbest PSO algorithms with a non-overlapping ring topology.
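The four ring variants differ only in how each particle's lbest neighbourhood is formed on the ring, which can be sketched as follows. The index conventions (right-hand neighbour for r2pso, consecutive non-overlapping groups for the -lhc variants) are illustrative assumptions.

```python
def ring_neighbourhood(i, n, variant="r3pso"):
    """Indices forming particle i's lbest neighbourhood on a ring of n particles.
    r2pso: the particle and one immediate neighbour (overlapping ring of 2).
    r3pso: the particle and both immediate neighbours (overlapping ring of 3).
    r2pso-lhc / r3pso-lhc: non-overlapping groups of 2 or 3 ('local hill climbers')."""
    if variant == "r2pso":
        return [i, (i + 1) % n]
    if variant == "r3pso":
        return [(i - 1) % n, i, (i + 1) % n]
    k = 2 if variant == "r2pso-lhc" else 3
    start = (i // k) * k               # first index of i's non-overlapping group
    return list(range(start, min(start + k, n)))
```

Because overlapping neighbourhoods share members, information leaks slowly around the ring, which is what allows stable niches to form without any niching parameter.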
Recently, Yue et al. [
64] improved the lbest PSO niching algorithm by including a Special Crowding Distance (SCD) for solving multimodal multi-objective problems and reported that the algorithm was able to find a larger number of Pareto-optimal solutions than other well-known algorithms.
4.6. The Fully Informed Particle Swarm Optimisation
In 2004, Mendes et al. [
65] introduced the Fully Informed Particle Swarm (FIPS) optimisation algorithm, arguing that each particle should not be influenced only by the best particle among its neighbours: all the neighbours should contribute to the velocity adjustment of each particle; i.e., the particles should be fully informed.
They integrated the constriction factor approach of PSO with a new velocity update equation, wherein the social component is not explicitly considered, given by:

$$\mathbf{v}_i(t+1) = \chi \left[ \mathbf{v}_i(t) + \varphi \left( \mathbf{p}_m(t) - \mathbf{x}_i(t) \right) \right].$$

Typically, $\chi = 0.7298$ and $\varphi = 4.1$. The position $\mathbf{p}_m(t)$, which aggregates the personal best positions of the particle's informants, is given by:

$$\mathbf{p}_m(t) = \frac{\sum_{n=1}^{l} \mathcal{W}(n)\, \vec{\varphi}_n \otimes \mathbf{p}_n(t)}{\sum_{n=1}^{l} \mathcal{W}(n)\, \vec{\varphi}_n},$$

with

$$\vec{\varphi}_n = \mathbf{U}\!\left[ 0, \frac{\varphi}{l} \right],$$

where $l$ is the number of particles in the population, and $\mathbf{U}[0, \varphi/l]$ is a function that returns a position vector generated randomly from a uniform distribution between $0$ and $\varphi/l$.

The function $\mathcal{W}$ can return a constant value over the iterations or, as Mendes et al. [65] also did in their experiments, return the fitness value of the best position found by particle $n$ or the distance from that particle to the current particle.
Although in this variant all particles contribute equally to the next velocity calculation, those authors also suggested a weighted version of the FIPS algorithm, in which contributions are weighted according to the fitness value of the previous best position or the distance in the search space to the target particle.
Their conviction was in fact borne out, since both FIPS variants performed well on the neighbourhood architectures considered (except the all-connected-to-all one), consistently finding the minima of the benchmark functions. The weighted versions require an extra computational cost, and such cost may not be justified, since the unweighted version performed quite well in their study [
65].
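The fully informed update can be sketched as follows. This is a minimal sketch, not the authors' code: the sums run over the particle's $k$ informants (rather than the whole population), the random coefficient vectors are drawn from $U(0, \varphi/k)$, and passing a `weights` vector reproduces the weighted variant; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fips_velocity(v_i, x_i, informant_pbests, weights=None, chi=0.7298, phi=4.1):
    """Fully informed velocity update: every informant's personal best contributes,
    each weighted by a random coefficient vector (and optionally by a weight W per
    informant), instead of only the single best neighbour contributing."""
    pbests = np.asarray(informant_pbests, dtype=float)
    k, dim = pbests.shape
    w = np.ones(k) if weights is None else np.asarray(weights, dtype=float)
    phis = rng.uniform(0.0, phi / k, size=(k, dim))       # random coefficients per informant
    coeff = w[:, None] * phis
    p_m = (coeff * pbests).sum(axis=0) / coeff.sum(axis=0)  # weighted centre of attraction
    return chi * (np.asarray(v_i, dtype=float) + phi * (p_m - np.asarray(x_i, dtype=float)))
```

Note that when all informants' personal bests coincide with the particle's position, the update correctly yields a zero velocity change, so a converged neighbourhood stays converged.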
4.7. Parallel Implementations of Particle Swarm Optimisation
Besides the risk of becoming trapped in local optima, PSO has another problem: its performance becomes progressively worse as the dimensionality of the problem increases [
66]. To alleviate this problem, some approaches were suggested, such as the use of multiple processing units of a computer system to distribute processing among them, creating sub-swarms, and thus speeding up the execution of the algorithm.
As each sub-swarm can be regarded as independent, PSO maps well to the parallel computing paradigm. In this section, the most common approaches to Parallelised PSO (PPSO) are surveyed.
For PPSO approaches, a multi-core Central Processing Unit (CPU) or a Graphics Processing Unit (GPU) can be used to process the tasks of each parallel sub-swarm, along with some mechanism to exchange information among them. The exchange of information can be made in a synchronous or asynchronous manner.
Synchronous exchange occurs when the particles of each sub-swarm are synchronised with each other, i.e., the particles wait for the others before moving to the next iteration, leading to the same result as the sequential approach, although the processing is done in parallel. On the other hand, when the exchange of information is made asynchronously, the sub-swarms are independent of each other, and thus, at the end of an iteration, each particle uses the information available at the moment (especially the global best position) to move to its next position.
In addition, different architectures can be used to control the exchange of information, such as master–slave (where there is one processing unit that controls the execution of the other processing units), fine-grained (in which the swarm is split into sub-swarms and arranged in a 2-D grid, wherein the communication is only made within the neighbours of each sub-swarm) and coarse-grained (where the swarm is also split into sub-swarms independent of each other; however, from time to time, they exchange particles between them) [
23,
66,
67].
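As an illustration of the synchronous master-slave scheme described above, the following minimal sketch has a master process farm out one fitness evaluation per particle and block until every slave has finished; the objective function `sphere` and the worker count are illustrative assumptions, not part of any surveyed implementation.

```python
from multiprocessing import Pool

def sphere(x):
    """Toy objective function evaluated by the slave processes."""
    return sum(xi * xi for xi in x)

def evaluate_synchronously(positions, workers=4):
    """Synchronous master-slave step: the master distributes one evaluation per
    particle and blocks until every slave has finished, so the results are
    identical to a sequential evaluation of the same swarm."""
    with Pool(processes=workers) as pool:
        return pool.map(sphere, positions)  # pool.map is a barrier: returns only when all are done

if __name__ == "__main__":
    swarm = [[0.0, 0.0], [1.0, 2.0], [3.0, -1.0]]
    print(evaluate_synchronously(swarm))
```

An asynchronous variant would instead collect results as they arrive (e.g., via `imap_unordered`) and let each particle move with whatever global best information is available at that moment.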
Gies and Rahmat-Samii [
68] proposed the first PPSO. They reported a performance gain of eight-fold (when compared with sequential PSO) with the PPSO algorithm for finding the optimal antenna array design. The results of this first work about PPSO motivated other researchers, such as Baskar and Suganthan [
69], who improved the performance of FDR-PSO [
29] by introducing a novel concurrent approach, called CONPSO.
Three communication strategies were presented in [
70,
71] by using the GA's migration technique to spread the gbest position of each sub-swarm to the others. In the first one, the best particle of each sub-swarm is mutated and migrated to another sub-swarm to replace the poorest candidate solutions. The second strategy is similar to the first, but information is only exchanged between neighbouring sub-swarms. Finally, the third strategy is a hybrid of the first two.
Schutte et al. [
72,
73] used a synchronous master-slave architecture for a bio-mechanical system identification problem. All particles were evaluated using parallel processes; however, all processes had to finish in order to update the next velocities and positions of all particles. Additionally, they reported that the time required to solve the system identification problem considered was reduced substantially when compared with traditional approaches.
As stated by Schutte et al. [
73], synchronous implementations of PPSO are easy to produce. Nevertheless, such implementations usually have a poor parallel efficiency, since some processing units may be idle. Due to this fact, Venter and Sobieszczanski-Sobieski [
74] proposed a master–slave asynchronous implementation of the PPSO algorithm and compared it with a synchronous PPSO.
A possible drawback of asynchronous approaches is that the behaviour of each particle depends on the information available (possibly not from all other sub-swarms) at the start of a new iteration. However, in the authors' opinion, this is negligible because, although particles may not have up-to-date information about the best solution before moving to the next position in the search space, communication always exists between particles and sub-swarms. Thus, in subsequent iterations, the information about the best position found so far will inevitably be shared.
Koh et al. [
75] introduced a point-to-point communication strategy between the master and each slave processing unit in an asynchronous implementation of PPSO for heterogeneous computing conditions. Such conditions arise, for example, when the parallel sub-swarms cannot be equally distributed among the available processors, in which case a load-balancing technique is essential for the robustness of the algorithm.
The results obtained by Koh et al. [
75] were compared with the algorithm presented by Schutte et al. [
73], and showed that the asynchronous implementation performs better, in terms of parallel efficiency, when a large number of processors are used.
In 2007, McNabb et al. [
76] introduced the MapReduce function for the PPSO. This function has two sub-functions: map and reduce.
On the one hand, the map function computes the velocity of the particle, moves it to a new position, evaluates the objective function at that position, updates the personal best position and shares this information among all dependent particles. On the other hand, the reduce function receives this information and updates the global best position.
This type of formulation allows the algorithm to be split into small procedures and easily balanced and scaled across multiple processing units, following the divide-and-conquer parallel approach.
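The map/reduce decomposition just described can be sketched as follows. This is a minimal sketch for a minimisation problem showing only the global-best bookkeeping (the velocity/position update inside `pso_map` is elided), and all names are illustrative assumptions.

```python
from collections import defaultdict

def pso_map(particle):
    """Map: process one particle and emit (key, value) pairs; here, its personal
    best position and fitness as a candidate for the global best."""
    # ... the velocity/position update and fitness evaluation would happen here ...
    return [("swarm", (particle["pbest"], particle["pbest_fit"]))]

def pso_reduce(pairs):
    """Reduce: fold all candidates sharing a key into the global best (lowest fitness)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: min(values, key=lambda v: v[1]) for key, values in grouped.items()}
```

Because each map call touches only one particle, the framework can shard the swarm across as many processing units as are available and rebalance the shards freely.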
Aljarah and Ludwig [
77] proposed a PPSO optimisation clustering algorithm (MR-CPSO) based on the MapReduce approach. This parallel PSO-based algorithm showed efficient processing when large data sets were used.
Han et al. [
78], in turn, included constraint handling in PPSO, whereas Gülcü and Kodaz [
79] proposed a synchronous parallel multi-swarm strategy for PPSO.
In this multi-swarm approach, a population is divided into subpopulations: one master-swarm and several slave-swarms which independently run a PSO variant. However, the slave-swarms cannot communicate with each other, since communication is made through the master-swarm by migrating particles. The parallel multi-swarm algorithm also uses a new cooperation strategy, called Greed Information Swap [
79]. This work was extended by Cao et al. [
80] to include multi-objective optimisation.
Lorion et al. [
81], in turn, proposed an agent-based PPSO that splits the problem into sub-problems. There are two types of agents: one coordination agent and several swarm agents, which, similarly to the multi-swarm strategy, do not communicate with each other.
A strategic niching technique is then used to increase the quality gain. Fault tolerance (e.g., for when a processing unit stops responding to requests) was also implemented, either by saving an agent's state in other swarm agents or by using the information about the failed agent available to the coordination agent at that moment.
Along with all these developments, some researchers suggested approaches that use a GPU instead of a CPU, especially after the CUDA development kit from NVIDIA was released. GPUs are designed for image processing and graphics applications, but they offer more processing capacity (since they have more processing elements) than CPUs.
Developing parallel algorithms on a GPU is far more complicated than the corresponding implementations on a CPU [
82]. However, several studies have reported significant improvements in terms of execution time when a GPU implementation of the PPSO is compared to its corresponding implementation on a CPU (see, e.g., [
83,
84,
85,
86]).
A GPU-based fine-grained PPSO was proposed by Li et al. [
59]. In turn, the performance of the Euclidean PSO, proposed by Zhu et al. [
87], was improved by Dali and Bouamama [
88], where a GPU-based parallel implementation of the original algorithm was presented.
Finally, it is also worth mentioning the distributed and load balancing versions of the PSO algorithm on GPU developed by using a grid of multiple threads [
89] or distributed memory clusters [
90], along with the OpenMP API.
6. Conclusions
In the previous sections, a literature review focusing on the PSO algorithm and its variants was presented, describing the most important developments in this field since the introduction of the algorithm in the mid-1990s.
The PSO algorithm was inspired by some characteristics of the collective behaviour observed in the natural world, in which elements of a population cooperate with each other seeking to obtain the greatest mutual benefit.
Over the years, the PSO algorithm has gained attention from many researchers due to its simplicity and because it does not make assumptions on the characteristics and properties (such as continuity or differentiability) of the objective function to be optimised.
Inevitably, the algorithm has undergone changes to, for example, improve its effectiveness and efficiency.
The use of different topologies was one of the first suggestions to improve the algorithm. However, one conclusion was reached: the best communication topology is problem-dependent.
PSO has been widely used in different applications, which led some researchers to report convergence problems with the algorithm. To lessen these problems, changes were made, mostly through the introduction of new parameters or by combining PSO with other operators or algorithms.
The algorithm has also been extended to solve a panoply of different problems and applications since its original formulation in 1995. Constrained, multi-objective and multimodal optimisation problems were some of the most relevant classes of problems solved with the PSO approach.
To conclude, PSO is one of the leading swarm intelligence algorithms and, in some fields of application, is superior to other optimisation algorithms. Although it has some drawbacks, these have been lessened through different types of strategies and modifications to the original version of the algorithm. PSO is also a problem-independent algorithm; i.e., it can be used in a wide range of applications due to its great capacity for abstraction, which further highlights its importance.