A Lyapunov-Based Extension to Particle Swarm Dynamics for Continuous Function Optimization

The paper proposes three alternative extensions to the classical global-best particle swarm optimization dynamics and compares their performance with the standard particle swarm algorithm. The first extension, which readily follows from the well-known Lyapunov stability theorem, provides a mathematical basis for particle dynamics with guaranteed convergence to an optimum. The inclusion of local and global attractors in this dynamics leads to faster convergence and better accuracy than the classical dynamics. The second extension augments the velocity adaptation equation with a negative, randomly weighted positional term of the individual particle, while the third extension uses the negative positional term in place of the inertial term. Computer simulations further reveal that the last two extensions outperform both the classical algorithm and the first extension in terms of convergence speed and accuracy.


Introduction
The concept of particle swarms originated from the simulation of the social behavior commonly observed in the animal kingdom and has evolved into a very simple but efficient technique for global numerical optimization in the recent past. Particle Swarm Optimization (PSO) [1,2], as it is called now, does not require any gradient information of the function to be optimized, uses only primitive mathematical operators, and is conceptually very simple. PSO emulates the swarming behavior of insects, the herding of animals, the flocking of birds, and the schooling of fish, where these swarms forage for food in a collaborative manner. PSO also draws inspiration from Craig Reynolds' boids model and from socio-cognition [2].
Since its inception, research on PSO has centered on improving the particle dynamics and the algorithm. Shi and Eberhart incorporated the inertia factor [3] into the basic PSO dynamics for faster convergence of the algorithm. Clerc and Kennedy [4] considered in their work an alternative form of the PSO dynamics using a parameter called the constriction factor, and gave a detailed theoretical analysis to determine the value of the parameter. Eberhart and Shi compared the effect of the inertia factor and the constriction factor on PSO performance [5]. Angeline [6] introduced a form of selection operation into the PSO algorithm, so that the characteristics of good particles are transferred to the less effective members of the swarm to improve their behavior. Suganthan [7] employed a neighborhood operator in the basic particle swarm optimization scheme to study the swarm behavior. Extensions of the PSO algorithm to deal with dynamic environments and efficient exploration are undertaken in [8,9]. Ratnaweera et al., while proposing a new model of self-organizing hierarchical PSO [10], dropped the term involving the inertia factor from the velocity adaptation rule. Another contribution of their work is the inclusion of a time-varying inertia weight and time-varying acceleration coefficients for better performance of the algorithm.
In [11], a new crossover operator is defined to swap information between two individuals in order to determine their next position on the search landscape. Miranda et al. in [12] proposed a mutation operator on the parameters of the PSO dynamics and the position of the neighborhood best particle, so as to enhance the diversity of the particles, thereby increasing the chances of escaping local minima. In [13], the inertia weight is mutated and the particles are relocated when they are too close to each other. A further increase in the diversity of the population has been attained in [14,15] through introduction of a new collision-avoiding mechanism among the particles. Xie et al. [16] added negative entropy to the PSO to discourage premature convergence. In [2], a cooperative PSO (CPSO) is implemented to significantly improve the performance of the classical PSO. Hendtlass et al. [17] combined Ant Colony Optimization with PSO to determine the neighborhood best of a particle from a list of best positions found so far by all the particles.
Most existing works on PSO refer to single-objective optimization problems. Coello et al. first proposed a formulation of the multi-objective optimization problem using PSO [18], and later extended it to include a constraint-handling mechanism and a mutation operator [19] to improve the exploratory power of the optimization algorithm. Agrawal, Panigrahi, and Tiwari [20] in a recent paper proposed a fuzzy clustering-based PSO algorithm to solve the highly constrained environmental/economic dispatch problem involving conflicting objectives.
There exists an extensive literature on improving the performance of the PSO algorithm. This has been undertaken through two alternative approaches. First, researchers seek to improve swarm behavior by selecting an appropriate form of the swarm dynamics. Alternatively, for a given form of the particle dynamics, researchers attempt, experimentally or theoretically, to find the optimal settings of the parameter ranges to improve PSO behavior. In this paper, we adopt the first policy to determine a suitable dynamics, and then empirically determine the optimal parameter settings.
The classical PSO dynamics adapts the velocity of an individual particle by considering the inertia of the particle and the positions of the local and global attractors. The positions of the attractors are also adapted over the iterations of the algorithm. The motion of the particles thus continues until most of the particles converge in the close vicinity of the global optimum. In this paper, we consider different versions of the swarm dynamics to study the relative performance of the PSO algorithm, both from the point of view of accuracy and of convergence time.
The formal basis of our study originates from the well-known Lyapunov theorem of classical control theory. Lyapunov's theorem is widely used in nonlinear system analysis to determine sufficient conditions for the stability of a dynamical system. In this paper, we indirectly use Lyapunov's stability theorem to determine a dynamics that necessarily converges to an optimum of a Lyapunov-like search landscape. The principle of guiding the particle dynamics towards the global and local optima, here too, is ensured by adding local and global attractor terms to the modified PSO dynamics. The rationale of selecting a dynamics that converges to one of the optima on a multimodal surface, together with the principle of forcing the dynamics to move towards the local and global optima, makes the approach attractive for use in continuous nonlinear optimization.
There are, however, search landscapes that do not possess the necessary characteristics of a Lyapunov surface. This calls for an alternative dynamics, which maintains the motivation of this research but avoids the restriction that the objective function must be Lyapunov-like. A look at the dynamics constructed for Lyapunov-like benchmark functions essentially reveals the inclusion of a negative position term in the velocity adaptation rule. This prompted us to realize different variants of the classical PSO dynamics: (a) replacement of the inertial term by the negative partial derivative of the Lyapunov-like search landscape, (b) inclusion of a negative particle position term in the velocity adaptation rule, and (c) replacement of the inertial term by the negative positional term in the dynamics. Computer simulations undertaken on a set of eight benchmark functions reveal that the modifications to the PSO dynamics result in a significant improvement of the PSO algorithm with respect to both convergence speed and accuracy. Note that the extensions developed in this article are primarily meant for fast and accurate optimization of continuous and differentiable functions, as all of them involve first derivatives of the objective function.
The rest of the paper is organized as follows. Section 2 provides a set of formal definitions on the Lyapunov stability of nonlinear dynamics. It explains the basis of selecting a dynamics for a given Lyapunov-like objective function. The rationale for the speed-up of the proposed swarm algorithm using the selected dynamics is also given in this section. Experimental results over several numerical benchmarks are presented in Section 3. Finally, the paper is concluded in Section 4.

Proposed Extensions of the Classical PSO Dynamics
In this section, we briefly outline one typical PSO dynamics and the PSO algorithm. We next present the modifications that we undertake in the dynamics to study their performance relative to the classical PSO algorithm.
The global-best (g-best) PSO dynamics for the j-th particle is given in vector form through Equations (1) and (2):

    V_j(t+1) = ω V_j(t) + α_l(t) (P_lj(t) − X_j(t)) + α_g(t) (P_g(t) − X_j(t)),   (1)
    X_j(t+1) = X_j(t) + V_j(t+1),   (2)

where P_lj(t) is the D-dimensional personal (local) best position vector achieved by particle j until iteration t, and P_g(t) is the D-dimensional global best position vector found so far by the entire swarm at iteration t.
Empirically, ω is a random number in [0, 1], and α_l(t) and α_g(t) are random coefficients in [0, 2] and [0, 4], respectively. The inertia factor ω is selected randomly only once in the PSO algorithm, whereas α_l(t) and α_g(t) are selected randomly in each iteration of the PSO algorithm.
The basic PSO algorithm is presented here for the convenience of the readers. The notion of time t is dropped from the algorithm for simplicity. A look at the PSO algorithm reveals that it attempts to determine the optima on a search landscape by allowing several particles (agents) to explore the surface with the ultimate aim of terminating at the global optimum. The terminating condition is usually an upper limit on the number of iterations or a lower limit on the unsigned successive difference in the best particle position, whichever occurs earlier.

PSO-Algorithm
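The g-best loop of Equations (1) and (2), with the progress-based termination test described above, can be summarized as a minimal Python sketch. The function names (`pso`, `sphere`), the initialization range [-5, 5], and the swarm size are our own illustrative choices, not the paper's:

```python
import random

def sphere(x):
    # simple test objective with its global minimum at the origin
    return sum(v * v for v in x)

def pso(f, dim, n_particles=20, max_iter=1000, tol=1e-6):
    """Minimal g-best PSO per Equations (1)-(2): omega is drawn once,
    while alpha_l and alpha_g are redrawn in every iteration."""
    rnd = random.Random(0)
    omega = rnd.uniform(0, 1)                        # inertia, chosen once
    X = [[rnd.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                            # personal bests
    g = min(P, key=f)[:]                             # global best
    for _ in range(max_iter):
        a_l, a_g = rnd.uniform(0, 2), rnd.uniform(0, 4)
        for j in range(n_particles):
            for i in range(dim):
                V[j][i] = (omega * V[j][i]
                           + a_l * (P[j][i] - X[j][i])
                           + a_g * (g[i] - X[j][i]))
                X[j][i] += V[j][i]
            if f(X[j]) < f(P[j]):
                P[j] = X[j][:]
        new_g = min(P, key=f)[:]
        if abs(f(new_g) - f(g)) < tol:               # progress-based stop
            g = new_g
            break
        g = new_g
    return g

best = pso(sphere, dim=2)
```

Since personal bests only ever improve, the returned global best is never worse than the best member of the initial population.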
In the next sub-section, we look for a dynamics that has a tendency to move towards an optimum, which need not be the global optimum. This can be attained by identifying a suitable dynamics that ensures asymptotic stability in the vicinity of an optimum over the search landscape. This, of course, requires an additional restriction on the surface: it must satisfy the necessary conditions for being Lyapunov-like [24]. Once a suitable dynamics ensuring convergence to an optimum is identified, we can control the motion of the particles towards the global/local optima by adding global and local attractors to the dynamics, as in the classical PSO dynamics.

Identifying a Stable Dynamics for a Lyapunov-like Surface
This section begins with a few definitions, available in the standard literature on nonlinear control theory [21-24], to explain the methodology of determining a stable dynamics for a Lyapunov-like surface. In what follows we shall use ||X|| to denote the Euclidean norm of a vector X, and S(ε) to denote an ε-neighborhood surrounding a point defined by the position vector X_e.

Definition 2.1. A state X_e is said to be an equilibrium state of the dynamics dX/dt = f(X(t)) if f(X(t)) becomes zero at X = X_e for any t. S(ε) is basically the set containing all points X in the vector space for which ||X − X_e|| ≤ ε. The equilibrium state is also called an equilibrium (stable) point in D-dimensional hyperspace when the state X_e has D components.

Definition 2.2. A scalar function V(X) is said to be positive definite with respect to the point X_e in the region ||X − X_e|| ≤ K if V(X) > 0 at all points of the region except at X_e, where it is zero.

Definition 2.3. A scalar function V(X) is said to be positive semi-definite with respect to the point X_e in the region ||X − X_e|| ≤ K if V(X) ≥ 0 at all points of the region and is zero at X_e. Note that, analogously to Definitions 2.2 and 2.3, the scalar function V(X) is said to be negative definite (semi-definite) if −V(X) is positive definite (semi-definite).

Definition 2.4. A scalar function V(X) is called a Lyapunov surface with respect to the origin if it satisfies the three conditions listed below: (i) V(0) = 0; (ii) V(X) > 0 for all X ≠ 0; and (iii) V(X) is continuous and has continuous first partial derivatives with respect to all components of X.

The sufficient condition for stability of a dynamics can be obtained from Lyapunov's theorem, presented below.
Lyapunov's Stability Theorem [21]: Given a scalar function V(X) and some real number ε > 0, such that for all X in the region S(ε) the following conditions hold: (i) V(X_e) = 0; (ii) V(X) > 0 for all X ≠ X_e; and (iii) V(X) has continuous first partial derivatives with respect to all components of X. Then the equilibrium state X_e of the system dX/dt = f(X(t)) is a) asymptotically stable if dV/dt is negative definite, and b) asymptotically stable in the large if dV/dt is negative definite and, in addition, V(X) → ∞ as ||X|| → ∞.

Example 1. Let V(X), with X = (x_1, x_2), be a Lyapunov energy function for a given dynamics. Here, V(X) satisfies the first two criteria indicated in the theorem, and the partial derivatives ∂V/∂x_1 and ∂V/∂x_2 are also continuous functions of x_1 and x_2. Consequently, the asymptotic stability of the dynamics is ensured, as dV/dt is found to be negative definite at all points except at X = 0. Further, as ||X|| → ∞, V(X) also approaches infinity. Therefore, the asymptotic stability of the dynamics in the large is also ascertained.
The condition for asymptotic stability, as indicated in Lyapunov's theorem above, can be applied to particle swarm optimization to ensure stability of the dynamics, thereby reducing the convergence time of the algorithm.
When all three underlying conditions of a Lyapunov function, indicated in Definition 2.4, are supported by the objective function, we are interested in determining the dynamics that satisfies the sufficient condition for asymptotic stability. It follows from Lyapunov's theorem that the asymptotic stability of an equilibrium state guided by the dynamics dx_i/dt is ascertained if:

    dV/dt = Σ_{i=1..D} (∂V/∂x_i)(dx_i/dt) < 0.   (3)

Inequality (3) essentially holds when:

    dx_i/dt = −∂V/∂x_i,   (4)

since then dV/dt = −Σ_{i=1..D} (∂V/∂x_i)² ≤ 0. It is indeed important to note that condition (4) holds for the i-th dimension of a particle roaming over the Lyapunov-like surface for 1 ≤ i ≤ D.
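The effect of condition (4) can be checked numerically: discretizing dx_i/dt = −∂V/∂x_i with a small Euler step should produce a monotonically non-increasing V along the trajectory. The step size, the quadratic test surface, and the finite-difference gradient below are our own illustrative choices:

```python
def grad(V, x, h=1e-6):
    # central-difference gradient of V at x, a numerical stand-in for dV/dx_i
    g = []
    for i in range(len(x)):
        xp, xm = x[:], x[:]
        xp[i] += h
        xm[i] -= h
        g.append((V(xp) - V(xm)) / (2 * h))
    return g

def V(x):
    # a simple Lyapunov-like surface: V(0) = 0, V > 0 elsewhere, smooth
    return sum(v * v for v in x)

# Euler-discretized version of dx_i/dt = -dV/dx_i (condition (4))
x = [3.0, -2.0]
history = [V(x)]
for _ in range(100):
    g = grad(V, x)
    x = [xi - 0.1 * gi for xi, gi in zip(x, g)]   # dt = 0.1
    history.append(V(x))

monotone = all(b <= a for a, b in zip(history, history[1:]))
```

For this quadratic V the update contracts x by a constant factor each step, so V decreases geometrically towards zero, mirroring the continuous-time argument dV/dt = −Σ (∂V/∂x_i)² ≤ 0.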

Example 2
In this example, we would like to determine a stable dynamics for a Lyapunov-like objective function. Consider, for instance, the Griewank function in D dimensions:

    f(X) = 1 + (1/4000) Σ_{i=1..D} x_i² − Π_{i=1..D} cos(x_i/√i).

In order to have asymptotic stability of the dynamics, we set:

    dx_i/dt = −∂f/∂x_i = −[ x_i/2000 + (1/√i) sin(x_i/√i) Π_{j≠i} cos(x_j/√j) ].

It is also apparent that the given function f(X) satisfies the three necessary conditions of a Lyapunov function. Now, if we replace the term involving the inertia factor by the obtained value of dx_i/dt in the PSO dynamics, the PSO is expected to converge very quickly, as the sufficient condition for asymptotic stability has been satisfied while deriving the dynamics. Table 1 provides a list of eight typical benchmark functions along with the derived expressions for dx_i/dt that ensure the asymptotic stability of the derived dynamics over the Lyapunov-like objective function.
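For the Griewank example, the derived expression for dx_i/dt can be cross-checked against a central-difference gradient. The sketch below (function names are ours) performs exactly that comparison:

```python
import math

def griewank(x):
    # f(X) = 1 + (1/4000) * sum(x_i^2) - prod(cos(x_i / sqrt(i)))
    s = sum(v * v for v in x) / 4000.0
    p = 1.0
    for i, v in enumerate(x, start=1):
        p *= math.cos(v / math.sqrt(i))
    return 1.0 + s - p

def griewank_dxdt(x):
    """dx_i/dt = -df/dx_i for the Griewank function, per condition (4)."""
    d = []
    for i, v in enumerate(x, start=1):
        prod_others = 1.0
        for j, w in enumerate(x, start=1):
            if j != i:
                prod_others *= math.cos(w / math.sqrt(j))
        dfdxi = v / 2000.0 + math.sin(v / math.sqrt(i)) / math.sqrt(i) * prod_others
        d.append(-dfdxi)
    return d

# sanity check of the analytic derivative against a central difference
x0 = [0.5, -1.2, 2.0]
h = 1e-6
ok = True
for i in range(3):
    xp, xm = x0[:], x0[:]
    xp[i] += h
    xm[i] -= h
    num = (griewank(xp) - griewank(xm)) / (2 * h)
    ok = ok and abs(-num - griewank_dxdt(x0)[i]) < 1e-5
```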
We now define the Lyapunov-based PSO dynamics (LyPSO) by adding the local and global attractor terms of the classical PSO to the derived expression for the asymptotically stable Lyapunov dynamics, given in Equations (5) and (6):

    V_j(t+1) = −∇f(X_j(t)) + α_l(t) (P_lj(t) − X_j(t)) + α_g(t) (P_g(t) − X_j(t)),   (5)
    X_j(t+1) = X_j(t) + V_j(t+1),   (6)

where ∇f denotes the gradient of the scalar function f to be optimized.
The first term on the right-hand side of Equation (5) ensures motion of the particle towards a minimum, while the second and third terms control the motion towards the local and global optima, respectively. It is apparent from Table 1 that the dx_i/dt obtained for the different Lyapunov-like surfaces includes a factor of (−x_i). The condition under which dx_i/dt reduces to −ωx_i is tabulated for all eight benchmark functions in Table 2. Consequently, instead of computing dx_i/dt by the approach stated earlier, we can simply add a term −ωx_i to the i-th component of the updated velocity in the classical PSO. The resulting dynamics is given by Equations (7) and (8):

    V_j(t+1) = ω V_j(t) − ω X_j(t) + α_l(t) (P_lj(t) − X_j(t)) + α_g(t) (P_g(t) − X_j(t)),   (7)
    X_j(t+1) = X_j(t) + V_j(t+1).   (8)

The dynamics given by Equations (7) and (8) is referred to as Position-based PSO (PPSO).
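A single PPSO update per Equations (7) and (8) can be sketched as follows; the function name and calling convention are our own:

```python
def ppso_step(x, v, p_l, p_g, omega, a_l, a_g):
    """One PPSO update per Equations (7)-(8): the classical PSO velocity
    rule augmented with the negative position term -omega * x_i."""
    v_new = [omega * vi - omega * xi + a_l * (pl - xi) + a_g * (pg - xi)
             for xi, vi, pl, pg in zip(x, v, p_l, p_g)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new
```

With the attractor coefficients set to zero, the update reduces to the attractor-free dynamics analyzed in Theorem 2.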
Table 2 (recovered in part): the condition under which dx_i/dt reduces to −ωx_i holds when x_i is very small for the Step and Ackley's functions, and is listed as unconditional for others.
For the sake of completeness of our study, we consider a third category of the dynamics, where the inertial term is dropped from the PSO dynamics, as indicated in Equations (9) and (10). The modified dynamics, called Steepest-PSO (SPSO) for its fast convergence (vide Section 3), is formally given below:

    V_j(t+1) = −ω X_j(t) + α_l(t) (P_lj(t) − X_j(t)) + α_g(t) (P_g(t) − X_j(t)),   (9)
    X_j(t+1) = X_j(t) + V_j(t+1).   (10)

In the next section, we justify the reasons for the accuracy and speed-up of PPSO and SPSO over the classical PSO.
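The corresponding SPSO update of Equations (9) and (10) differs from PPSO only in replacing the inertia term ω·v by the weighted negative position −ω·x; again, the function name is our own:

```python
def spso_step(x, p_l, p_g, omega, a_l, a_g):
    """One SPSO update per Equations (9)-(10): the inertia term omega*v
    is replaced by the randomly weighted negative position -omega*x,
    so the previous velocity does not enter the update at all."""
    v_new = [-omega * xi + a_l * (pl - xi) + a_g * (pg - xi)
             for xi, pl, pg in zip(x, p_l, p_g)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new
```

Without attractors (a_l = a_g = 0) each step contracts the position to (1 − ω)·x, which is the dynamics analyzed in Theorem 1.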

The Rationale of Speed-up of the PPSO and SPSO Dynamics over Classical PSO
To compare the relative performance in speed-up and convergence of the proposed algorithms, we study the stability behavior of the proposed PPSO and SPSO dynamics in the absence of the local and global attractors. This is performed by solving the corresponding difference equations. The condition for asymptotic stability and the location of the stable point can be ascertained from the solution of the dynamics. Theorems 1 and 2 provide interesting results, indicating asymptotic stability of the SPSO and PPSO dynamics at the origin irrespective of the search landscape, whereas Theorem 3 indicates asymptotic stability of the classical PSO at a stable point, which need not be the origin. The rate at which the particle position approaches the origin further indicates that the speed of convergence of the SPSO algorithm far exceeds that of PPSO, while the speed of PPSO in turn beats the classical PSO.

Theorem 1:
The dynamics of the j-th particle in the i-th dimension given by

    v_ij(t+1) = −ω x_ij(t),  x_ij(t+1) = x_ij(t) + v_ij(t+1)   (11)

is asymptotically stable with a stable point at the origin when 0 < ω < 1.

Proof. Let E be an extended difference (shift) operator, such that E x(t) = x(t+1). Extending the concept of derivatives to the discrete time domain, Equation (11) can now be written as

    (E − (1 − ω)) x_ij(t) = 0.   (12)

Consequently, the solution of the dynamics (11) is given by

    x_ij(t) = A (1 − ω)^t,   (13)

where A is a constant. Expression (13) indicates that for 0 < ω < 1, x_ij(t) asymptotically converges to the origin as t → ∞. This proves the theorem. □
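The closed-form solution (13) can be verified directly against the recurrence; ω = 0.6 here is an arbitrary choice within (0, 1):

```python
def spso_trajectory(x0, omega, steps):
    """Iterate x(t+1) = (1 - omega) * x(t), the attractor-free SPSO
    dynamics of Equation (11)."""
    xs = [x0]
    for _ in range(steps):
        xs.append((1 - omega) * xs[-1])
    return xs

omega = 0.6
xs = spso_trajectory(5.0, omega, 50)
# closed form (13): x(t) = A * (1 - omega)**t with A = x(0)
closed = [5.0 * (1 - omega) ** t for t in range(51)]
```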

Theorem 2:
The dynamics of the j-th particle in the i-th dimension given by

    v_ij(t+1) = ω v_ij(t) − ω x_ij(t),  x_ij(t+1) = x_ij(t) + v_ij(t+1)   (14)

is asymptotically stable with a stable point at the origin when 0 < ω < 1.

Proof. Since v_ij(t) = x_ij(t) − x_ij(t−1), we can rewrite Equation (14) as

    x_ij(t+1) = x_ij(t) − ω x_ij(t−1),  i.e.,  (E² − E + ω) x_ij(t−1) = 0.   (15)

So, the solution of the dynamics (15) is given by:

    x_ij(t) = A λ1^t + B λ2^t,   (16)

where A and B are constants, and λ1, λ2 = (1 ± √(1 − 4ω))/2 are the roots of the characteristic equation λ² − λ + ω = 0. Since λ1 λ2 = ω and λ1 + λ2 = 1, both roots have magnitude less than one whenever 0 < ω < 1: for ω > 1/4 the roots are complex conjugates with |λ1,2|² = ω < 1, and for 0 < ω ≤ 1/4 both real roots lie in (0, 1). It is therefore apparent from expression (16) that x_ij(t) asymptotically converges to the origin for 0 < ω < 1, and the dynamics is asymptotically stable with a stable point at the origin. This proves the theorem. □
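Similarly, the PPSO recurrence of Equation (14) can be simulated to confirm both the second-order form used in the proof and convergence to the origin; the initial values and ω = 0.6 are arbitrary illustrative choices:

```python
def ppso_free(x0, v0, omega, steps):
    """Attractor-free PPSO dynamics of Equation (14)."""
    x, v = x0, v0
    xs = [x0]
    for _ in range(steps):
        v = omega * v - omega * x
        x = x + v
        xs.append(x)
    return xs

omega = 0.6
xs = ppso_free(5.0, 0.0, omega, 200)

# second-order form used in the proof: x(t+1) = x(t) - omega * x(t-1)
second_order_ok = all(
    abs(xs[t + 1] - (xs[t] - omega * xs[t - 1])) < 1e-9
    for t in range(1, 200)
)
```

For ω = 0.6 the characteristic roots are complex with magnitude √0.6 ≈ 0.775, so the trajectory spirals into the origin.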

Theorem 3:
The dynamics of the j-th particle in the i-th dimension given by

    v_ij(t+1) = ω v_ij(t),  x_ij(t+1) = x_ij(t) + v_ij(t+1)   (17)

is asymptotically stable and converges to a stable point, which need not be zero.

Proof. We can rewrite Equation (17) as

    v_ij(t+n) = ω^n v_ij(t)   (18)

for any positive integer n. Consequently, Equation (18), together with the position update, is transformed to:

    x_ij(t) = x_ij(0) + v_ij(0) Σ_{k=1..t} ω^k   (19)
            = x_ij(0) + v_ij(0) ω (1 − ω^t)/(1 − ω).   (20)

So, the solution of the dynamics (17) is given by:

    x_ij(t) = A + B ω^t,   (21)

where A = x_ij(0) + ω v_ij(0)/(1 − ω) and B = −ω v_ij(0)/(1 − ω) are constants. It is apparent from expression (21) that x_ij(t) asymptotically converges to A as time t approaches infinity. Since A is not zero unconditionally, the statement of the theorem follows. □

Figure 1 shows the variation of dx_i/dt with respect to time for ω = 0.6 and ω = 0.8. It is apparent from the graphs that, in the absence of local and global attractors, the dynamics of SPSO converges faster than that of PPSO, which in turn converges faster than the classical PSO.
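A quick simulation of the attractor-free classical PSO dynamics (17) confirms convergence to the nonzero limit A = x(0) + ωv(0)/(1 − ω); the initial values below are arbitrary:

```python
def pso_free(x0, v0, omega, steps):
    """Attractor-free classical PSO dynamics of Equation (17):
    v(t+1) = omega * v(t), x(t+1) = x(t) + v(t+1)."""
    x, v = x0, v0
    for _ in range(steps):
        v = omega * v
        x = x + v
    return x

omega = 0.6
x_inf = pso_free(5.0, 2.0, omega, 500)
# closed-form limit from expression (21): A = x(0) + omega * v(0) / (1 - omega)
A = 5.0 + omega * 2.0 / (1 - omega)
```

Here A = 8, so the particle settles away from the origin, unlike the SPSO and PPSO dynamics of Theorems 1 and 2.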

Benchmarks
In order to study the performance of the three proposed alternative PSO dynamics, we used the eight well-known benchmark functions listed in Table 1. All the functions listed here have their global minimum at the origin, except the Rosenbrock function. The performance of the three proposed dynamics on these eight functions is compared with that of the classical PSO.

Parametric Range and Error Criterion
Early methods of performance evaluation for evolutionary algorithms were restricted to symmetric initializations. In recent times, researchers prefer asymmetric initialization [25]. We here used the asymmetric initialization method to evaluate the performance of the three proposed dynamics along with the classical PSO. In Table 4, we provide the initialization range of the objective function variables, the position of the theoretical optimum, and the error criterion used to terminate the algorithm. Different error criteria were used for different benchmark functions.

Simulation Strategies
Parameter selection for the PSO dynamics is also a crucial issue for the speed-up and accuracy of the PSO algorithm. For a given benchmark function, we initially took wider ranges of the PSO dynamics parameters α_g(t), α_l(t), and ω. The initial ranges selected in our simulation were α_g(t) in [0, 4], α_l(t) in [0, 2], and ω < 1. Several hundred runs of the PSO programs with random parameter settings in the above ranges confirm that, for a specific function, the best choices of parameters are restricted to narrower ranges, as indicated in Table 5. The following observations readily follow from Table 5.

Experimental Results from the Simulations
The relative comparison of the convergence time of the three algorithms with respect to the classical PSO is given in Figures 2a-h. It is observed from these figures that SPSO always outperforms PPSO in convergence time and accuracy. It is further revealed from these graphs that PPSO yields better performance in accuracy and convergence time than both the classical PSO and LyPSO. The performance of the four algorithms is summarized with a '≤' operator, where x ≤ y indicates that the performance of y is better than or equal to that of x. Relative performance: classical PSO ≤ LyPSO ≤ PPSO ≤ SPSO. Table 6 provides the mean error and standard deviation of the globally best particle obtained by executing PPSO, SPSO, LyPSO, and the classical PSO over the eight benchmark functions. The error was obtained by taking the Euclidean distance between the theoretical optimum and the position of the best-fit particle for a given program run. The mean error designates the average of the errors over 50 independent runs. In order to make the comparison fair, all runs of all the algorithms were started from the same initial population. The variance denotes the second moment of the errors with respect to the mean error. It is clear from Table 6 that the mean error of the SPSO algorithm is comparable to, but less than, that obtained by the PPSO algorithm, and the mean error obtained by the PPSO algorithm is marginally less than that of the LyPSO algorithm. Further, the mean error obtained by the LyPSO algorithm is less than that of the classical PSO algorithm. This confirms that the SPSO algorithm outperforms PPSO and LyPSO, and definitely the classical PSO algorithm, from the point of view of accuracy of the solution. The speed-up of the SPSO algorithm has already been demonstrated in the graphs of Figures 2a-2h.
Table 7 shows results of unpaired t-tests between the best and second best algorithms in each case (standard error of difference of the two means, 95% confidence interval of this difference, the t value, and the two-tailed P value). For all cases, sample size = 50 and degrees of freedom = 98. It is interesting to see from Tables 6 and 7 that one or more of the proposed PSO methods can always beat the classical PSO in a statistically significant way.  Table 7. Results of unpaired t-tests on the data of Table 6.
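The unpaired t statistics of Table 7 can be recomputed from summary statistics with the pooled-variance formula, which for two equal samples of n = 50 gives the stated 98 degrees of freedom. The numeric inputs below are hypothetical placeholders, not values taken from Table 6:

```python
import math

def unpaired_t(mean1, sd1, mean2, sd2, n=50):
    """Unpaired (pooled-variance) t statistic from summary statistics,
    with df = 2n - 2 = 98 for n = 50 runs per algorithm."""
    pooled_var = ((n - 1) * sd1**2 + (n - 1) * sd2**2) / (2 * n - 2)
    se = math.sqrt(pooled_var * (2.0 / n))   # standard error of the difference
    t = (mean1 - mean2) / se
    return t, se, 2 * n - 2

# hypothetical illustrative numbers, not taken from Table 6
t, se, df = unpaired_t(0.012, 0.004, 0.010, 0.003, n=50)
```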
Our experimental results suggest that for multi-modal problems, with the fitness landscape punctuated by multiple local optima, the SPSO dynamics is the most preferable choice. However, for uni-modal functions, LyPSO and SPSO are nearly equivalent in terms of their final accuracy and convergence speed. In order to compare the scalability of the proposed PSO variants against the growth of dimensionality of the search space, we plot the number of fitness function evaluations against the dimension of the search landscape. The results shown in the figures are averaged over 50 independent runs of the PSO program. It is clear from Figure 3 that the number of fitness function evaluations for PPSO and SPSO does not increase significantly in comparison to that of the LyPSO and classical PSO algorithms. The PPSO, SPSO, LyPSO, and PSO algorithms have been executed on the eight benchmark functions, and for each algorithm the average convergence time over 50 independent runs to meet the error limit for the individual function, as specified in Table 6, is recorded in Table 8. It is clear from this table that the mean convergence time of SPSO is less than that of PPSO. The mean convergence time of PPSO is less than that of LyPSO, and the latter is less than the mean convergence time of the classical PSO. The above phenomenon holds for all benchmark functions except the Sphere function, where LyPSO and SPSO give identical results because of the identical functional form of the SPSO and LyPSO dynamics for this function.

Conclusions
The classical g-best PSO has a proven impact on the optimization of multi-modal nonlinear objective functions. However, for many nonlinear continuous multi-modal functions, where partial derivatives with respect to the objective function variables exist, the classical g-best PSO is not very efficient, as it does not utilize gradient information of the search landscape. The paper bridges the gap between gradient-free and gradient-based optimization algorithms. It does not truly utilize gradient information of the search space, but it requires the background information that the gradient of the surface exists. When this prerequisite knowledge about the search space is available, we can extend the classical g-best PSO algorithm by the principles outlined in the paper.
Three alternative approaches to improve the speed of convergence of the PSO dynamics over continuous fitness landscapes are discussed in the paper. The first approach replaces the inertial term in the dynamics by a factor that ensures asymptotic stability of the PSO dynamics. Construction of such dynamics presumes that the surface is Lyapunov-like. This, however, is not a very restrictive assumption, as many multi-modal surfaces satisfy the conditions for a Lyapunov function. Moreover, the Lyapunov-based extension, even without local and global attractors, has a natural tendency to move towards optima on the surface. The convergence of the algorithm to the local and global optima, however, is controlled by the presence of attractors in the PSO dynamics.
The second alternative approach to make the PSO smarter was derived from the Lyapunov-based formulation, simply by noting that the Lyapunov-based dynamics includes a factor of negatively weighted particle position. Incorporation of this new term into the existing velocity adaptation rule of the classical PSO gives birth to the second alternative form of the extended PSO dynamics. The resulting dynamics has been found to be asymptotically stable for a selective range ω < 1, i.e., the same as in the classical PSO. The third extension lies in the replacement of the inertial term by the negative position of the particle itself. A random factor is attached to this term to maintain the explorative power of the PSO dynamics and avoid premature convergence. Computer simulations confirm that the third alternative form of the extended PSO dynamics results in a significant improvement in convergence time and accuracy compared to the results obtained by the first and second approaches. Moreover, all three approaches outperform the classical PSO dynamics from the point of view of convergence time and accuracy.
Future research efforts will focus on extensions of the Lyapunov-based dynamics of g-best PSO for handling non-continuous multi-modal functions. The particle dynamics will be combined with an estimate of the gradient of the function, instead of using any analytical expression for the partial derivatives of the objective function. The Lyapunov-based particle dynamics will also be examined in the context of nonlinearly constrained optimization problems, where only a portion of the search space will be used to generate feasible optimal solutions.