Heterogeneous Cooperative Bare-Bones Particle Swarm Optimization with Jump for High-Dimensional Problems

Abstract: This paper proposes a novel Bare-Bones Particle Swarm Optimization (BBPSO) algorithm for solving high-dimensional problems. BBPSO is a variant of Particle Swarm Optimization (PSO) that is based on a Gaussian distribution. The BBPSO algorithm does not consider the selection of controllable parameters for PSO and is a simple but powerful optimization method. This algorithm, however, is vulnerable to high-dimensional problems, i.e., it easily becomes stuck at local optima and is subject to the “two steps forward, one step backward” phenomenon. This study improves its performance on high-dimensional problems by combining heterogeneous cooperation, based on the exchange of information between particles, to overcome the “two steps forward, one step backward” phenomenon, with a jumping strategy to avoid local optima. The CEC 2010 Special Session on Large-Scale Global Optimization (LSGO) identified 20 benchmark problems that provide convenience and flexibility for comparing various optimization algorithms specifically designed for LSGO. Simulations are performed using these benchmark problems to verify the performance of the proposed optimizer by comparing its results with those of other variants of the PSO algorithm.


Introduction
Many optimization problems in modern engineering, e.g., optimal design and scheduling problems, must be solved with finite resources that should be used efficiently. In particular, most of these problems are high-dimensional and complex [1][2][3]. Therefore, the recent focus of optimization techniques has been on solving complex and high-dimensional problems, as described in [4][5][6][7][8].
The Particle Swarm Optimization (PSO) algorithm [9,10] is a metaheuristic inspired by the social behavior of birds flocking or fish schooling; the algorithm was created by simplifying this social behavior. In the algorithm, the so-called particles find a population of candidate solutions to a given optimization problem by moving in a search space according to simple mathematical formulas that are related to the particles' positions and velocities. The PSO algorithm can search the solution spaces of optimization problems with few or no assumptions about the problems, even those that involve searching relatively large spaces. Additionally, to be solved using the PSO algorithm, a problem need not be differentiable, and PSO can be robustly used with problems that include uncertainties such as noise or changes over time. Therefore, PSO algorithms are widely and frequently used to solve optimization problems because they are simple to implement, stable, and high-performing. This paper applies the Bare-Bones PSO (BBPSO) algorithm [19] to high-dimensional problems, overcoming several of its weaknesses with a jumping strategy [17], a cooperative concept [20], and the exchange of information between heterogeneous swarms. The remainder of this paper is organized as follows: In Section 2, detailed mathematical models of the standard PSO and BBPSO algorithms are briefly introduced, and the cooperative learning and jumping strategies are described. Section 3 provides a detailed explanation of the proposed HCBBPSO-Jx algorithms. In Section 4, we verify the performance of the HCBBPSO-Jx algorithms using the 20 benchmark functions provided by the IEEE Congress on Evolutionary Computation 2010 (CEC 2010) Special Session on Large-Scale Global Optimization (LSGO), which provide convenience and flexibility for comparing various optimization algorithms. Using these large-scale global optimization problems, we compare the results with those of the other PSO variants. Finally, Section 5 concludes this paper.

The Particle Swarm Optimization (PSO) Algorithm
This section describes the mathematical model and the search procedure of the standard PSO algorithm. First, let f : ℝ^n → ℝ be the cost or objective function that we should minimize. A candidate solution takes the form of a vector of real numbers, and the output of the function is a real number, which is the value of the objective function for the given candidate solution. The goal of this optimization problem is to find a solution x* such that f(x*) ≤ f(x) for all x in the search space, which is bounded by the values b_l and b_u.
Algorithm 1 shows the process of the standard PSO algorithm [9,10]. The parameter N_P is the size of the population (called a swarm in the PSO algorithm) or the number of particles. As mentioned above, the parameters w, ϕ_p, and ϕ_g represent the inertia weight and the acceleration constants, respectively. Each particle has a position x_i(k) ∈ ℝ^n in the search space and a velocity v_i(k) ∈ ℝ^n at time k. The vector p_i(k) is the best known position of particle i, and the vector g(k) is the best known position of the entire swarm at time k. These are also called "p_best" and "g_best," respectively.
Before the algorithm begins, each particle's position and velocity are initialized with uniformly distributed random vectors, i.e., x_i(1) ∼ U(b_l, b_u) and v_i(1) ∼ U(−|b_u − b_l|, |b_u − b_l|), respectively. The vector p_i(1) is first initialized to the vector x_i(1). Finally, the vector g(1) is initialized with the best of the vectors p_i(1).

Algorithm 1 (excerpt):

9: Update(v_{i,d}(k)); Using Equation (1)
10: end for
11: Update(x_i(k)); Using Equation (2)
12: if f(x_i(k)) < f(p_i(k)) then
13: ...

Next, the particles' search begins. For each dimension of each particle, an element v_{i,d} is determined as follows:

v_{i,d}(k+1) = w v_{i,d}(k) + ϕ_p r_p (p_{i,d}(k) − x_{i,d}(k)) + ϕ_g r_g (g_d(k) − x_{i,d}(k)),   (1)

where the parameters r_p and r_g are uniformly distributed random numbers in the range [0, 1], i.e., U(0, 1). In the first part of Equation (1), the inertia weight w represents the amount of momentum the particles have. The second part is the "cognition" part, which represents the independent behavior of each particle. The final part is the "social" part, which represents the collaboration among the particles. The constants ϕ_p and ϕ_g determine the relative influences of the cognition and social parts and eventually pull each particle toward the positions p_best and g_best. The next position of particle i, x_i(k+1), is calculated using Equation (1) as follows:

x_i(k+1) = x_i(k) + v_i(k+1).   (2)

Subsequently, if the value of f(x_i(k)) is smaller than the value of f(p_i(k)), then the vector p_i(k) is updated to x_i(k). In addition, if the value of f(p_i(k)) is smaller than the value of f(g(k)), then the vector g(k) is updated to p_i(k).
This process is repeated until a termination criterion is met, e.g., the maximum number of iterations or a solution with an acceptable objective function value is reached.
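The loop described above can be summarized in a minimal Python sketch (an illustration only; the function name, parameter defaults, and termination rule are assumptions, not the paper's implementation, which was written in MATLAB):

    import numpy as np

    def standard_pso(f, n, b_l, b_u, NP=50, w=0.7, phi_p=1.5, phi_g=1.5, max_iter=1000):
        """Minimal standard PSO; minimizes f over the box [b_l, b_u]^n."""
        rng = np.random.default_rng()
        span = abs(b_u - b_l)
        x = rng.uniform(b_l, b_u, (NP, n))        # x_i(1) ~ U(b_l, b_u)
        v = rng.uniform(-span, span, (NP, n))     # v_i(1) ~ U(-|b_u - b_l|, |b_u - b_l|)
        p = x.copy()                              # p_i(1) = x_i(1)
        fp = np.apply_along_axis(f, 1, p)
        g = p[fp.argmin()].copy()                 # g(1): best of the p_i(1)
        for _ in range(max_iter):
            r_p = rng.random((NP, n))
            r_g = rng.random((NP, n))
            v = w * v + phi_p * r_p * (p - x) + phi_g * r_g * (g - x)   # Equation (1)
            x = x + v                                                   # Equation (2)
            fx = np.apply_along_axis(f, 1, x)
            improved = fx < fp                    # update p_best, then g_best
            p[improved] = x[improved]
            fp[improved] = fx[improved]
            g = p[fp.argmin()].copy()
        return g, fp.min()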

The Bare-Bones PSO (BBPSO) Algorithm
As mentioned above, the BBPSO algorithm does not have to set up the parameters w, ϕ_p, and ϕ_g, unlike the standard PSO algorithm. In the BBPSO algorithm, the next position of a particle is determined by sampling a Gaussian distribution whose mean is the average of the globally best position of the swarm, g_best, and the personally best position of the particle, p_best, and whose standard deviation is given by the absolute difference between g_best and p_best. For each element (or dimension) of a particle, the next position is determined in the BBPSO algorithm using the following equations instead of Equations (1) and (2) of the standard PSO algorithm:

x_{i,d}(k+1) = N(µ_{i,d}(k), σ²_{i,d}(k)),   (3)
µ_{i,d}(k) = (p_{i,d}(k) + g_d(k)) / 2,   (4)
σ_{i,d}(k) = |p_{i,d}(k) − g_d(k)|,   (5)

where N(µ_{i,d}(k), σ²_{i,d}(k)) is a random number generator based on a Gaussian distribution with the mean µ_{i,d}(k) in Equation (4) and the standard deviation σ_{i,d}(k) in Equation (5) for the d-th dimension of particle i. Except for this step, the procedure is equivalent to that of the standard PSO algorithm.
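For a single particle, the BBPSO move reduces to elementwise Gaussian sampling. A minimal sketch (a hypothetical helper, assuming NumPy):

    import numpy as np

    def bbpso_move(p_i, g, rng):
        """One BBPSO position update per Equations (3)-(5): sample each dimension
        from a Gaussian whose mean is the midpoint of p_best and g_best and whose
        standard deviation is their absolute difference."""
        mu = (p_i + g) / 2.0          # Equation (4)
        sigma = np.abs(p_i - g)       # Equation (5)
        return rng.normal(mu, sigma)  # Equation (3)

Note that in any dimension where p_i and g coincide, the standard deviation is zero and the particle stops moving in that dimension, which is one reason an escape mechanism such as the jumping strategy below is useful.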

The Cooperative Approach
Generally, in population-based algorithms, including the PSO algorithm, an agent in one population represents an intact n-dimensional candidate solution. In the standard PSO algorithm, there is one swarm of N_P particles, each of which has n components, that attempts to find an optimal n-dimensional solution. However, in this case, the algorithm frequently undergoes the "two steps forward, one step backward" phenomenon described in [20], especially when it is solving a high-dimensional problem. The appearance of this phenomenon means that although the fitness of a particle (or a candidate solution vector) may be considerably improved during the next time step, some of its components may have changed from a better value to a rather poor value. For a problem with a three-dimensional solution vector, for example, an improvement in two components (two steps forward) can overrule a potentially good value of the remaining component (one step backward). Eventually, valuable information is unintentionally lost.
One solution to the "two steps forward, one step backward" problem is to evaluate the objective function more frequently, perhaps each time a component in the candidate solution vector is updated, which results in much quicker feedback. However, in this case, a problem remains: the function is only evaluated for a complete n-dimensional solution vector, so after a specific component is updated, the values of the n − 1 other components of the candidate vector still need to be chosen. One method of overcoming these problems, the Cooperative PSO (CPSO) algorithm proposed in [20], employs a cooperative approach. In the CPSO algorithm, unlike the standard PSO algorithm, the solution vector is split into its components so that K swarms of N_P particles containing ⌈n/K⌉-dimensional or ⌊n/K⌋-dimensional components, where K is a pre-determined parameter called the split factor, are optimized. This approach effectively increases the solution diversity and the amount of information exchanged, helping to avoid the "two steps forward, one step backward" phenomenon.
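The split itself can be sketched as follows (a hypothetical helper, assuming only the ⌈n/K⌉/⌊n/K⌋ partition just described):

    import numpy as np

    def split_dimensions(n, K):
        """Partition the n solution dimensions into K index groups: K_1 = n mod K
        groups of ceil(n/K) dimensions and K - K_1 groups of floor(n/K) dimensions."""
        K_1 = n % K
        sizes = [n // K + 1] * K_1 + [n // K] * (K - K_1)
        return np.split(np.arange(n), np.cumsum(sizes)[:-1])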

The Jumping Strategy
When variants of the PSO algorithm are applied to optimization problems with many local optima in a high-dimensional search space, they may become stuck at the local optima. The jumping strategy [17][21][22][23] was proposed to escape from local optima, and promising results have been obtained. This strategy has been implemented as a mutation operator in Evolutionary Algorithms (EAs) based on Gaussian and Cauchy probability distributions.
The goal of the jumping strategy is to allow particles in PSO algorithms to escape from local optima to which they have been prematurely attracted. The motion of the particles in this situation stagnates, with no improvement in their fitness. Whether a particle is stagnating can be determined by monitoring its fitness; the stagnating particles then move to new points selected using the jumping strategy. This aim can be accomplished by introducing a stagnation counter (C_{F,j} for each particle j in this paper) that monitors the fitness of each particle and is increased by one during each iteration without improvement until its value reaches a pre-determined maximum number of iterations, which is called the maximum stagnation interval in [17] (M_F in this paper).
When a particle should jump to a new point, its next position is determined by choosing between Gaussian and Cauchy jumps as follows:

x_i(k+1) = p_i(k)(1 + η N(0, 1)),   (6)
x_i(k+1) = p_i(k)(1 + η C(0, 1)),   (7)

where the parameter η is for scaling and the vector p_i(k) is the best known position of particle i. In Equation (6), N(0, 1) is a random number generated using a Gaussian probability distribution with a mean of 0 and a standard deviation of 1. In Equation (7), C(0, 1) is a random number generated using a Cauchy probability distribution with γ = 1 centered at the origin and described by

f(x) = 1 / (π(1 + x²)).
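The following minimal sketch (Python assumed; the helper name, calling convention, and counter handling are illustrative) combines the stagnation counter with a Gaussian or Cauchy jump as reconstructed in Equations (6) and (7); an additive perturbation of p_i would be an equally simple variant:

    import numpy as np

    def maybe_jump(x_i, p_i, improved, C_F, j, M_F, eta, rng, cauchy=False):
        """Stagnation-triggered jump: C_F[j] counts update failures of particle j;
        on reaching M_F, the particle jumps from its best known position p_i using
        Equation (6) or (7), and the counter is reset."""
        C_F[j] = 0 if improved else C_F[j] + 1
        if C_F[j] < M_F:
            return x_i                            # keep the normal BBPSO move
        C_F[j] = 0
        step = rng.standard_cauchy(p_i.shape) if cauchy else rng.standard_normal(p_i.shape)
        return p_i * (1.0 + eta * step)           # Gaussian or Cauchy jump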

Heterogeneous Cooperative BBPSO with Jumping (HCBBPSO-Jx) Algorithms
Algorithm 2 shows detailed pseudo-code for the Heterogeneous Cooperative BBPSO with Jumping (HCBBPSO-Jx) algorithms proposed in this paper. The proposed algorithms consist of three main parts: a cooperative BBPSO step, a BBPSO with jumping step, and a cooperative part in which information is exchanged between the two heterogeneous algorithms. First, the parameters for the HCBBPSO-Jx algorithms are initialized. The matrices x and y contain the current position vectors of the particles in swarms P and swarm Q, respectively. Additionally, the matrices p and q (the bottom matrices in Figure 2) contain the p_best information of swarms P and swarm Q, respectively. The vector g_P (the top vectors in Figure 2) stores the g_best information of swarms P, and the vector g_Q stores that of swarm Q.

The Cooperative BBPSO (CBBPSO) Step
Once the necessary parameters for the HCBBPSO-Jx algorithms have been initialized, the Cooperative BBPSO (CBBPSO) step is performed. This step introduces the aforementioned cooperative approach, modified for use in high-dimensional problems as shown in lines 7-18 of Algorithm 2. Unlike the CPSO algorithm [20], the HCBBPSO-Jx algorithms are based on the BBPSO algorithm instead of the traditional PSO algorithm, i.e., they use the equation of motion of the BBPSO algorithm, Equation (3), to move the particles. This method is simpler and more robust than that of the PSO algorithm. Additionally, the algorithm uses the swarms P, which are K swarms of N_P particles, as shown in Figure 2a. Here, the constant K is a pre-determined parameter called the split factor, as in the CPSO algorithm. From the constant K, the parameter K_1 is calculated as K_1 = mod(n, K), and the parameter K_2 is then K − K_1. Of the K swarms P, K_1 contain particles that have K_C-dimensional components, where K_C = ⌈n/K⌉. The particles in the remaining K_2 swarms have K_F-dimensional components, where K_F = ⌊n/K⌋. That is to say, an n-dimensional solution vector is divided into K_1 components of K_C dimensions and K_2 components of K_F dimensions. In addition, to reduce the Number of Function Evaluations (NFEs), the proposed algorithm uses a double if statement for updates (lines 10-15 of Algorithm 2).
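For example, with n = 1000 and K = 7, K_1 = mod(1000, 7) = 6 and K_2 = 7 − 6 = 1, so six of the swarms P optimize K_C = ⌈1000/7⌉ = 143 components each and the remaining swarm optimizes K_F = ⌊1000/7⌋ = 142 components (6 × 143 + 1 × 142 = 1000).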
As in [20], for cooperation (or, more precisely, information exchange) between swarms, a "blackboard," which is a shared memory in which particles can post or read hints, is used. To establish this blackboard, the algorithm introduces a context vector, shown at the top of Figure 2a, which collects the globally best particle from each of the K swarms P and is used to evaluate the particles. To evaluate all of the particles in the s-th swarm, the other n − K_X components in the context vector are kept constant while the (1 + (s − 1)K_X)-th to the (s K_X)-th components of the context vector are replaced by each particle from the s-th swarm in turn. The function B(s, z_s) (used in lines 10 and 12 of Algorithm 2) plays this role, creating an n-dimensional vector that is then evaluated.
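A minimal sketch of this evaluation mechanism (hypothetical helper names; `groups` could come from the `split_dimensions` sketch above, and `context` is the context vector assembled from the subswarms' global bests):

    import numpy as np

    def make_context_evaluator(f, groups, context):
        """Return B(s, z_s): copy the context vector, replace only the s-th
        swarm's components with the candidate z_s, and evaluate the resulting
        full n-dimensional vector."""
        def B(s, z_s):
            full = context.copy()
            full[groups[s]] = z_s
            return f(full)
        return B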
The subscript X of K_X is determined as follows: for the s-th swarm, if s ∈ {1, · · · , K_1}, then X = C; otherwise, X = F.

Figure 2. The configurations of g_best and p_best for the HCBBPSO-Jx algorithms: (a) swarms P: K swarms of N_P particles; in the cooperative BBPSO (CBBPSO) step, the swarms P consist of K_1 ⌈n/K⌉-dimensional swarms and K_2 ⌊n/K⌋-dimensional swarms, i.e., K = K_1 + K_2, where K_1 = mod(n, K); (b) swarm Q: an n-dimensional swarm of N_Q particles for the BBPSO algorithm with Jumping (BBPSO-Jx) (one particle = one row vector).

The BBPSO with Jumping (BBPSO-Jx) Step
As the second step, to achieve robust heterogeneous cooperation for solving high-dimensional problems, the proposed HCBBPSO-Jx algorithms introduce the BBPSO algorithm with a jumping strategy. Figure 2b displays the configuration of the g_best and p_best of swarm Q, which has the general form used by the PSO variants. The standard BBPSO algorithm may become stuck at a local optimum when solving a high-dimensional problem with many local optima. In this case, as mentioned in Section 2.4, the jumping strategy enables particles to escape from the local optima. If there is no improvement for the j-th particle, then the update failure counter C_{F,j} is increased by one during each iteration until it reaches the predefined maximum allowable number of update failures, M_F. After reaching M_F, the particle jumps to a new point using Equation (6) or (7). The performances of the two cases, which use the Gaussian and Cauchy probability distributions, are compared, and the best jumping strategy for the proposed HCBBPSO-Jx algorithms is identified in Section 4.
Algorithm 2 (excerpt):

1: Set parameters K: split factor, M_F: maximum allowable number of update failures, η: jump scaling factor, n: dimension of the given problem
2: k = 1
3: Compute K_1, K_2, K_C, and K_F; For splitting the solution vector into components for CBBPSO
4: Initialize swarms P and swarm Q, and C_F: counter vector for update failures
5: while the termination condition is not met do
; CBBPSO step
6:   ...
7:   for s = 1 to K do; For each swarm
8:     for i = 1 to N_P do; For each particle
9:       ...
10:      if f(B(s, x_i(k))) < f(B(s, p_i(k))) then
11:        Update the p_best components (or particle) of swarms P
12:        if f(B(s, p_i(k))) < f(B(s, g_P(k))) then
13:          Update the g_best components of swarms P (K_C or K_F components of the context vector)
14:        end if
15:      end if
16:    end for
17:    Move the swarm using Equations (3)-(5) for swarms P
18:  end for
; BBPSO with Jumping step
19:  for j = 1 to N_Q do; For each particle
20:    if C_{F,j} ≤ M_F then; Perform the original BBPSO
21:      ...

The Steps Involving Cooperation by Exchanging Information between the CBBPSO and BBPSO-Jx Algorithms
The final part of the HCBBPSO-Jx algorithms is where heterogeneous cooperation between the CBBPSO step and the BBPSO with jumping step occurs; the information from the previous two steps is exchanged, as shown in lines 26-30 and 40-43 of Algorithm 2. In this step, information is exchanged once per iteration: n components that are randomly selected from the matrices x and y, each corresponding to a component of its g_best, are substituted. This step helps increase the diversity of the solutions searched.
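One possible reading of this exchange, as a minimal sketch (Python assumed; the helper name and the exact substitution rule are assumptions, since the text leaves the selection details open):

    import numpy as np

    def exchange_information(x, y, g_P, g_Q, rng):
        """One heterogeneous-cooperation exchange, under the reading that each of
        the n dimensions is substituted once per iteration: the d-th component of
        swarm Q's g_best overwrites the d-th component of a randomly chosen row of
        x, and vice versa for the g_best of swarms P and the rows of y."""
        n = len(g_P)
        for d in range(n):
            x[rng.integers(x.shape[0]), d] = g_Q[d]  # Q informs a random P particle
            y[rng.integers(y.shape[0]), d] = g_P[d]  # P informs a random Q particle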
As in other variants of the PSO algorithm, the above three-step process is repeated until a termination criterion is met, e.g., the maximum number of iterations or a solution with an acceptable objective function value is reached.

Comparative Simulations
This section presents the results of, and a discussion of, the comparative simulations performed using five variants of the PSO algorithm. The goal is to verify the performance of the proposed HCBBPSO-Jx algorithms by applying them to the 20 1000-dimensional benchmark functions from the CEC 2010 Special Session. The CEC 2010 Special Session on Large-Scale Global Optimization (LSGO) identified 20 benchmark problems [24] that provide convenience and flexibility for comparing various optimization algorithms that are specifically designed for large-scale global optimization. The test suite includes four types of high-dimensional problems: (1) separable functions; (2) partially separable functions, which have a small number of dependent variables while all of the remaining variables are independent; (3) partially separable functions that consist of multiple independent subcomponents, each of which is m-nonseparable; and (4) fully nonseparable functions. The detailed mathematical formulas for and properties of these functions are described in [24]. Section 4.1 describes the simulation environment and setup. Section 4.2 presents the comparative results evaluated using the Formula One point system, which is the method used in the LSGO challenge posed at the CEC 2010 competition. Finally, Section 4.3 reports the results of the best algorithm tested in the comparative simulations, presented in the format used for the results of the LSGO competition.

The Simulation Environment and Setup
The simulations were conducted using the 20 1000-dimensional minimization problems from CEC 2010. For each problem, the simulation was run 25 times for statistical accuracy. A simulation terminated when the maximum number of function evaluations (MaxNFEs), which was set to 3 × 10^6 for all of the trials, was reached. The simulator and algorithms used in these simulations were implemented in MATLAB for 32-bit Windows 8.1. All of the simulations were performed on four computers, each with an Intel 3.07 GHz processor and 4 GB of RAM. For fairness, each computer ran simulations only for its pre-assigned functions.

For comparison, the five variants of the PSO algorithm comprised the two groups of algorithms shown in Table 1. The first group included three algorithms that are closely related to the proposed HCBBPSO-Jx algorithms: the Cooperative PSO-H_K (CPSO-H_K) algorithm, which was the most robust of the CPSO variants proposed in [20] and was obtained by combining the CPSO-S_K algorithm with the PSO algorithm, and the Bare-Bones PSO with Cauchy and Gaussian jumping algorithms (BBPSOjumpC and BBPSOjumpG, respectively) [17], which improved the ability of the BBPSO algorithm to escape from local optima. The second group comprised well-known variants of the PSO algorithm: the Adaptive PSO (APSO) algorithm [25], which used a parameter adaptation scheme and an elitist learning strategy to improve the PSO algorithm, and the Comprehensive Learning PSO (CLPSO) algorithm [26], which improved the diversification ability of the PSO algorithm by using comprehensive learning, in which all of the historically best information about the other particles is used to update a particle's velocity. All of the algorithms used the same parameters in all of the simulations; these are shown in the table.
The parameters for each algorithm were assigned the values that resulted in the best performance and that were recommended in the literature. In addition, to achieve a fair test, the initial population size was set to 50 for each algorithm. All of the search parameters were initialized using a uniform random process within the search space.

Table 2 shows the results of the simulations performed by the variants of the PSO algorithm after evaluation using the scoring system of the CEC 2010 LSGO Challenge. The scoring system was as follows: for each algorithm, a table of the type shown in Table 7, which contains 300 competition categories, was formed. The competition categories comprised 20 functions (f_1 ∼ f_20), 3 limits on the NFEs (1.2 × 10^5, 6.0 × 10^5, and 3.0 × 10^6), and 5 statistical values (best, median, worst, mean, and standard deviation) over the 25 runs at each limit on the NFEs. The LSGO Challenge then applied the Formula One point system to the data from the challenge participants in each of the 300 categories. Table 3 shows the points for each ranking in the Formula One point system: the winner received 25 points, and the other ranked algorithms received differentiated points according to their rankings. As in the CEC 2010 LSGO Challenge, in this simulation, the smaller the measured value in a category, the higher the ranking and the more points awarded. In particular, a small standard deviation means that the performance was more reliable. Eventually, the participant with the highest total score wins. According to the evaluation using this scoring system, the proposed HCBBPSO-JG algorithm was the best of the participating algorithms. In addition, the HCBBPSO-JG algorithm performed better than the HCBBPSO-JC algorithm that was proposed alongside it.

Tables 4-6 provide the results of the 25 runs of each variant of the PSO algorithm used in the comparison when the NFEs counter reached 3.0 × 10^6 for the 20 functions. We conducted several statistical hypothesis tests on these results. First, we performed the Friedman rank test on the data from all of the algorithms. Next, because there was a significant difference at the 5% significance level, we performed the Wilcoxon signed-rank test between the best algorithm, i.e., the HCBBPSO-JG algorithm, and each of the other algorithms and marked the result of each statistical significance test with its p-value and a sign in the tables; the sign "+" means that the HCBBPSO-JG algorithm was significantly better than the algorithm compared with it, the sign "∼" means that the two were not significantly different, and the sign "−" means that the HCBBPSO-JG algorithm was significantly worse than the algorithm compared with it. The results of the Friedman rank test showed that there were significant differences in the data from all of the algorithms (p-value ≈ 0 < 0.05); therefore, the Wilcoxon signed-rank test was performed between the HCBBPSO-JG algorithm and the other algorithms. In the tables, the measured value written in bold in a colored cell represents the value of the best algorithm for each statistical value.
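To make the scoring concrete, the following minimal sketch shows how one of the 300 categories would be scored (Python assumed; the point table is the standard Formula One scheme, which is consistent with the 25 points for the winner mentioned above, and the helper name is illustrative):

    # Standard Formula One point table: 1st place through 10th place.
    F1_POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]

    def score_category(values_by_algorithm):
        """Rank algorithms in one category (smaller measured value is better)
        and award Formula One points; entries past tenth place score zero."""
        ranked = sorted(values_by_algorithm, key=values_by_algorithm.get)
        return {alg: (F1_POINTS[r] if r < len(F1_POINTS) else 0)
                for r, alg in enumerate(ranked)}

For instance, score_category({"A": 1e3, "B": 5e2}) awards B 25 points and A 18, and the per-category points are summed over all 300 categories to obtain an algorithm's total score.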
In these comparative results, the proposed HCBBPSO-Jx algorithms performed better than the other algorithms; they won for a total of 12 functions in terms of the mean value at the maximum NFEs, which is similar to the result obtained using the CEC 2010 LSGO Challenge scoring system. In particular, the HCBBPSO-JG algorithm stayed ahead of the HCBBPSO-JC algorithm, with significant differences for the three functions f_7, f_16, and f_17. However, the HCBBPSO-Jx algorithms were the weakest on the variants of Rastrigin's function, i.e., f_2, f_5, f_10, and f_15. The cause of this phenomenon can be understood by examining the convergence curves shown in Figure 3. The curves for f_2, f_5, f_10, and f_15 show that the HCBBPSO-JG algorithm prematurely converged to a local optimum only for the functions that were connected to Rastrigin's function. The first cause of this result is that all of the algorithms were simulated using the same total number of particles to ensure a fair comparison. The total number of particles directly affects an algorithm's NFEs because at least one function evaluation must be used each time a particle's candidate solution is evaluated; therefore, the comparison would not be fair if the simulations were performed with different population sizes. The second cause is that the optimal value of the split factor K, which governs the performance of the HCBBPSO-Jx algorithms, was not used. Selecting the parameter K is very hard because the best value depends on the problem to be solved; in particular, the problems addressed by the algorithms proposed in this paper are high-dimensional problems with high complexity. Therefore, the issue of selecting optimal parameter values for the proposed HCBBPSO-Jx algorithms is left for in-depth future research.

Table 4. A comparison of the results of simulations performed by variants of the PSO algorithm. The best, median, worst, mean, standard deviation, p-value, and significance sign of the 25 runs when the NFEs counter reached 3.0 × 10^6 for the functions f_1 ∼ f_7 are reported. The p-value was determined using the Wilcoxon signed-rank test between the best algorithm (HCBBPSO-JG) and the others. The significance sign "+" means that the HCBBPSO-JG algorithm was significantly better than the algorithm compared with it, the sign "∼" means that the two were not significantly different, and the sign "−" means that the HCBBPSO-JG algorithm was significantly worse than the algorithm compared with it. The measured value written in bold in a colored cell for each function represents the value of the best algorithm for each statistical value.

Conclusions
This paper proposed heterogeneous cooperative BBPSO algorithms with a jumping strategy, strengthened for use with high-dimensional optimization problems. The algorithms combine an improved exploration ability, introduced by means of heterogeneous cooperation and the jumping strategy, with the merits of the BBPSO algorithm, which is simple but robust because it does not need to consider the selection of controllable parameters for the PSO algorithm and because its performance is not affected by the values of these parameters.
In the comparative simulations based on the 20 qualified benchmark functions and the evaluation system used in the CEC 2010 LSGO Challenge, the HCBBPSO-Jx algorithms provided improved results for most of the functions; notably, the HCBBPSO-JG algorithm performed the best according to the overall evaluation criteria. Although the proposed algorithms converged prematurely for several benchmark functions related to Rastrigin's function, they will be improved in future work. The results of this study therefore lead us to conclude that the proposed HCBBPSO-JG algorithm is useful as an optimizer for solving high-dimensional problems. In future work, the proposed algorithm will be improved for Rastrigin's function, tested with additional benchmark functions, and compared with other state-of-the-art optimizers. We will also study the split factor K, because it is very hard to select a value of K appropriate to a given problem; in particular, the problems addressed by the algorithms proposed in this paper are high-dimensional problems with high complexity. In addition, further study will be conducted on other parameters to improve the performance of the proposed algorithm.