Optimal Randomness in Swarm-based Search

Swarm-based search has been a hot topic for a long time. Among all the proposed algorithms, cuckoo search (CS) has proved to be an efficient approach for global optimum searching due to its combination of Lévy flights, local search capabilities and guaranteed global convergence. CS uses Lévy flights, which are generated from the Lévy distribution, a heavy-tailed probability distribution, in its global random walk to explore the search space. In this case, large steps are more likely to be generated, which plays an important role in enhancing the search capability. Although the movements of many foragers and wandering animals have been shown to follow a Lévy distribution, the impact of other heavy-tailed probability distributions on CS has not yet been sufficiently investigated. In this paper, four commonly used heavy-tailed distributions, namely the Mittag-Leffler, Pareto, Cauchy and Weibull distributions, are considered to enhance the searching ability of CS. Four novel CS algorithms are then proposed, and experiments are carried out on 20 benchmark functions to compare their searching performance. Finally, the proposed methods are applied to system identification to demonstrate their effectiveness.

* Corresponding author: ychen53@ucmerced.edu

The standard CS algorithm can be improved by integrating different heavy-tailed probability distributions rather than Lévy flights into it, where the Weibull distribution performs best in enhancing the search ability of CS. Finally, an application to parameter identification of unknown fractional-order chaotic systems is considered, with the corresponding experiments performed on the fractional-order financial system. Based on the observations and the analysis of the results, the randomness-enhanced CS algorithms are able to accurately identify the unknown parameters of the fractional-order system with better effectiveness and robustness. CSP together with CSW may be treated as a useful tool for parameter identification, and the randomness-enhanced CS algorithms can be regarded as efficient and promising tools for solving real-world complex optimization problems besides the benchmark problems.
The remainder of this paper is organized as follows. The principle of the original CS algorithm is described in Section 2. Section 3 gives details of four randomness-enhanced CS algorithms after a brief review of several commonly used heavy-tailed distributions. Experimental results and discussions of randomness-enhanced CS algorithms are presented in Section 4. Finally, Section 5 summarizes the conclusions and future work.

Cuckoo Search Algorithm
Cuckoo search (CS), recently developed by Yang and Deb, is considered a simple but very promising stochastic, nature-inspired, swarm-based search algorithm [2,9]. CS is inspired by the intriguing brood parasitism of some cuckoo species, and is enhanced by Lévy flights instead of simple isotropic random walks. Cuckoos are fascinating birds not only for their beautiful sounds but also for their aggressive reproduction strategy. Some species of cuckoo lay their eggs in host nests, and at the same time they may remove the host birds' eggs in order to increase the hatching probability of their own eggs. Three idealized rules [2] help describe CS simply: (1) each cuckoo lays only one egg at a time and dumps it in a randomly chosen nest; (2) the next generations of cuckoos search for new solutions using the best nests, that is, those with high-quality eggs; (3) the number of available host nests is fixed, and the egg laid by a cuckoo is discovered by the host bird with a probability $P_a \in [0, 1]$. In this case, the host bird can either remove the egg or simply abandon the nest and build a completely new one.
The purpose of CS is to replace not-so-good solutions in the nests with new and potentially better solutions (cuckoos). At each iteration, CS employs a balanced combination of a local random walk and a global explorative random walk, controlled by a switching parameter $P_a$. A greedy strategy is applied after each random walk to select the better solutions from the current and newly generated solutions based on their fitness.

Lévy Flights Random Walk
The foraging pattern of the cuckoos is governed by an important factor known as Lévy flights [10]. A Lévy flight is a random walk whose step lengths are drawn from a heavy-tailed probability distribution, so that large steps occur from time to time.
Moreover, CS with a structured random walk based on Lévy flights has been demonstrated to perform more effectively than many existing metaheuristics such as PSO, ABC, and DE [11].
At generation $t$, a global explorative random walk carried out by using Lévy flights can be defined as follows:

$$X_i^{t+1} = X_i^t + \alpha \otimes \text{Lévy}(\lambda)\,(X_i^t - X_{best}), \tag{1}$$

where $\alpha > 0$ is the step size related to the scales of the problem of interest, $X_{best}$ is the best solution obtained so far, the product $\otimes$ represents entrywise multiplication, and Lévy($\lambda$) is defined according to a simple power-law formula as follows:

$$\text{Lévy}(\lambda) \sim t^{-1-\lambda}, \tag{2}$$

where $t$ is a random variable and $0 < \lambda \le 2$ is the stability index. It is worth mentioning that the well-known Gaussian and Cauchy distributions are special cases of the Lévy distribution, obtained when the stability index $\lambda$ is set to 2 and 1, respectively.
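In practice, Lévy($\lambda$) can be computed as follows:

$$\text{Lévy}(\lambda) \sim \frac{\phi \times \mu}{|v|^{1/\lambda}}, \tag{3}$$

where $\lambda$ is a constant suggested as 1.5 [9], $\mu$ and $v$ are random numbers drawn from a normal distribution with mean 0 and standard deviation 1, $\Gamma(\cdot)$ denotes the gamma function, and $\phi$ is expressed as:

$$\phi = \left[\frac{\Gamma(1+\lambda)\,\sin(\pi\lambda/2)}{\Gamma\!\left(\frac{1+\lambda}{2}\right)\lambda\,2^{(\lambda-1)/2}}\right]^{1/\lambda}. \tag{4}$$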
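The practical generation of a Lévy step described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `levy_phi` and `levy_step` are hypothetical and not taken from the original implementation.

```python
import math
import random

def levy_phi(lam):
    """Scale factor phi for stability index lam (0 < lam <= 2)."""
    num = math.gamma(1.0 + lam) * math.sin(math.pi * lam / 2.0)
    den = math.gamma((1.0 + lam) / 2.0) * lam * 2.0 ** ((lam - 1.0) / 2.0)
    return (num / den) ** (1.0 / lam)

def levy_step(lam=1.5, rng=random):
    """One heavy-tailed step: phi * mu / |v|^(1/lam), with mu, v ~ N(0, 1)."""
    mu = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    return levy_phi(lam) * mu / abs(v) ** (1.0 / lam)

if __name__ == "__main__":
    rng = random.Random(1)
    # Most steps are small, but a few are very large (the heavy tail).
    print([round(levy_step(1.5, rng), 3) for _ in range(5)])
```

For the suggested value $\lambda = 1.5$ the factor $\phi$ evaluates to roughly 0.6966, and for $\lambda = 1$ it equals 1.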

Local Random Walk
The local random walk can be defined as:

$$X_i^{t+1} = X_i^t + r \otimes H(P_a - \epsilon) \otimes (X_j^t - X_k^t), \tag{5}$$

where $X_j^t$ and $X_k^t$ are two different randomly selected solutions, $r$ and $\epsilon$ are two independent random numbers with uniform distribution, and $H(u)$ is the Heaviside function. The local random walk utilizes a far field randomization to generate a substantial fraction of new solutions which are sufficiently far from the current best solution. The pseudo-code of the standard CS algorithm is given in Algorithm 1.
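To make the interplay of the two random walks and the greedy selection concrete, the following Python sketch puts them into one loop. It is a simplified illustration, not a transcription of Algorithm 1: the box constraints, the per-coordinate draw of $\epsilon$, the sign convention of the difference to the best solution, the default population size and all function names are assumptions.

```python
import math
import random

def levy_step(lam, rng):
    """Levy step phi * mu / |v|^(1/lam) with mu, v ~ N(0, 1)."""
    num = math.gamma(1.0 + lam) * math.sin(math.pi * lam / 2.0)
    den = math.gamma((1.0 + lam) / 2.0) * lam * 2.0 ** ((lam - 1.0) / 2.0)
    phi = (num / den) ** (1.0 / lam)
    return phi * rng.gauss(0.0, 1.0) / abs(rng.gauss(0.0, 1.0)) ** (1.0 / lam)

def cuckoo_search(f, dim, lower, upper, n_nests=15, p_a=0.25, alpha=0.01,
                  lam=1.5, max_iter=200, seed=0):
    """Minimise f over the box [lower, upper]^dim; returns (best_x, best_f)."""
    rng = random.Random(seed)
    clip = lambda value: min(max(value, lower), upper)
    nests = [[rng.uniform(lower, upper) for _ in range(dim)]
             for _ in range(n_nests)]
    fit = [f(x) for x in nests]

    def greedy(i, cand):
        # Greedy selection: keep the candidate only if it improves nest i.
        fc = f(cand)
        if fc < fit[i]:
            nests[i], fit[i] = cand, fc

    for _ in range(max_iter):
        best = nests[fit.index(min(fit))][:]
        # Global explorative random walk driven by Levy steps.
        for i in range(n_nests):
            x = nests[i]
            cand = [clip(x[d] + alpha * levy_step(lam, rng) * (x[d] - best[d]))
                    for d in range(dim)]
            greedy(i, cand)
        # Local random walk: a coordinate moves only when epsilon < p_a,
        # which plays the role of the Heaviside switch H(p_a - epsilon).
        for i in range(n_nests):
            j, k = rng.sample(range(n_nests), 2)
            r = rng.random()
            x = nests[i]
            cand = [clip(x[d] + r * (nests[j][d] - nests[k][d]))
                    if rng.random() < p_a else x[d]
                    for d in range(dim)]
            greedy(i, cand)

    b = fit.index(min(fit))
    return nests[b], fit[b]

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(cuckoo_search(sphere, dim=5, lower=-5.0, upper=5.0))
```

Because of the greedy selection, the best fitness in the population never gets worse from one iteration to the next.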

Randomness-Enhanced CS Algorithms
The standard CS algorithm uses Lévy flights in the global random walk to explore the search space. The Lévy step is drawn from the Lévy distribution, which is a heavy-tailed probability distribution. In this case, a fraction of large steps is generated, which plays an important role in enhancing the search capability of CS. Although the movements of many foragers and wandering animals have been shown to follow a Lévy distribution [8], the impact of other heavy-tailed probability distributions on CS has not yet been sufficiently investigated. This motivates us to apply the well-known Mittag-Leffler, Pareto, Cauchy and Weibull distributions to the standard CS algorithm, with which more efficient searches are expected to take place in the search space thanks to the long jumps. In this section, a brief review of several commonly used heavy-tailed distributions is given first, and then the scheme of the randomness-enhanced CS algorithms is introduced.

Commonly Used Heavy-Tailed Distributions
This section provides the definition of a heavy-tailed distribution and several examples of commonly used heavy-tailed distributions.

Definition 1 (Heavy-tailed Distribution). A random variable is said to have a (right-) heavy-tailed distribution $F$ if

$$\int_{-\infty}^{\infty} e^{\lambda x}\, dF(x) = \infty \quad \text{for all } \lambda > 0, \tag{6}$$

that is, if and only if $F$ fails to possess any positive exponential moment. Otherwise, $F$ is said to be light-tailed.
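As a quick check of Definition 1, consider first an exponential distribution with rate $\theta > 0$ (the symbol $\theta$ is introduced here only for illustration). Its exponential moment of order $\lambda$ is

$$\int_0^{\infty} e^{\lambda x}\,\theta e^{-\theta x}\,dx = \frac{\theta}{\theta - \lambda} < \infty \quad \text{for } 0 < \lambda < \theta,$$

so a positive exponential moment exists and the distribution is light-tailed. In contrast, for a density with a power-law tail, $f(x) = a\,b^{a} x^{-a-1}$ for $x \ge b$ (the density of the Pareto distribution in Example 2),

$$\int_b^{\infty} e^{\lambda x}\, a\,b^{a} x^{-a-1}\,dx = \infty \quad \text{for every } \lambda > 0,$$

because the integrand $e^{\lambda x} x^{-a-1}$ grows without bound as $x \to \infty$. No positive exponential moment exists, and the distribution is therefore heavy-tailed.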
Example 1 (Mittag-Leffler Distribution). A Mittag-Leffler random number can be generated using the convenient expression proposed by Kozubowski and Rachev [13]:

$$\tau_\beta = -\gamma \ln u \left(\frac{\sin(\beta\pi)}{\tan(\beta\pi v)} - \cos(\beta\pi)\right)^{1/\beta}, \tag{7}$$

where $\gamma$ is the scale parameter, $u, v \in (0, 1)$ are independent uniform random numbers, and $\tau_\beta$ is a Mittag-Leffler random number. For $0 < \beta < 1$, the Mittag-Leffler distribution is a heavy-tailed generalization of the exponential distribution, and it reduces to the exponential distribution when $\beta = 1$.
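A Mittag-Leffler random number can be drawn directly from two uniform random numbers, as the following Python sketch illustrates. The function name `mittag_leffler` is hypothetical. As an implementation choice, the bracketed term $\sin(\beta\pi)/\tan(\beta\pi v) - \cos(\beta\pi)$ is evaluated in the algebraically equivalent form $\sin(\beta\pi(1-v))/\sin(\beta\pi v)$, which cannot turn slightly negative through rounding.

```python
import math
import random

def mittag_leffler(beta, gamma, rng=random):
    """Draw one Mittag-Leffler random number with index beta and scale gamma.

    Assumes 0 < beta <= 1 and gamma > 0.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    v = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    # Equivalent to sin(beta*pi)/tan(beta*pi*v) - cos(beta*pi).
    w = math.sin(beta * math.pi * (1.0 - v)) / math.sin(beta * math.pi * v)
    return -gamma * math.log(u) * w ** (1.0 / beta)

if __name__ == "__main__":
    rng = random.Random(3)
    print([round(mittag_leffler(0.8, 1.0, rng), 3) for _ in range(5)])
```

With $\beta = 1$ the bracketed term equals 1, so the sample reduces to $-\gamma \ln u$, an exponential random number with mean $\gamma$.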

Example 2 (Pareto Distribution). A random variable is said to have a Pareto distribution if its cumulative distribution function has the following expression:

$$F(x) = \begin{cases} 1 - \left(\dfrac{b}{x}\right)^{a}, & x \ge b, \\ 0, & x < b, \end{cases} \tag{8}$$

where $b > 0$ is the scale parameter and $a > 0$ is the shape parameter (Pareto's index of inequality).

Example 3 (Cauchy Distribution). A random variable is said to have a Cauchy distribution if its cumulative distribution function has the following expression:

$$F(x) = \frac{1}{\pi}\arctan\!\left(\frac{x - \mu}{\sigma}\right) + \frac{1}{2}, \tag{9}$$

where $\mu$ is the location parameter and $\sigma$ is the scale parameter.

Example 4 (Weibull Distribution). A random variable is said to have a Weibull distribution if it has a tail function $\bar{F}$ as follows:

$$\bar{F}(x) = e^{-(x/\kappa)^{\xi}}, \quad x \ge 0, \tag{10}$$

where $\kappa > 0$ is the scale parameter and $\xi > 0$ is the shape parameter. The Weibull distribution is heavy-tailed if and only if $\xi < 1$.
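Since the Pareto, Cauchy and Weibull distributions of Examples 2 to 4 all have closed-form distribution or tail functions, random numbers can be drawn by inverting these functions at a uniform random number. The sketch below shows one way to do this in Python; the function names are hypothetical, and inverse-transform sampling is a choice made for this illustration rather than a method stated in the text.

```python
import math
import random

def pareto_sample(b, a, rng=random):
    """Invert F(x) = 1 - (b/x)^a: x = b / u^(1/a) with u uniform on (0, 1]."""
    u = 1.0 - rng.random()
    return b / u ** (1.0 / a)

def cauchy_sample(mu, sigma, rng=random):
    """Invert F(x) = arctan((x - mu)/sigma)/pi + 1/2."""
    u = rng.random()
    return mu + sigma * math.tan(math.pi * (u - 0.5))

def weibull_sample(kappa, xi, rng=random):
    """Invert the tail exp(-(x/kappa)^xi): x = kappa * (-ln u)^(1/xi)."""
    u = 1.0 - rng.random()
    return kappa * (-math.log(u)) ** (1.0 / xi)

if __name__ == "__main__":
    rng = random.Random(5)
    print(round(pareto_sample(1.0, 1.5, rng), 3),
          round(cauchy_sample(0.0, 1.0, rng), 3),
          round(weibull_sample(1.0, 0.5, rng), 3))
```

The medians follow from the same inversion at $u = 1/2$: $b\,2^{1/a}$ for the Pareto distribution, $\mu$ for the Cauchy distribution, and $\kappa (\ln 2)^{1/\xi}$ for the Weibull distribution.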

Improving CS with Different Heavy-Tailed Probability Distributions
In order to search the global solution domain more effectively, four randomness-enhanced cuckoo search algorithms are proposed in this paper. Specifically, the following modified CS methods are considered: (1) CS with Mittag-Leffler distribution, denoted as CSML; (2) CS with Pareto distribution, denoted as CSP; (3) CS with Cauchy distribution, denoted as CSC; (4) CS with Weibull distribution, denoted as CSW. In the modified CS methods, these four heavy-tailed probability distributions are integrated into CS in place of the original Lévy flights in the global random walk. With these distributions, the updating Equation (1) can be reformulated as follows:

$$X_i^{t+1} = X_i^t + \alpha \otimes \text{Mittag-Leffler}(\beta, \gamma)\,(X_i^t - X_{best}), \tag{11}$$

$$X_i^{t+1} = X_i^t + \alpha \otimes \text{Pareto}(b, a)\,(X_i^t - X_{best}), \tag{12}$$

$$X_i^{t+1} = X_i^t + \alpha \otimes \text{Cauchy}(\mu, \sigma)\,(X_i^t - X_{best}), \tag{13}$$

$$X_i^{t+1} = X_i^t + \alpha \otimes \text{Weibull}(\xi, \kappa)\,(X_i^t - X_{best}), \tag{14}$$

where Mittag-Leffler($\beta, \gamma$) in Equation (11) denotes a random number drawn from the Mittag-Leffler distribution; Pareto($b, a$) in Equation (12) denotes a random number drawn from the Pareto distribution; Cauchy($\mu, \sigma$) in Equation (13) denotes a random number drawn from the Cauchy distribution; and Weibull($\xi, \kappa$) in Equation (14) denotes a random number drawn from the Weibull distribution. Compared with the standard CS algorithm, the randomness-enhanced cuckoo search methods differ only in line 8 of Algorithm 1. In detail, a new solution is generated using the different heavy-tailed probability distributions according to Equations (11)-(14). Besides, the jump lengths of CS, CSML, CSP, CSC and CSW (namely, $\alpha \otimes$ Lévy($\lambda$), $\alpha \otimes$ Mittag-Leffler($\beta, \gamma$), $\alpha \otimes$ Pareto($b, a$), $\alpha \otimes$ Cauchy($\mu, \sigma$) and $\alpha \otimes$ Weibull($\xi, \kappa$)) are depicted in Figure 1, where the parameters are given in Table 1.
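Because the four variants differ from the standard algorithm only in the random number used by the global walk, they can be expressed as one update routine with an interchangeable step generator. The Python sketch below illustrates this idea under several assumptions: the function names are hypothetical, the default parameter values are placeholders rather than the values of Table 1, and the step is assumed to scale the difference between the current solution and the best solution.

```python
import math
import random

def make_step_samplers(rng, beta=0.8, gamma=1.0, b=1.0, a=1.5,
                       mu=0.0, sigma=1.0, kappa=1.0, xi=0.5):
    """Return one scalar step generator per variant (placeholder parameters)."""
    def mittag_leffler():
        u = 1.0 - rng.random()  # uniform on (0, 1]
        v = 1.0 - rng.random()
        # Equivalent to sin(beta*pi)/tan(beta*pi*v) - cos(beta*pi).
        w = math.sin(beta * math.pi * (1.0 - v)) / math.sin(beta * math.pi * v)
        return -gamma * math.log(u) * w ** (1.0 / beta)

    return {
        "CSML": mittag_leffler,
        # paretovariate(a) has scale 1, so multiplying by b gives scale b.
        "CSP": lambda: b * rng.paretovariate(a),
        "CSC": lambda: mu + sigma * math.tan(math.pi * (rng.random() - 0.5)),
        # weibullvariate takes (scale, shape).
        "CSW": lambda: rng.weibullvariate(kappa, xi),
    }

def global_walk(x, best, alpha, sampler):
    """Move each coordinate by alpha * step * (x - best), one step per coordinate."""
    return [xd + alpha * sampler() * (xd - bd) for xd, bd in zip(x, best)]

if __name__ == "__main__":
    rng = random.Random(0)
    samplers = make_step_samplers(rng)
    x, best = [1.0, -2.0, 0.5], [0.0, 0.0, 0.0]
    for name, sampler in samplers.items():
        print(name, global_walk(x, best, 0.01, sampler))
```

Passing a different entry of the dictionary to `global_walk` is then the only change needed to switch between the variants.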

Experimental Results
The focus of this study is the effectiveness and efficiency of the proposed randomness-enhanced CS algorithms. To this end, extensive experiments are carried out on a test suite of 20 benchmark functions chosen from the literature [14,15]. The superiority of the randomness-enhanced CS algorithms over the standard CS is tested, and the advantages of applying the four different heavy-tailed probability distributions to CS are then investigated.

Experimental Setup
For the parameter settings of CS, CSML, CSP, CSC and CSW, the probability $P_a$ is set to 0.25 [2] and the scaling factor $\alpha$ is set to 0.01. The proposed randomness-enhanced CS algorithms introduce new parameters to CS: the scale parameter $\gamma$ and the Mittag-Leffler index $\beta$ in CSML; the scale parameter $b$ and the shape parameter $a$ in CSP; the location parameter $\mu$ and the scale parameter $\sigma$ in CSC; and the scale parameter $\kappa$ and the shape parameter $\xi$ in CSW. The values of these newly introduced parameters are listed in Table 1 after an analysis in Section 4.2.
Moreover, the population size is set to NP = D, where D denotes the dimension of the problem, as done in [14], unless otherwise stated. In the experimental studies, Max_FEs (the maximum number of function evaluations) is taken as the termination criterion and set to 10,000 × D. Each algorithm is run 50 times and the averaged results are recorded for each benchmark function. In addition, two non-parametric statistical tests are used to detect differences between the proposed algorithm and the compared algorithms: the Wilcoxon signed-rank test at the 5% significance level and the Friedman test. The symbols "‡", "†" and "=" denote that the average performance of the chosen approach is worse than, better than, and similar to that of the compared algorithm, respectively. The best experimental results for each benchmark problem are marked in boldface for clarity.

Performance Evaluation of Randomness-Enhanced CS Algorithms
In this section, extensive experiments are performed to probe the effect of different heavy-tailed probability distributions on the performance of CS and, at the same time, to determine the optimal randomness for improving CS. In our experiments, the standard CS and the four proposed randomness-enhanced CS algorithms (namely, CSML, CSP, CSC and CSW) are tested on 20 test functions with D = 30. The experimental results are presented in Table 2. Moreover, the comprehensive ranking order is CSW, CSC, CSML, CSP and CS, from best to worst. This indicates that integrating different heavy-tailed probability distributions into CS not only retains the merits of CS but also yields better performance. In addition, the Weibull distribution performs best in enhancing the search ability of CS; that is, CSW provides the optimal randomness for improving CS among all the compared methods on the benchmark problems. To further discuss the convergence speed of the four randomness-enhanced CS algorithms, several test problems (namely $F_{sph}$, $F_{grw}$, $F_1$ and $F_{10}$) at D = 30 are selected, and the convergence curves of the average function error values within Max_FEs over 50 independent runs are plotted in Figure 3. From Figure 3, it can be observed that CSML, CSP, CSC and CSW converge markedly faster than CS. In summary, it can be concluded that the standard CS algorithm can be improved by integrating different heavy-tailed probability distributions, rather than the Lévy distribution, into it.

Application to Parameter Identification of Fractional-Order Chaotic Systems
In this section, the four proposed randomness-enhanced CS algorithms (namely, CSML, CSP, CSC and CSW) are applied to identify unknown parameters of fractional-order chaotic systems, which is a critical issue in chaos control and synchronization. The main task of this section is to further demonstrate that improving CS with different heavy-tailed probability distributions can also effectively tackle real-world complex optimization problems besides the benchmark problems. In fact, by following a non-Lyapunov approach according to the problem formulation suggested in [16], parameter identification of uncertain fractional-order chaotic systems can be converted into a nonlinear function optimization problem.
In the numerical simulation, the fractional-order financial system [17] under the Caputo definition is taken as an example, which can be described as

$$\begin{cases} D^{q_1} x = z + (y - a)\,x, \\ D^{q_2} y = 1 - b\,y - x^{2}, \\ D^{q_3} z = -x - c\,z, \end{cases} \tag{15}$$

where $q_1, q_2, q_3$ are the fractional orders and $a, b, c$ are the system parameters. When $(q_1, q_2, q_3) = (1, 0.95, 0.99)$, $(a, b, c) = (1, 0.1, 1)$, and the initial point is $(x_0, y_0, z_0) = (2, -1, 1)$, the system above is chaotic.

The validity of the proposed methods is further examined by comparing CSML, CSP, CSC and CSW with the standard CS algorithm on parameter identification. In the simulations, the maximum iteration number is set to 200 and the population size is set to 40. For the system to be identified, the step size is set to 0.005 and the number of samples is set to 200. In addition, the same computational effort is used for all the compared algorithms to ensure a fair comparison. Table 3 lists the statistical results of the average identified values, the corresponding relative error values, and the objective function values for system (15). Moreover, Figure 5 shows the convergence curves of the relative error values and objective function values. In terms of Figure 5(d), the objective function values of CSML, CSP, CSC and CSW decline faster than those of CS, among which CSP performs best. It is noteworthy that the convergence curve of CSW is similar to that of CSP and converges to nearly the same value. Therefore, CSW can still be considered a very efficient method for solving optimization problems.
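To illustrate how such an identification problem might be posed numerically, the Python sketch below simulates the fractional-order financial system and scores a candidate $(a, b, c)$ against an observed trajectory. Several parts are assumptions made only for illustration: the Grünwald-Letnikov discretization, the sum-of-squared-errors objective and all function names. The solver and the objective actually used in the experiments follow [16] and may differ.

```python
def gl_coefficients(q, n):
    """Gruenwald-Letnikov binomial coefficients c_0..c_n for order q."""
    c = [1.0]
    for j in range(1, n + 1):
        c.append((1.0 - (1.0 + q) / j) * c[-1])
    return c

def simulate(params, orders=(1.0, 0.95, 0.99), start=(2.0, -1.0, 1.0),
             h=0.005, n=200):
    """Return n + 1 states (x, y, z) of the financial system, step size h."""
    a, b, c = params
    coeff = [gl_coefficients(q, n) for q in orders]
    traj = [tuple(start)]
    for k in range(1, n + 1):
        x, y, z = traj[-1]
        rhs = (z + (y - a) * x, 1.0 - b * y - x * x, -x - c * z)
        state = []
        for d in range(3):
            # Memory term: weighted sum over the whole past of component d.
            memory = sum(coeff[d][j] * traj[k - j][d] for j in range(1, k + 1))
            state.append(rhs[d] * h ** orders[d] - memory)
        traj.append(tuple(state))
    return traj

def objective(params, observed):
    """Sum of squared deviations between simulated and observed states."""
    simulated = simulate(params, n=len(observed) - 1)
    return sum((s - o) ** 2
               for sim_pt, obs_pt in zip(simulated, observed)
               for s, o in zip(sim_pt, obs_pt))

if __name__ == "__main__":
    observed = simulate((1.0, 0.1, 1.0))       # "true" system
    print(objective((1.0, 0.1, 1.0), observed))  # 0.0 at the true parameters
    print(objective((1.2, 0.1, 1.0), observed))  # positive for a wrong guess
```

An optimizer such as CS or one of its randomness-enhanced variants would then minimise `objective` over $(a, b, c)$. For order 1 the discretization above reduces to the explicit Euler method.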
According to the foregoing discussion, the randomness-enhanced CS algorithms are able to accurately identify the unknown parameters of the fractional-order system (15) with better effectiveness and robustness, and CSP together with CSW may be treated as a useful tool for parameter identification.

Figure 5. The convergence curves of the relative error values and objective function values for system (15).

Conclusions
In this paper, we focus on the impact of different heavy-tailed distributions on the performance of CS. By replacing Lévy flights with steps generated from other heavy-tailed distributions, four randomness-enhanced CS algorithms (namely CSML, CSP, CSC and CSW) are presented, based on the Mittag-Leffler, Pareto, Cauchy and Weibull distributions, respectively, in order to improve the optimization performance of CS. The improvement in effectiveness and efficiency is validated through dedicated experiments. The experimental results indicate that all four proposed randomness-enhanced CS algorithms show a significant improvement in effectiveness and efficiency over the standard CS algorithm. In addition, the Weibull distribution performs best in enhancing the search ability of CS; that is, CSW provides the optimal randomness for improving CS among all the compared methods on the benchmark problems. Furthermore, the randomness-enhanced CS algorithms are successfully applied to identify unknown parameters of uncertain fractional-order financial chaotic systems, where CSP and CSW may be treated as the two best choices for parameter identification. In summary, CS with different heavy-tailed probability distributions can be regarded as an efficient and promising tool for solving real-world complex optimization problems besides the benchmark problems.
Regarding future work, we plan to further improve the randomness-enhanced CS algorithms to make them competitive with other state-of-the-art algorithms.