A Novel Optimization Algorithm Combining the Gbest-Guided Artificial Bee Colony Algorithm with Variable Gradients

Abstract: The artificial bee colony (ABC) algorithm, which has been widely studied for years, is a stochastic algorithm for solving global optimization problems. Taking advantage of the information of a global best solution, the Gbest-guided artificial bee colony (GABC) algorithm goes further by modifying the solution search equation. However, the coefficient in its equation is based only on a numerical test and is not suitable for all problems. Therefore, we propose a novel algorithm, named the Gbest-guided ABC algorithm with gradient information (GABCG), to make up for this weakness. Without coefficient factors, a new solution search equation based on variable gradients is established. Besides, the gradients are also applied to differentiate the priority of different variables and to enhance the judgment of abandoned solutions. Extensive experiments are conducted on a set of benchmark functions with the GABCG algorithm. The results demonstrate that the GABCG algorithm is more effective than the traditional ABC algorithm and the GABC algorithm, especially in the latter stages of the evolution.


Introduction
As computer technology gains momentum, more and more researchers develop and apply optimization algorithms to solve optimization problems. These algorithms can be broadly divided into gradient-based and gradient-free (or stochastic) optimization algorithms. Gradient-based optimization algorithms search along the gradient direction. They have a high convergence rate and are particularly suitable for solving problems with a large design space [1]. However, their optimization results converge to the extreme point near the initial parameter point, rather than to the optimum of the entire design space [2]. In contrast, gradient-free optimization algorithms are robust and quite effective for solving multi-modal optimization problems [3]. Such algorithms can be easily integrated into different optimization designs, but they are computationally expensive, especially in cases with numerous design parameters [4]. Due to their random search, these stochastic algorithms tend to converge slowly [5], especially near the area of the global optimum. Therefore, it is significant to improve the convergence of stochastic algorithms through suitable modifications.
First proposed by Karaboga [6], the artificial bee colony (ABC) algorithm is a gradient-free optimization algorithm that is widely used in multi-variable and multi-objective optimizations [7]. Karaboga's further research [8,9] shows that the ABC algorithm outperforms other stochastic algorithms, such as the genetic algorithm (GA), particle swarm optimization (PSO), and the particle swarm inspired evolutionary algorithm (PS-EA). However, the ABC algorithm is good at exploration but poor at exploitation [10]. Integrating gradient-based search methods into evolutionary algorithms may improve exploitation [11][12][13][14]. In recent years, some novel algorithms based on gradient information have been brought forward to modify the ABC algorithm. For example, a hybrid algorithm called ABC-SQP was developed by Eslami [15]; it combines the artificial bee colony algorithm with sequential quadratic programming. Another algorithm, the gradient-based artificial bee colony algorithm (GdABC), was brought forward by Juan et al. [16] to improve the local search ability. Both ABC-SQP and GdABC use the global exploration capability of the ABC algorithm and the exploitation capability of a gradient-based local search algorithm. However, in these hybrid algorithms, the search equation of the original program is not modified, and the procedures of the random search algorithm and the gradient-based local search algorithm are independent of each other. Thus, the gradient-based parts of the program still tend to fall into local optima.
Other researchers have improved the search strategy of the ABC algorithm by modifying the search equation according to the information of best solutions [17][18][19][20]. For example, Zhu and Kwong [21] found that the global best solution could help improve exploitation. They proposed the Gbest-guided artificial bee colony (GABC) algorithm. Due to the great advantage of the convergence property of GABC, many GABC-based algorithms have been developed [22][23][24][25][26][27]. For example, Guo et al. [22] incorporated the information of the current iteration's best solution into the solution search equation and introduced the global artificial bee colony search algorithm (GABCS). In the original GABC algorithm, the convergence performance is improved by adding a coefficient factor to the solution search equation. However, the selection of the coefficient factor is based only on testing. In the study of Zhu and Kwong, the best value of the coefficient factor they chose is not suitable for all problems. Therefore, the theoretical analysis and the conclusions of the GABC algorithm are inadequate, and it is necessary to make up for this weakness.
We present the idea of establishing a new solution search equation that leaves out the coefficient factor and introduces variable gradients. Since the gradient magnitude characterizes the effect of a variable change on the solution, we also apply it to differentiate the priority of different variables and to enhance the judgment of local optima. In this paper, we present a novel optimization algorithm combining the Gbest-guided artificial bee colony algorithm with variable gradients (the Gbest-guided ABC algorithm with gradient information, GABCG). This novel algorithm perfects the theory of the Gbest-guided artificial bee colony algorithm with a modified solution search equation. It accelerates the convergence rate of the optimization algorithm by giving priority to variables with greater gradients and enhances the judgment of abandoned solutions with variable gradients to avoid premature abandonment of solutions. To validate the advantages of this new hybrid algorithm, a set of 10 well-known benchmark functions is considered to compare the performance of the traditional ABC, GABC, and GABCG algorithms. The experimental results demonstrate that the proposed algorithm is more effective than these two ABC-based algorithms, especially in the latter stages of the evolution. Finally, the characteristics of the improvement measures are discussed, and future research directions are indicated.

Overview of Artificial Bee Colony Algorithm
The ABC algorithm is a gradient-free algorithm inspired by the intelligent foraging behavior of a bee colony [22]. In this algorithm, there are three groups of artificial bees, including employed bees, onlookers and scouts [6]. Employed bees search food around the food source they have found. Onlooker bees, waiting at the dance area, take charge of selecting good food sources from those found by the employed bees. Scouts carry out random searches to expand the scope of exploration.
At the beginning of the algorithm process, the food solutions of employed bees are randomly initialized by Equation (1). These solutions and their nectar amounts are recorded as the first generation of "food sources." Then, the following steps are repeated until the best food source meets the colony's requirements [28]:
1. Every employed bee collects nectar at food sources according to Equation (2) [29].
2. After food sources are selected according to the probability value (pi in Equation (3), which is in direct proportion to the nectar amount of the food source), every artificial onlooker collects nectar at new food sources, which are generated according to Equation (2) [30].
3. Employing greedy selection, the "food sources" in each iteration are updated by the employed bees and onlookers.
4. If the nectar amount of a food source has not increased within "limit" (a control parameter) trials, it will be abandoned, and a scout will search for a new food source according to Equation (1).
When the algorithm is terminated, the optimal food source solution and its nectar amount are outputted.
xij = xj,min + Ψij (xj,max − xj,min) (1)
where i∈{1, 2,…, N}, j∈{1, 2,…, D}, N stands for the number of food sources, and D represents the number of variable elements. xij represents the jth element of the ith source solution, xj,min and xj,max are the lower and upper bounds of the jth variable, and Ψij is a random number in [0, 1].
vij = xij + Φij (xij − xkj) (2)
where vij is the new candidate solution, Φij is a random number in [−1, 1], and k is a randomly chosen index which is different from i.
pi = Fi / ΣNn=1 Fn (3)
where Fi is the fitness value of solution i evaluated by its employed bee, which is proportional to the nectar amount of the food source at position i.
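The workflow above can be condensed into a short Python sketch. This is an illustrative implementation under common ABC conventions, not the authors' code: the function names (init_solution, neighbor, abc_minimize), the Sphere objective, and the fitness transform Fi = 1/(1 + f) for minimization problems are our own assumptions.

```python
import random

def init_solution(D, lo, hi):
    # Equation (1): x_ij = x_j,min + psi_ij * (x_j,max - x_j,min), psi_ij in [0, 1]
    return [lo + random.random() * (hi - lo) for _ in range(D)]

def neighbor(x, population, i):
    # Equation (2): v_ij = x_ij + phi_ij * (x_ij - x_kj), phi_ij in [-1, 1], k != i
    v = list(x)
    j = random.randrange(len(x))
    k = random.choice([n for n in range(len(population)) if n != i])
    v[j] = x[j] + random.uniform(-1.0, 1.0) * (x[j] - population[k][j])
    return v

def fitness(obj_val):
    # Common ABC fitness transform for minimization (an assumption here)
    return 1.0 / (1.0 + obj_val) if obj_val >= 0 else 1.0 + abs(obj_val)

def abc_minimize(f, D=5, N=20, limit=50, max_iter=200, lo=-5.0, hi=5.0):
    pop = [init_solution(D, lo, hi) for _ in range(N)]
    trials = [0] * N
    best = min(pop, key=f)
    for _ in range(max_iter):
        # Step 1: employed-bee phase with greedy selection
        for i in range(N):
            v = neighbor(pop[i], pop, i)
            if f(v) < f(pop[i]):
                pop[i], trials[i] = v, 0
            else:
                trials[i] += 1
        # Step 2: onlooker phase, sources picked with probability p_i (Equation (3))
        fits = [fitness(f(x)) for x in pop]
        total = sum(fits)
        for _ in range(N):
            i = random.choices(range(N), weights=[ft / total for ft in fits])[0]
            v = neighbor(pop[i], pop, i)
            if f(v) < f(pop[i]):
                pop[i], trials[i] = v, 0
            else:
                trials[i] += 1
        # Step 4: scout phase, abandon exhausted sources via Equation (1)
        for i in range(N):
            if trials[i] > limit:
                pop[i], trials[i] = init_solution(D, lo, hi), 0
        best = min(pop + [best], key=f)
    return best

sphere = lambda x: sum(v * v for v in x)
random.seed(0)
best = abc_minimize(sphere)
print("sphere value at best solution:", sphere(best))
```

With these small settings the loop drives the Sphere objective close to its minimum at the origin; real experiments in the paper use a colony of 80 and 5000 generations.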

The Gbest-Guided ABC Algorithm with Gradient Information
As illustrated in the Introduction, Zhu and Kwong [21] incorporated the information of the global best solution into the solution search equation and developed a better version of the ABC algorithm called the Gbest-guided ABC (GABC) algorithm. They modified the ABC algorithm by replacing Equation (2) with Equation (4), thus driving the new candidate solution toward the global best solution. Their study shows that the new solution search equation described by Equation (4) can increase the exploitation of the ABC algorithm, and that the parameter C plays an important role in balancing exploration and exploitation in the search for the optimal solution. For most of their cases, it can be observed that as the value of parameter C increases, the optimization results improve at first but begin to deteriorate beyond a certain value. In their numerical experiments, the GABC algorithm with C = 1.5 shows the best performance among the tested programs.
vij = xij + Φij (xij − xkj) + Ψij C (yj − xij) (4)
where C is a non-negative constant parameter, Ψij is a random number in [0, 1], and yj is the jth element of the global best solution. Actually, in Equation (4), yj represents only the information of the best solution found so far; there may be better solutions that the algorithm has not yet searched out. In Figure 1, yj with a negative gradient value at the jth element is represented by yj′. Taking yj′ as an example, if the value of C in Equation (4) is less than 1, the solutions generated by Equation (4) are closer to a solution on the left side of yj′. Conversely, if the value of C is much larger than 1, the generated solutions are closer to a solution far to the right of yj′. However, the GABC algorithm prefers to search within a small range (region A) just to the right of yj′, because better solutions lie in that area. Therefore, we suggest that 1.5 may be a relatively appropriate value of C for the cases that Zhu and Kwong studied. Although the GABC algorithm can achieve good results when the parameter C is 1.5, it still has some inherent weaknesses. For example, in Figure 1, yj″ has a positive gradient value at the jth element, and xij is less than yj″. The solutions generated by Equation (4) with C = 1.5 are closer to a solution on the right side of yj″, instead of the better solutions on the left. Besides, because Zhu and Kwong only took a few discrete values for C, the optimal parameter value could not be accurately determined. In their calculations for the Rastrigin function, the program with C = 1 obtained better results than the program with C = 1.5. This means that C = 1.5 is not applicable to all optimization problems.
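The Gbest-guided update of Equation (4) for a single element j can be sketched as follows; the function name gabc_candidate is ours, and C = 1.5 is the value Zhu and Kwong found best in their tests.

```python
import random

def gabc_candidate(x_i, x_k, gbest, j, C=1.5):
    # Equation (4): v_ij = x_ij + phi_ij*(x_ij - x_kj) + psi_ij*C*(y_j - x_ij)
    phi = random.uniform(-1.0, 1.0)   # phi_ij in [-1, 1]
    psi = random.random()             # psi_ij in [0, 1]
    v = list(x_i)                     # only element j is perturbed
    v[j] = x_i[j] + phi * (x_i[j] - x_k[j]) + psi * C * (gbest[j] - x_i[j])
    return v

random.seed(0)
print(gabc_candidate([1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [0.0, 0.0, 0.0], 1))
```

Setting C = 0 recovers the plain ABC update of Equation (2), which makes the role of the Gbest term easy to see in isolation.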
A more accurate estimation of the best solution may further improve the exploitation of the algorithm. Since the GABC algorithm has these defects, we propose a new idea for the solution search equation. As shown in Figure 1, better solutions may lie on the left or the right side of the current global best solution, depending on the sign of the gradient. Therefore, we define the jth element ŷj of the estimated best solution as follows: when the gradient value of yj is negative, ŷj = yj − ζij (yj − zj2)/2, where ζij is a random number in [0, 1] and zj2 is the rightmost solution of yj; when the gradient value is positive, ŷj = yj − ζij (yj − zj1)/2, where zj1 is the leftmost solution of yj; and when the gradient value is zero, ŷj = yj. Taking advantage of the estimated best solution, the solution search equation used by employed bees and onlookers is converted into Equation (5):
vij = xij + Φij (xij − xkj) + Ψij (ŷj − xij) (5)
Therefore, the next generations of employed bees and onlookers look for better routes toward the estimated best solution of the current generation. It should be noted that several methods can be employed to calculate the gradients of optimization problems, such as the discrete estimation method (Equation (6), used in this study) and the adjoint method [1,4]:
∂f/∂xj ≈ [f(x1, …, xj + Δxj, …, xD) − f(x1, …, xj, …, xD)]/Δxj (6)
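Measure 1 can be sketched in Python. The forward-difference step for Equation (6) and the reading of zj1/zj2 as the left and right endpoints of the jth variable's range are our assumptions; the three piecewise cases follow the definition in the text.

```python
import random

def grad_j(f, y, j, delta=1e-6):
    # Equation (6)-style discrete estimate of df/dx_j at y (forward difference)
    yp = list(y)
    yp[j] += delta
    return (f(yp) - f(y)) / delta

def estimated_best_element(f, y, j, z1, z2):
    # Measure 1: shift y_j against the gradient sign. z1/z2 are taken here as
    # the leftmost/rightmost values of variable j (an assumption about z_j1,
    # z_j2 in the text).
    g = grad_j(f, y, j)
    zeta = random.random()        # zeta_ij in [0, 1]
    if g < 0:                     # better solutions lie to the right of y_j
        return y[j] - zeta * (y[j] - z2) / 2.0
    if g > 0:                     # better solutions lie to the left of y_j
        return y[j] - zeta * (y[j] - z1) / 2.0
    return y[j]                   # zero gradient: keep y_j unchanged

# Example: minimum of (x - 3)^2 is at x = 3; starting from y = 0 the gradient
# is negative, so the estimated best element moves to the right, toward 3.
f = lambda x: (x[0] - 3.0) ** 2
random.seed(0)
print(estimated_best_element(f, [0.0], 0, -5.0, 5.0))
```

Note that for minimization a negative gradient at yj means the objective still decreases to the right, which is exactly the case the first branch handles.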
The gradient values of the variables contain two kinds of information: sign and magnitude. The sign can be used to judge the direction of optimization, and the magnitude reflects the effect of variable changes on the objective function. With the help of the sign of the gradient, the above measure (Measure 1) ensures that the random search moves toward a solution that is better than the current global best solution. In addition, the gradient magnitude is applied in the following two other improvement measures.
Measure 2: In the ABC algorithm, an onlooker chooses a food source according to a probability value determined by fitness. Similarly, in our newly proposed algorithm, when the global best solution is selected by an employed bee (or an onlooker), the variable element to be changed is determined by a probability value associated with the gradient. The detailed operations are as follows: randomly select a variable element yk and a random number q in [0, 1]; if pgk in Equation (7) is greater than q, this element will be used for optimization; otherwise, random selection is repeated until the condition is met.
Measure 3: Since the gradient values of solutions near a vertex are relatively small, an abandoned solution can be judged not only by the control parameter (the limit on the number of trials) but also by the gradient magnitude. In our study, a scout bee performs a random search only when the fitness of a food source has not improved for more than the limited number of trials and all variable gradients of this source are less than 10 times the maximum variable gradient of the global best solution.
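Measures 2 and 3 can be sketched together. The normalization in select_element (dividing each gradient magnitude by the largest one, as suggested by "NormSen" in the pseudocode of Table 1) and the function names are our assumptions; the factor of 10 in should_abandon follows the text.

```python
import random

def select_element(grads):
    # Measure 2: roulette-style acceptance. Elements with a larger gradient
    # magnitude are accepted with higher probability; p_gk (Equation (7)) is
    # assumed here to be |g_k| / max_j |g_j| ("NormSen" in Table 1).
    gmax = max(abs(g) for g in grads)
    while True:
        k = random.randrange(len(grads))
        if gmax == 0 or abs(grads[k]) / gmax > random.random():
            return k

def should_abandon(trials, limit, grads, gbest_grads, factor=10.0):
    # Measure 3: abandon only when the trial counter is exhausted AND every
    # variable gradient of the source is below `factor` times the largest
    # gradient of the current global best solution.
    if trials <= limit:
        return False
    gref = factor * max(abs(g) for g in gbest_grads)
    return all(abs(g) < gref for g in grads)

# A variable with zero gradient is never picked; a large remaining gradient
# blocks abandonment even after the trial limit is exceeded.
print(select_element([0.0, 5.0]))
print(should_abandon(60, 50, [0.1, 0.2], [1.0, 2.0]))
```

Coupling the two checks this way is what prevents the scout bee from discarding a source that still has a steep descent direction available.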
Based on the above explanation of the three improvement measures, the pseudo code of the GABCG algorithm is given in Table 1. Since the innermost loop body of the new algorithm has not changed, the complexity of each iteration does not greatly increase. The selection logic in Table 1 reads:
if GlobalMin == Fit(i):
    Randomly select a variable element j
    if rand < NormSen(j):
        Produce a new candidate solution by Equation (5)
    else:
        Reselect a variable element
    end
else:
    Produce a new candidate solution by Equation (5)
end
else:
    Reselect a food source
The whole search strategy adopted by the Gbest-guided ABC algorithm with gradient information (GABCG) is shown in Figure 2. In Measure 1, the sign of the gradient is applied to find a better solution than the current global best solution, so the exploitation is increased further in our proposed algorithm. In a standard ABC algorithm, the preference for a food source by an onlooker bee depends on the nectar amount of that food source. Inspired by this, the preference for a variable should depend on the gradient magnitude of that variable. Thus, through Measure 2, gradient values are used to evaluate the updating probability of variables, and the variables with a larger gradient have a greater chance of being selected for update. Besides, in the original algorithm, the scouts control the exploration process only through the predetermined number of trials "limit." In our algorithm, by contrast, the gradient magnitude is used as another evaluation criterion. Measure 3 strengthens the judgment of the abandoned solution by the gradient limit, so that it is effective for accurately determining whether the search is trapped in a local optimum.

Benchmark Functions and Parameter Settings
In order to verify the validity of the proposed GABCG algorithm, we applied it to function minimization and maximization problems. Ten functions that are commonly used for optimization tests were chosen to confirm the effectiveness of the GABCG. ƒ1-ƒ10, shown in Table 2, denote the Rosenbrock, Sphere, Schaffer, Rastrigin, Griewank, Ackley, Elliptic, SumSquares, Quartic, and Himmelblau functions, respectively [17]. Different letters in the table indicate different characteristics: U stands for unimodal and M for multimodal; S stands for separable and N for non-separable. In this study, the single-peak Rosenbrock, Sphere, Elliptic, SumSquares, and Quartic functions were used to verify the convergence accuracy and rate of the algorithm. Since many physical problems have multiple peaks, the optimization search may fall into a local optimum on the path towards the global optimum [5], so it is very important to test an optimization algorithm with multimodal functions. Therefore, the multi-peak Himmelblau, Schaffer, Rastrigin, Griewank, and Ackley functions were used to verify the global optimization capability of the algorithm. In our experiments, the main control parameters are set as follows: the colony size is 80 (40 employed bees, 40 onlookers, and one scout bee); the maximum number of generations is 5000; the "limit" for abandoning a food source is 50; and the parameter C of the GABC algorithm is set to 1.5. The Schaffer and Rosenbrock functions were tested with 10 and 30 variable elements, the Himmelblau function was tested with 100 and 200 variable elements, and the other functions were tested with 30 and 60 variable elements. Each experiment was repeated 30 times with the ABC, GABC, and GABCG algorithms.
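Two of the listed benchmarks can be written out under their usual definitions (Table 2 itself is not reproduced here): Sphere as a unimodal separable case and Rastrigin as a multimodal separable case, both with global minimum 0 at the origin.

```python
import math

def sphere(x):
    # Unimodal, separable (U/S): f(x) = sum(x_i^2), global minimum f(0) = 0
    return sum(v * v for v in x)

def rastrigin(x):
    # Multimodal, separable (M/S): f(x) = 10*D + sum(x_i^2 - 10*cos(2*pi*x_i)),
    # global minimum f(0) = 0, with many regularly spaced local minima
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

print(sphere([0, 0, 0]), rastrigin([0.0] * 30))  # both evaluate to 0 at the origin
```

The cosine term in Rastrigin is what creates the grid of local minima that traps purely gradient-driven searches, which is why this function features so prominently in the multimodal comparisons below.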

Performance Comparison between ABC, GABC, and GABCG
Comparing the proposed algorithm with the other two algorithms on 10 benchmark functions, Table 3 shows the mean and standard deviation of the results for each benchmark function. The average convergence curves of representative cases are presented in Figure 3. It can be seen that the program using the GABCG algorithm obtains better results on all 10 functions than the ABC algorithm and the GABC algorithm. As seen from the results, the GABCG algorithm has a great advantage in convergence property and applicability compared to the ABC and GABC algorithms. For the unimodal functions in Figure 3(1)-(5), the performance of the GABCG algorithm is superior to both the ABC algorithm and the GABC algorithm, which verifies the good convergence accuracy and rate of the GABCG algorithm. In Figure 3(6)-(10), the advantage of the GABCG algorithm in the accuracy of results for multimodal functions is even more obvious, which attests to the good global optimization capability of the GABCG algorithm. From these iterative curves, it can be seen that the GABCG algorithm shows better convergence performance than the GABC algorithm in the latter iterations.
Comparing the performance of the GABCG algorithm in unimodal functions and multimodal functions, it can be found that this algorithm is superior in solving most multi-peak problems. Especially on some complex problems, such as Rastrigin and Griewank functions, the results of both the ABC and GABC algorithms appear to be stagnant in the middle phases of evolution. However, the GABCG algorithm can keep finding better solutions during almost the entire search. This indicates that the GABCG algorithm successfully reduces unnecessary computation costs and determines the exact optimization paths by gradient acceleration.

Effects of the Colony Size on the Performance of GABCG
In order to analyze the effect of colony size on convergence speed and accuracy, the algorithm programs were run with different colony sizes. At first, we studied the three algorithms with colony sizes of 80, 160, and 240. The numbers of employed bees and onlookers are each half of the colony size, and the number of scout bees is one. In this test, the programs were executed on five benchmark functions. The Schaffer and Rosenbrock functions were tested with 30 variable elements, and the other functions were tested with 60 variable elements. Each experiment was repeated 30 times with the ABC, GABC, and GABCG algorithms.
The mean and standard deviations of the optimization results are presented in Table 4. It can be found that as the colony size increases, the accuracy of the optimization results becomes better, and our GABCG algorithm always outperforms the ABC and GABC algorithms. To make the effects on GABCG clear, we present the convergence curves of the GABCG algorithm with different colony sizes in Figure 4. As can be seen, there is no significant improvement from 80 to 240. However, in these tests, the running time of the program is almost proportional to the colony size. Therefore, in practical applications of the algorithms, an excessively large population will consume a large amount of computing resources without greatly improving the accuracy of the results. This is also found in Karaboga's study [3] on the traditional ABC algorithm, which found that a colony size of 50-100 can provide an acceptable convergence speed for the search. This gives us the confidence to use a relatively small colony size for efficiency. Mean and SD denote the average and standard deviation of the optimization results over 30 runs, and the best mean results are in bold.

Effects of Each Improvement Measure on the Performance of GABCG
To analyze each improvement measure separately, we modified the GABC algorithm with each measure individually. The GABC with Measure 1 is called GABCG1, the GABC with Measure 2 is called GABCG2, and the GABC with Measure 3 is called GABCG3. We compared the convergence performance of the different GABCGs on the Elliptic and Rastrigin functions. The convergence curves are shown in Figure 5. It can be found that all these measures have a positive effect on convergence performance. For both functions, the dominant interval of the GABCG1 algorithm is located in the middle stage of the evolution. For the multimodal Rastrigin function, the GABCG2 algorithm performs well in the first half of the evolution, but its performance becomes worse afterward, even worse than that of the original GABC algorithm. For the unimodal Elliptic function, the GABCG3 algorithm has a weak advantage in the middle stage but a significant advantage in the latter stage. This illustrates that these improvement measures have different characteristics for solving different problems. In addition, it is important to note that the GABCG3 algorithm greatly outperforms GABC in the latter stage of both curves, which means that premature abandonment of solutions by the scout bees is not conducive to the optimization of the Elliptic and Rastrigin functions. This may be an important reason why the GABCG algorithm is particularly advantageous for some problems.

The Gradient Effect on ABC/Best/1 and ABC/Best/2
In addition to the GABC algorithm, Gao et al. [18] also proposed the ABC/best/1 and ABC/best/2 strategies, which use the best solutions in the current population to direct the movement of the exploration. In their search equations, the indices k, l, and m are different from i, C is a non-negative constant parameter, Ψij is a random number in [0, 1], and yj is the jth element of the global best solution. In order to clarify the effect of gradient information on these two algorithms, ABC/best/1G and ABC/best/2G are derived by applying the improvement strategy of Section 3. In Table 5, ABC/best/1 is compared with ABC/best/1G, and ABC/best/2 is compared with ABC/best/2G on the benchmark functions. As seen from the left side of Table 5, the ABC/best/1G algorithm shows better performance than the ABC/best/1 algorithm on the Rosenbrock and Griewank functions; for the other functions, the optimization results of the two algorithms are not significantly different. Therefore, the gradient-based measures have only a small impact on the performance of the ABC/best/1 algorithm. From the right side of Table 5, the ABC/best/2G algorithm does not obtain better optimization results than the ABC/best/2 algorithm on all functions, which means the improvement does not work for the ABC/best/2 algorithm. The search strategies of the GABC algorithm and the ABC/best algorithms are different: the GABC algorithm searches for the new candidate solution toward the global best solution, whereas the ABC/best algorithms search for the new candidate solution around the global best solution. In fact, the original search equations of the ABC/best algorithms always look for better results near the global best solution. Therefore, the advantages of this study's Measure 1 are not fully applicable to them.
In addition, it can be clearly seen from the calculation results that the improvement on ABC/best/1 is significantly better than that on ABC/best/2.

Discussion
It can be observed from the trends in Figure 3 that the convergence rate of the GABCG algorithm is similar to that of the GABC algorithm in the first few steps but is slightly faster in the subsequent steps. In the early stages of the search, the sample points are not yet fully distributed throughout the search space, so the exploration of the algorithm is more important than the exploitation; thus, the effect of gradient acceleration in the new algorithm is small, and its convergence rate is not better than that of the GABC algorithm. It should be noted that the gradient calculation increases the computational load of the program. However, in our new algorithm, only the gradients of the current global best solutions and the abandoned solutions are necessary, so it does not have a significant impact on the computational complexity of the whole algorithm. Since the advantage of gradient acceleration is not obvious in the early stage of a random search, we recommend leaving the original algorithm unchanged in some steps at the beginning of the process to further reduce computational costs.
From Zhu and Kwong's study [21], it can be concluded that driving the new candidate solution toward the global best solution can increase the exploitation of the ABC algorithm. In this research, since the gradient information not only provides a more accurate estimate of the global best solution for the GABC algorithm but also makes the sample screening more efficient, the exploitation is further enhanced. The findings of this paper can also be applied to other GABC-based algorithms, such as SABC-GB [5], GABCS [22], MPGABC (a GABC variant combining a novel search strategy with a probability model) [26], WGABC (a linear weighted GABC algorithm) [27], and gdbABC (an improved GABC algorithm with gradient-based information) [31]. It is interesting that gdbABC controls the direction of search movement by a Newton-Raphson formula with the gradient information; to avoid premature convergence, it uses a distribution-based strategy in the case of several local suboptimal solutions. From Qiu's research [31], it can be found that gdbABC does not perform well in the optimization of the Rosenbrock and Griewank functions. Therefore, in order to enhance the applicability of gradient-based algorithms, it is necessary to balance exploitation and exploration.
In addition to gdbABC [31], ABC-SQP [15] and GdABC [16] also apply gradient information through Newton's method. As a result of searching along the gradient direction, they bring new candidate solutions closer to the optimal value. However, since Newton's method cannot skip over local optima [32], these gradient-based algorithms may perform poorly on complex multimodal problems. The GABCG algorithm does not change the original search strategy completely but estimates the global best solution more accurately and enhances the abandonment judgment with gradient information. Therefore, our improvements constitute a constrained refinement of the original algorithm, improving its convergence without losing its applicability. Inspired by this idea, our future work ought to focus on optimizing other efficient and adaptable algorithms with gradient information.

Conclusions
In this paper, we have proposed an improved Gbest-guided artificial bee colony algorithm with gradient information, called the GABCG algorithm. The original algorithm is modified by variable gradients in the following three aspects: (1) the next generations of employed bees and onlookers look for better routes toward the estimated optimal solution based on gradient information; (2) the variables with greater gradient values are given priority for improvement; (3) a gradient threshold on the variable elements is added to the judgment of abandoned solutions. From the optimization results on the benchmark functions, it is concluded that the proposed algorithm is more effective than the traditional ABC and GABC algorithms, especially in the latter stages of the evolution. For some complex multimodal problems in particular, the GABCG algorithm can keep finding better solutions during almost the entire search, so that it reduces unnecessary computation costs.
Through testing the programs with different colony sizes, it is found that increasing the colony size of artificial bees can slightly improve calculation accuracy. Therefore, both accuracy and computation cost should be taken into consideration when choosing an appropriate colony size. To clarify the contribution of the three measures to the optimization process, the modified algorithms with each individual improvement measure were compared. The experimental results show that the first two measures play a significant role in the middle of the evolution, and the last measure has a great impact on the latter stages; all of these measures improve convergence quality. Moreover, the research idea of gradient acceleration was introduced into the ABC/best algorithms. Comparative experiments show that the effect of gradients on them is small, but the improvement of ABC/best/1 is better than that of ABC/best/2. This means that the performance of the ABC/best algorithms is less related to an accurate estimation of the optimal solution. Finally, since the addition of gradient information can enhance search efficiency while retaining the original characteristics of an algorithm, it is desirable to improve more swarm intelligence methods by considering gradients.

Conflicts of Interest:
The authors declare that they have no conflicts of interest.