Global Evolution Commended by Localized Search for Unconstrained Single Objective Optimization

Abstract: Differential Evolution (DE) is one of the prevailing search techniques for solving global optimization problems. However, it is weak at localized search, since it is based on mutation strategies that take large steps while searching a local area. Thus, DE is not a good option for solving local optimization problems. On the other hand, traditional local search (LS) methods, such as Steepest Descent and Davidon–Fletcher–Powell (DFP), are good at local searching but poor at searching global regions. Hence, motivated by the shortcomings of existing search techniques, we propose a hybrid of a DE version, reflected adaptive differential evolution with two external archives (RJADE/TA), with DFP to benefit from both search techniques and to alleviate their search disadvantages. In the novel hybrid design, the initial population is explored by the global optimizer, RJADE/TA, and then a few comparatively best solutions are shifted to the archive and refined there by DFP. Thus, both kinds of searches, global and local, are incorporated alternately. Furthermore, a population minimization approach is also proposed: at each call of DFP, the population is decreased, so the algorithm starts with a maximum population and ends with a minimum. The proposed technique was tested on a suite of 28 complex functions selected from the literature to evaluate its merit. The results demonstrate that complementing DE with LS can further enhance the performance of RJADE/TA.

Traditional search approaches, such as the Nelder-Mead algorithm, Steepest Descent and DFP [44], may be hybridized with DE to improve its search capability. Algorithms that embed LS in a global search to enhance solution quality are called Memetic Algorithms (MAs) [31,45]. Some recent MAs can be found in [1,31]. Very recently, Broyden-Fletcher-Goldfarb-Shanno LS was merged with an adaptive DE version, JADE [46], which produced the MA Hybridization of Adaptive Differential Evolution with an Expensive Local Search Method [47]. In the majority of the established designs, LS is applied to the overall best solutions, while in our design it is applied to the migrated elements of the archive. In addition, the population is adaptively decreased.
In this work, we propose a hybrid algorithm that combines DFP [44,48,49] with a recently developed algorithm, RJADE/TA [50], to enhance RJADE/TA's performance in local regions. The main idea is to apply DFP to the elements that are shifted to the archive and to record both solutions, the previously brought-forward one and the new potential one, to reduce the chance of losing the globally best solution. For this purpose, firstly, DFP is applied to the archived information. Secondly, a decreasing population mechanism is suggested. The new algorithm is denoted by RJADE/TA-ADP-LS.
The structure of this work is as follows. Section 2 presents the primary DE, DFP, and RJADE/TA methods. Section 3 reviews the literature. In Section 4, the suggested hybrid algorithm is outlined. Section 5 is devoted to the validation of the results achieved by RJADE/TA-ADP-LS. Finally, the conclusions are summarized in Section 6.

Primary DE, DFP, and RJADE/TA
We reviewed traditional DE and JADE in detail in our previous works [47,50]. Here, we briefly review primary DE, DFP and RJADE/TA for ready reference.

Primary DE
DE [3,4] starts with a random population in the given search region. After initialization, a mutation strategy is applied: three different individuals are randomly selected from the population, and the scaled difference of two of them is added to the third to produce a mutant vector. Following mutation, the mutant and the target vectors are combined through a crossover operator to produce a trial vector. Finally, the target and trial vectors are compared based on a fitness function, and the better one is selected for the next generation (see Lines 7-20 of Algorithm 1).
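The three steps described above can be sketched as one DE generation. This is a minimal illustration, assuming the common rand/1/bin variant for minimization; the function name `de_generation`, the default values F = 0.5 and CR = 0.9, and the `sphere` test function are assumptions for illustration and are not taken from Algorithm 1.

```python
import random


def sphere(x):
    """Illustrative objective (assumption): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


def de_generation(population, fitness, f_obj, F=0.5, CR=0.9, rng=random):
    """One generation of classic DE (rand/1/bin), minimization.

    population: list of vectors (lists of floats); needs at least 4 members.
    fitness:    objective value of each member.
    """
    n_pop = len(population)
    dim = len(population[0])
    new_pop, new_fit = [], []
    for i in range(n_pop):
        # Mutation: three distinct individuals, all different from target i.
        r1, r2, r3 = rng.sample([k for k in range(n_pop) if k != i], 3)
        mutant = [population[r1][j] + F * (population[r2][j] - population[r3][j])
                  for j in range(dim)]
        # Binomial crossover: j_rand guarantees one component from the mutant.
        j_rand = rng.randrange(dim)
        trial = [mutant[j] if (j == j_rand or rng.random() < CR) else population[i][j]
                 for j in range(dim)]
        # Selection: keep whichever of target and trial is better.
        f_trial = f_obj(trial)
        if f_trial <= fitness[i]:
            new_pop.append(trial)
            new_fit.append(f_trial)
        else:
            new_pop.append(population[i])
            new_fit.append(fitness[i])
    return new_pop, new_fit
```

Because selection is greedy, no individual's objective value can get worse from one generation to the next, which is the property the hybrid relies on when it later picks the current best solution.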
Algorithm 1 (RJADE/TA; only fragments of the listing survive in this text):
1: To form the primary population P p, produce N [pop] vectors uniformly and randomly;
Produce the mutant vector w;
Produce the trial vector q {y} [i,j] as follows.
13: if i < i rand or rand(0, 1) < CR j then
15: end if
If size of M [first] > N [pop], delete extra solutions from M [first] randomly;
25: Update M [sec] as follows.
26: if y = κ then

[...] by itself is migrated to the second archive M [sec]. RJADE/TA maintains two archives, termed M [first] and M [sec] for convenience. After half of the available resources (MaxFEs) are utilized, the first update of the second archive, M [sec], is made. Afterwards, M [sec] is updated adaptively at regular intervals of generations (see Algorithm 1).
The overall best candidates are transferred to M [sec], whereas M [first] records the recently explored poor solutions. The size of M [first] is fixed, equal to the population size N [pop], while the size of M [sec] may exceed N [pop]. As M [sec] keeps information on all best solutions found, no solution is deleted from it. M [sec] records only one solution of the current iteration, which may be a child or a parent, whereas M [first] keeps a history of more than one inferior "parent solution" only. M [first] is updated at every iteration, and M [sec], initialized as ∅, is updated adaptively with a gap of κ iterations. The recorded history of M [first] is utilized in reproduction later on. In contrast, in M [sec], the recorded best individual is reflected with a new solution, which is then sent to the population. Once a candidate solution is posted to M [sec], it remains passive during the whole optimization. When the search procedure terminates, the recorded information contributes to the selection of the best candidate solution.
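The bookkeeping of the two archives can be illustrated with a small sketch. It assumes only what the text states: M [first] is capped at N [pop] with random deletion of extras, M [sec] is never pruned, and M [sec] is first updated once half of MaxFEs is spent and then every κ generations. The class name `TwoArchives`, the helper `sec_update_due` and their signatures are hypothetical, and the reflection step of RJADE/TA is deliberately left out.

```python
import random


class TwoArchives:
    """Hypothetical container for the archives M_first and M_sec."""

    def __init__(self, n_pop, rng=None):
        self.n_pop = n_pop
        self.m_first = []  # recently replaced (inferior) parents, capped at n_pop
        self.m_sec = []    # best solutions recorded so far, never pruned
        self.rng = rng or random.Random()

    def update_first(self, inferior_parents):
        """Add inferior parents; delete random extras above the cap."""
        self.m_first.extend(inferior_parents)
        while len(self.m_first) > self.n_pop:
            self.m_first.pop(self.rng.randrange(len(self.m_first)))

    def update_sec(self, best_solution, best_value):
        """Record one best solution (parent or child) of the current iteration."""
        self.m_sec.append((list(best_solution), best_value))


def sec_update_due(fes_used, max_fes, gens_since_update, kappa):
    """One reading of the schedule: active after half the budget, then every kappa generations."""
    return fes_used >= max_fes / 2 and gens_since_update >= kappa
```

Keeping M [sec] append-only is what allows the final answer to be chosen from every best solution ever recorded, at the cost of an archive that can grow beyond N [pop].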

Davidon-Fletcher-Powell (DFP) Method
The DFP method is a variable metric method, which was first proposed by Davidon [51] and then modified by Powell and Fletcher [52]. It belongs to the class of gradient-dependent LS methods. If a suitable line search is used, the DFP method is assured to converge (minimization) [49]. It calculates the difference between the old and new points, as given in Equation (1). Then, it finds the difference of the gradients at these points, as calculated in Equation (2). It then updates the Hessian matrix H as presented in Equation (3). Afterwards, it locates the optimal search direction s [j] with the help of the Hessian matrix information, as calculated in Equation (4). Finally, the output solution w [j+1] is computed by Equation (5), where α [j] is calculated by a line search method; the golden section search method is used in this work.
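The steps listed above can be sketched in code using the textbook form of the DFP update, in which H approximates the inverse Hessian. Equations (1)-(5) are not reproduced in this text, so the sketch below is a standard-form illustration rather than the authors' exact formulation. The names `delta` and `gamma`, the central-difference gradient, the identity starting matrix and the step bracket [0, 1] for the golden section search are all assumptions.

```python
import math


def golden_section(phi, a=0.0, b=1.0, tol=1e-6):
    """Minimize a unimodal 1-D function phi on [a, b] by golden section search."""
    rho = (math.sqrt(5) - 1) / 2
    while abs(b - a) > tol:
        c = b - rho * (b - a)
        d = a + rho * (b - a)
        if phi(c) < phi(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


def num_grad(f, w, h=1e-6):
    """Central-difference gradient (assumption; an analytic gradient works too)."""
    g = []
    for k in range(len(w)):
        wp, wm = list(w), list(w)
        wp[k] += h
        wm[k] -= h
        g.append((f(wp) - f(wm)) / (2 * h))
    return g


def dfp(f, w0, iters=2, alpha_max=1.0):
    """Run a few DFP iterations from w0 and return the refined point."""
    n = len(w0)
    w = list(w0)
    H = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    g = num_grad(f, w)
    for _ in range(iters):
        # Search direction s = -H g, then step length by line search.
        s = [-sum(H[r][c] * g[c] for c in range(n)) for r in range(n)]
        alpha = golden_section(
            lambda a: f([w[k] + a * s[k] for k in range(n)]), 0.0, alpha_max)
        w_new = [w[k] + alpha * s[k] for k in range(n)]
        g_new = num_grad(f, w_new)
        delta = [w_new[k] - w[k] for k in range(n)]  # difference of points
        gamma = [g_new[k] - g[k] for k in range(n)]  # difference of gradients
        Hg = [sum(H[r][c] * gamma[c] for c in range(n)) for r in range(n)]
        dg = sum(delta[k] * gamma[k] for k in range(n))
        gHg = sum(gamma[k] * Hg[k] for k in range(n))
        if abs(dg) > 1e-12 and abs(gHg) > 1e-12:
            # H <- H + delta delta^T / (delta^T gamma) - (H gamma)(H gamma)^T / (gamma^T H gamma)
            for r in range(n):
                for c in range(n):
                    H[r][c] += delta[r] * delta[c] / dg - Hg[r] * Hg[c] / gHg
        w, g = w_new, g_new
    return w
```

On a two-dimensional convex quadratic with exact line searches, two iterations of this scheme reach the minimizer, which is consistent with the small iteration count used for DFP in this work.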

Related Work
To fix the above-mentioned weaknesses of DE, many researchers have merged various LS techniques with DE. Nelder-Mead LS was hybridized with DE [53] to improve the local exploitation of DE. Recently, two new LS strategies were proposed and hybridized iteratively with DE in [1,31]. These hybrid designs show performance improvement over the algorithms in comparison. Two LS strategies, Trigonometric and Interpolated, were inserted in DE to enhance its poor exploration. Two other LS techniques were merged with DE along with a restart strategy to improve its global exploration [54]; its reported results are better than those of the other algorithms compared. Furthermore, alopex-based LS was merged with DE [55] to improve its population diversity. In another experiment, DE's slow convergence was improved by combining it with orthogonal design LS [56]. To avert local optima in DE, random LS was hybridized with it [57]. On the other hand, some researchers borrowed DE's mutation and crossover for use in traditional LS methods (see, e.g., [58,59]).
To the best of our knowledge, none of the algorithms reviewed in this section integrates DFP into DE's framework. Further, the work proposed here maintains two archives: the first one stores inferior solutions, and the second one keeps information on the best solutions migrated to it by the global search. Furthermore, the second archive improves the solution quality further by applying DFP there. Hence, our proposed work has the advantage that the second archive keeps complete information on the solution before and after LS. This way, any good solution found is not lost. It also adopts a population decreasing mechanism.

Developed Algorithm
As discussed in the literature review, LS techniques, due to their demerits, should not be used alone to solve optimization problems [2]. Global evolution techniques have a high capability for global search, but they can get stuck in local regions and cannot fine-tune the solution at hand. Thus, motivated by the above issues of global and local techniques, we hybridize a global optimizer, RJADE/TA, with DFP to enhance the convergence in both regions. The new design is named RJADE/TA-ADP-LS. In the current work, we specifically handle unconstrained, nonlinear, continuous, single-objective optimization problems.

RJADE/TA-ADP-LS
The initial population is evolved globally by RJADE/TA [50] until λ% of the function evaluations; that is, after RJADE/TA's iterative mutation, crossover, selection and M [first] process, as shown in Algorithm 1, the population is sorted and the current best solution is shifted to the archive M [sec]. This best solution may be a parent or a child solution. DFP is applied to the shifted elements for w iterations. After the application of DFP, a new improved solution w [k] (i,new) is produced from an old migrant. Then, the previously explored best solution and this new solution are posted to archive M [sec]. Unlike our previously proposed archive M [sec] in RJADE/TA, where the archive keeps a record of the best solutions only and no LS is applied, M [sec] in this method maintains information on both solutions, i.e., the migrated best solution and its improved version, if any, after the application of DFP.
The archive M [sec] is updated at regular intervals of κ generations (20 here). The migrated solutions and those explored by DFP remain there during the entire evolution process. When the evolution process completes, the overall best candidate is selected from P p ∪ M [sec]. The novelty of RJADE/TA-ADP-LS is that it applies DFP to the archived solutions only, unlike all hybrid designs reviewed in Section 3.
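The archive update described above can be sketched as follows. The local search is passed in as a callable so that the sketch stays self-contained; in the proposed method that role is played by DFP. The function names, the tuple layout `(solution, value)` of archive entries, and the stand-in refinement `halve` are illustrative assumptions. Storing the refined point only when it improves on the migrant is one reading of "its improved version, if any".

```python
def sphere(x):
    """Illustrative objective (assumption): sum of squares."""
    return sum(v * v for v in x)


def halve(f_obj, x):
    """Stand-in local search for the demo only; the paper applies DFP here."""
    return [v / 2 for v in x]


def migrate_and_refine(population, fitness, f_obj, local_search, m_sec):
    """Shift the current best to M_sec, refine it, and archive both versions.

    Returns the index of the migrated solution in the population.
    """
    i_best = min(range(len(population)), key=fitness.__getitem__)
    best, f_best = population[i_best], fitness[i_best]
    m_sec.append((list(best), f_best))            # migrated best solution
    refined = local_search(f_obj, best)
    f_refined = f_obj(refined)
    if f_refined < f_best:                        # improved version, if any
        m_sec.append((list(refined), f_refined))
    return i_best


def overall_best(population, fitness, m_sec):
    """Final answer: best candidate in the union of population and M_sec."""
    candidates = list(zip(population, fitness)) + m_sec
    return min(candidates, key=lambda pair: pair[1])
```

For example, with `pop = [[2.0, 2.0], [1.0, 0.0], [3.0, 1.0]]` and the stand-in `halve`, the migrant `[1.0, 0.0]` and its refinement `[0.5, 0.0]` both end up in the archive, and `overall_best` returns the refined point.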
In the proposed hybrid mechanism, we apply DFP to the migrated best solution to obtain its improved form, but without reflection, as displayed in the flowchart given in Figure 1, unlike in our recently proposed work [60]. Moreover, in this model, we propose an adaptively decreasing population (ADP) mechanism, which differs from the fixed population approach of Khanum et al. [60]. We refer to this new hybrid as RJADE/TA-ADP-LS throughout this work. The idea of RJADE/TA-ADP-LS is novel in proposing the ADP approach because, in the literature, the majority of evolutionary algorithms (as reviewed in Section 3) maintain a fixed population throughout the search process.
The ADP approach (Algorithm 2, Lines 6-8) is implemented as follows. Every time M [sec] is updated, the migrated element is removed from the current population P p (see Equation (6)), and the population is decreased by one. Thus, after each break of κ generations, r (= the number of times the κ breaks occur) solutions are removed from the population, and the population size is updated to N [pop] − r, as demonstrated on Line 11 of Algorithm 2. Furthermore, the function values are updated accordingly (see Equations (7) and (8)). In the ADP approach, the algorithm begins with a maximum population and terminates with a minimum population.

Algorithm 2 (only the closing lines survive in this text):
9: end if
10: Terminate the iteration;
11: Repeat the process r number of times and update N [pop] = N [pop] − r.
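The population reduction can be illustrated with a short sketch: the migrated solution and its function value are removed together, so the population shrinks by one per archive update. The function names and the lower bound `min_pop = 4` (DE mutation needs a target plus three distinct other individuals) are assumptions and do not come from Algorithm 2.

```python
def adp_remove_migrant(population, fitness, i_best, min_pop=4):
    """Drop the migrated individual and its function value from the population.

    min_pop is an assumed safeguard: below it the population is left unchanged.
    """
    if len(population) <= min_pop:
        return list(population), list(fitness)
    new_pop = population[:i_best] + population[i_best + 1:]
    new_fit = fitness[:i_best] + fitness[i_best + 1:]
    return new_pop, new_fit


def population_size_after(n_pop0, n_updates, r=1, min_pop=4):
    """Population size after n_updates archive updates, removing r per update."""
    return max(min_pop, n_pop0 - r * n_updates)
```

With the settings used later in the experiments (an initial population of 100 and one removal per update), three archive updates would leave 97 individuals.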

Validation of Results
In this section, first we briefly illustrate the five algorithms used for comparison and then the experimental results are presented.

Global Search Algorithms in Comparison
Among the five algorithms for comparison, the first two, RJADE/TA and RJADE/TA-LS, are our recently proposed hybrid algorithms, while the remaining three, jDE, jDEsoo and jDErpo, are non-hybrid, but adaptive and popular DE variants.

RJADE/TA
RJADE/TA [50], similar to RJADE/TA-ADP-LS, utilizes two archives for information. One of the archives stores inferior solutions, while the other keeps a record of superior solutions. However, in RJADE/TA-ADP-LS, the second archive stores elite solutions, which are then improved by DFP. Further details of RJADE/TA can be found in Section 2.2.

RJADE/TA-LS
RJADE/TA-LS [60] is a very recently proposed hybrid of global and local search. However, it differs from RJADE/TA-ADP-LS in that it utilizes a reflection mechanism and a fixed population, while RJADE/TA-ADP-LS uses DFP as LS without reflection, together with a population decreasing approach.

jDE
jDE [61] is an adaptive version of DE, which is based on self-adaptation of the control parameters F and CR. In jDE, the parameters F and CR keep changing during the evolution process, while the population size N [pop] is kept unchanged. Every solution in jDE has its own F and CR values. Better individuals are produced by better values of F and CR, and such parameter values propagate to upcoming generations of jDE. Because of its unique mechanism and simplicity, jDE has gained popularity among researchers in the field of optimization. Since its introduction, it has been widely used as a baseline for comparing new algorithms.
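The per-individual self-adaptation in jDE can be sketched as below. The update rule and the constants (τ1 = τ2 = 0.1, F in [0.1, 1.0]) follow the jDE formulation as commonly stated in the literature; they are not given in this text and should be treated as assumptions here, as should the function name `jde_adapt`.

```python
import random


def jde_adapt(F_i, CR_i, rng=random, tau1=0.1, tau2=0.1, F_l=0.1, F_u=0.9):
    """Return the (F, CR) pair an individual uses for its next trial vector.

    With probability tau1 a new F is drawn from [F_l, F_l + F_u); with
    probability tau2 a new CR is drawn from [0, 1). Otherwise the inherited
    values are kept.
    """
    F_new = F_l + rng.random() * F_u if rng.random() < tau1 else F_i
    CR_new = rng.random() if rng.random() < tau2 else CR_i
    return F_new, CR_new
```

The new pair stays attached to the trial vector, so it reaches the next generation only if that trial wins the selection step; this is how good parameter values are carried forward.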

jDEsoo and jDErpo
jDEsoo [62] is a version of DE that deals with single-objective optimization. jDEsoo subdivides the population and implements more than one DE strategy. To enhance population diversity, it removes from the population those individuals that have remained unchanged over the last few generations. It was primarily developed for the CEC 2013 competition.
jDErpo [61] is an improvement of jDE. It is based on the following mechanisms. Firstly, it incorporates two mutation strategies, unlike jDE, DE and RJADE/TA. Secondly, it uses an adaptively increasing strategy for adjusting the lower bounds of the control parameters. Thirdly, it utilizes two pairs of control parameters for the two different mutation strategies, in contrast to the single pair of parameters used in jDE, classic DE and RJADE/TA. jDErpo was also specially designed for solving the CEC 2013 competition problems.

Parameter Settings/Termination Criteria
Experiments were performed on the 28 benchmark test problems of CEC 2013 [63]. They are referred to as BMF1-BMF28. The parameter settings were kept the same as required in [63]. The dimension n of each problem was set to 10, the population size N [pop] to 100, and MaxFEs to 10,000 × n. The number of elite solutions r was kept as 1. The number of iterations w of DFP was set to 2. The reduction of the population per archive update r was also chosen as 1. The gap κ between successive updates of M [sec] was kept as 20. The optimization was terminated if either MaxFEs was reached or the difference between the means of function error values was less than 10^−8, as suggested in [50,63].
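For reference, the settings above can be collected in one place, together with the evaluation budget they imply. The class and field names are hypothetical, and `should_stop` is a simplified reading of the termination rule that treats the monitored quantity as a single error value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Experimental settings as stated in the text (field names are assumptions)."""
    n: int = 10          # problem dimension
    n_pop: int = 100     # initial population size N_pop
    r: int = 1           # solutions migrated / removed per archive update
    dfp_iters: int = 2   # DFP iterations w per archive update
    kappa: int = 20      # generations between M_sec updates
    tol: float = 1e-8    # error threshold

    @property
    def max_fes(self):
        """Evaluation budget: MaxFEs = 10,000 x n."""
        return 10_000 * self.n


def should_stop(fes_used, error, settings):
    """Stop when the budget is spent or the error falls below the threshold."""
    return fes_used >= settings.max_fes or error < settings.tol
```

With n = 10 this gives a budget of 100,000 function evaluations, so the first update of M [sec] occurs after 50,000 evaluations.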

Comparison of RJADE/TA-ADP-LS against Established Global Optimizers
The means of the function error values, i.e., the differences between the known and approximated optimal values, for jDE, jDEsoo, jDErpo, RJADE/TA and RJADE/TA-ADP-LS are presented in Table 2. In Table 2, + indicates that the algorithm won against our algorithm, RJADE/TA-ADP-LS; − indicates that the particular algorithm lost against our algorithm; and = indicates that both algorithms obtained the same statistics. The comparison of RJADE/TA-ADP-LS with its competitors showed its outstanding performance against all of them. RJADE/TA-ADP-LS achieved better (lower) mean error values than jDE and jDEsoo on 17 out of 28 problems; the many − signs in columns 2 and 3 of Table 2 support this fact. In contrast, jDE and jDEsoo performed better on six and eight problems, respectively. RJADE/TA-ADP-LS showed performance improvement over the jDErpo and RJADE/TA algorithms as well. In general, RJADE/TA-ADP-LS performed better than all algorithms in comparison, especially in the categories of multimodal and composite functions. The proposed mechanism is not only based on LS for local tuning with no reflection, but also implements an ADP approach; these two features could be the reasons for its good performance.
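The sign convention of Table 2 can be made explicit with a small helper. Since the compared quantity is a mean error, the lower value wins. The function names, the ASCII "-" used in place of the typeset minus sign, and the exact-equality tie rule (`tol = 0`) are assumptions for illustration.

```python
def mark(mean_err_other, mean_err_ours, tol=0.0):
    """Table 2 style mark for a competitor versus RJADE/TA-ADP-LS.

    "+": the competitor has the lower mean error (it won);
    "-": the competitor has the higher mean error (it lost);
    "=": both mean errors agree within tol.
    """
    if abs(mean_err_other - mean_err_ours) <= tol:
        return "="
    return "+" if mean_err_other < mean_err_ours else "-"


def tally(marks):
    """Count wins, losses and ties over a column of marks."""
    return {sign: marks.count(sign) for sign in "+-="}
```

For instance, a competitor with a mean error of 2.0 against 1.0 for the proposed method receives "-", which is the mark that dominates the jDE and jDEsoo columns.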

Performance Evaluation of RJADE/TA-ADP-LS Versus RJADE/TA-LS
We empirically studied the performance of RJADE/TA-ADP-LS against RJADE/TA-LS. Table 3 presents the mean results achieved by both methods in 51 runs. The best results are shown in boldface. It is clear from the results in Table 3 that the proposed RJADE/TA-ADP-LS outperformed RJADE/TA-LS on 13 out of 28 problems. Furthermore, on five problems, they obtained the same results. RJADE/TA-LS showed performance improvement on 10 test problems. It is interesting to note that RJADE/TA-ADP-LS showed outstanding performance in the category of composite functions, where it solved BMF22-BMF28 better than RJADE/TA-LS. Again, the two different mechanisms of RJADE/TA-ADP-LS, the ADP approach and the LS without reflection, could be the reasons for its better performance. Further, Table 4 presents the percentage performance of RJADE/TA-ADP-LS and RJADE/TA-LS. Since both algorithms showed equal results on five test problems, we compared the percentages for the remaining 23 problems. As shown in Table 4, RJADE/TA-ADP-LS was able to solve 57% of the problems against 43% solved by RJADE/TA-LS out of 23 test instances. Furthermore, box plots were drawn from all means obtained in 25 runs of RJADE/TA, RJADE/TA-LS and RJADE/TA-ADP-LS. Figures 2 and 3 show one function out of every three.
Box plots are very good tools for showing the spread of the data. Figure 2b-d shows that the boxes obtained by RJADE/TA-ADP-LS were lower than the other two boxes, indicating its better performance. Figure 2a presents the plot of BMF3, in which the boxes of the two other algorithms were lower than that of RJADE/TA-ADP-LS; thus, they were better.
Figure 3b,d,f shows that the boxes obtained by RJADE/TA-ADP-LS on BMF19, BMF25 and BMF27 were lower than the boxes of RJADE/TA and RJADE/TA-LS, indicating the higher performance of RJADE/TA-ADP-LS. Figure 3a,c,e shows that the two other algorithms were better on the respective test instances.
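The percentages reported for Table 4 follow directly from the win counts given in this section (13 wins, 10 losses and 5 ties over 28 problems). The short check below reproduces them; the function name `win_share` is an assumption.

```python
def win_share(wins, total, ties):
    """Percentage of decided problems (ties excluded) won, rounded to an integer."""
    decided = total - ties
    return round(100 * wins / decided)


TOTAL, TIES = 28, 5
WINS_ADP_LS, WINS_LS = 13, 10

share_adp_ls = win_share(WINS_ADP_LS, TOTAL, TIES)  # 13 / 23 -> 57
share_ls = win_share(WINS_LS, TOTAL, TIES)          # 10 / 23 -> 43
```

Excluding the five ties leaves 23 decided problems, and the two shares sum to 100%, as expected.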

Analysis/Discussion of Various Parameters Used
The number of solutions r to be migrated to the archive and to undergo DFP was kept as 1, since DFP is an expensive method due to gradient calculation, and its application to more than one solution might slow down the algorithm. Users may take two, but at most three is suggested. The number of iterations w of DFP applied to archive elements was kept as 2; DFP could fine-tune the solutions in only two iterations. Moreover, the decreasing number r of the population per archive update was also chosen as 1. Since the archive was updated after a regular gap of global evolution, the population was decreased by one each time. However, if we reduced it by more than one solution, a stage would come where the diversity of the population would be decreased and the algorithm would either stop at local optima or converge prematurely. We suggest that the decreasing number be at most 3. In general, these parameters are user defined but should be chosen wisely so that the global and local searches complement each other, rather than causing premature convergence or stagnation.

Conclusions
This paper proposed a new hybrid algorithm, RJADE/TA-ADP-LS, where an LS mechanism, DFP, is combined with a DE-based global search scheme, RJADE/TA, to benefit from their searching capabilities in local and global regions. Further, a population decreasing mechanism is also adopted. The key idea is to shift the overall best solution to the archive at specified regular intervals of RJADE/TA, where it undergoes DFP for further improvement. The archive stores both the best solution and its improved form. Furthermore, the population is decreased by one solution at each archive update. We evaluated and compared our hybrid method with five established algorithms on the CEC 2013 test suite. The results demonstrated that our new algorithm is better than the competing algorithms on the majority of the tested problems; in particular, it showed superior performance on the hard multimodal and composite problems of CEC 2013. In future, the present work will be extended to constrained optimization. As a second task, other gradient-free LS methods, global optimizers and archiving strategies will be tried to design more efficient algorithms for global optimization.

Figure 1. Flowchart of RJADE/TA-ADP-LS. In this design, when the first update of M [sec] is made after half of the available resources are spent, DFP is applied to the archive members. The implementations of DFP and ADP are shown in Algorithm 2. Both the previously located best solution, w {y} [j,best], and the one exploited by DFP, w {y} [j,new], are propagated to M [sec]. No reflection is made here to compensate for the decreasing population.

Figure 2. Box plots of various algorithms in comparison.

Figure 3. Box plots of various algorithms in comparison.

RJADE/TA [50] is an adaptive DE variant. Its main idea is to archive comparatively best solutions of the population at regular intervals of the optimization process and to reflect the overall poor solutions. RJADE/TA inserts the following techniques in JADE. The techniques are presented in Table 1.

Table 2. Comparison of RJADE/TA-ADP-LS with well-established algorithms.