As a variant of the genetic algorithm, the DE algorithm has become popular in research because it has fewer control parameters and higher solving efficiency. In terms of its critical operation steps, optimization of the DE algorithm has mainly focused on the crossover and mutation processes. According to how the control parameters are set, we divided the various DE algorithms into three categories and studied the specific optimization trends of DE algorithms.
  3.2.1. Deterministic Parameter Control
Deterministic parameter control means that the scaling factor F and the crossover probability CR take fixed values and do not change over time. This type of DE algorithm is relatively simple, and the most typical optimization methods include the typical DE algorithm (TDE), the opposition-based learning DE algorithm (ODE), the composite DE algorithm (CODE), and the adaptive mutation DE algorithm (AMDE).
(A1) TDE
The aim of the typical DE is to obtain a better group by correcting the information of the individuals in the group. As an essential step of the DE algorithm, mutation, through its scaling factor and difference vectors, can substantially affect the solution speed and accuracy of the population. The mutation process also has a key control variable that affects the diversity of the population. Therefore, many researchers choose the mutation operation as the primary direction for improving the DE algorithm. Starting from the current population, the mutant individual can take either the best individual or a random individual in the current group as its base vector. The difference vectors can be formed from randomly selected individuals in the group, or from the best individual together with some random individuals. By combining these two choices, researchers have developed various DE algorithms, which are named in the form DE/x/y: DE denotes the differential evolution principle; x is either best or rand and indicates whether the base vector is the best individual or a randomly selected one; and y is 1 or 2 and indicates the number of difference vectors used. The current mainstream mutation operations of the DE algorithm are as follows:
          where $r_1$, $r_2$, and $r_3$ represent three random indices of individuals in the current group, respectively; and $best$ represents the index of the individual with the best fitness in the current group.
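The naming scheme above can be illustrated with a minimal Python sketch. The formulas in it are the forms commonly given for these strategies in the DE literature, not a transcription of this paper's equations; the function name `mutate`, the list-of-lists population layout, and the assumption that lower fitness is better are all hypothetical choices made for illustration.

```python
import random

def mutate(pop, fitness, i, F, strategy, rng=random):
    """Return a mutant vector for target i; lower fitness is assumed better."""
    n, dim = len(pop), len(pop[0])
    best = min(range(n), key=lambda j: fitness[j])
    # Five mutually distinct random indices, all different from i (needs n >= 6).
    r1, r2, r3, r4, r5 = rng.sample([j for j in range(n) if j != i], 5)

    def diff(a, b, d):
        # d-th component of the difference vector x_a - x_b
        return pop[a][d] - pop[b][d]

    if strategy == "rand/1":
        return [pop[r1][d] + F * diff(r2, r3, d) for d in range(dim)]
    if strategy == "best/1":
        return [pop[best][d] + F * diff(r1, r2, d) for d in range(dim)]
    if strategy == "rand/2":
        return [pop[r1][d] + F * diff(r2, r3, d) + F * diff(r4, r5, d)
                for d in range(dim)]
    if strategy == "best/2":
        return [pop[best][d] + F * diff(r1, r2, d) + F * diff(r3, r4, d)
                for d in range(dim)]
    if strategy == "current-to-best/1":
        return [pop[i][d] + F * diff(best, i, d) + F * diff(r1, r2, d)
                for d in range(dim)]
    raise ValueError(f"unknown strategy: {strategy}")
```

With F set to zero the difference terms vanish, so a best strategy returns the best individual and a current-to-best strategy returns the target itself, which is a quick way to check which vector serves as the base.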
The result of the mutation cannot guarantee that the new individual lies within the constraints of the optimization problem. Therefore, researchers restrict the individuals again to ensure that the mutated individuals are within the specified constraints. The specific solution results are:
          where 
 represents the mutated individual, 
 represents the constraint range of the group in the solution process, and 
 represents the latest mutated individual.
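As a rough illustration of such a repair step, the following Python sketch resamples any component that leaves the feasible range. The resampling rule is an assumption (a common choice in DE implementations), not necessarily the rule used in the equation above, and the name `repair_bounds` is hypothetical.

```python
import random

def repair_bounds(v, lower, upper, rng=random):
    """Keep in-range components; resample out-of-range ones uniformly.

    Assumed repair rule: a violating component d is replaced by a uniform
    random value in [lower[d], upper[d]].
    """
    return [vd if lower[d] <= vd <= upper[d] else rng.uniform(lower[d], upper[d])
            for d, vd in enumerate(v)]
```

Clipping to the nearest bound is an equally simple alternative, but it tends to accumulate individuals on the boundary, which is why resampling is used in this sketch.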
(A2) ODE
The ODE algorithm [53] mainly improves the DE algorithm through the selection of the initial population, which is based on opposition-based learning. This method adds the inverse function (that is, the opposite point) of each random solution, so that a new symmetric solution is generated from each initial solution, ensuring that the initial population is more evenly distributed. The inverse function of a random number in a given range can be expressed as:
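$$\tilde{X} = X_{\max} + X_{\min} - X$$
          where $\tilde{X}$ indicates the inverse function (the opposite number of $X$); $X_{\max}$ and $X_{\min}$ indicate the maximum and minimum values in the given range, respectively; and $X$ indicates the random number.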
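The inverse function of opposition-based learning is simply a reflection about the midpoint of the range. A minimal Python sketch, with the hypothetical name `opposite_point` and per-dimension bounds given as lists, could look as follows.

```python
def opposite_point(x, lower, upper):
    """Reflect each component of x about the midpoint of [lower[d], upper[d]]."""
    return [lower[d] + upper[d] - xd for d, xd in enumerate(x)]
```

For example, in the range [0, 10] the opposite of 2 is 8, and applying the operation twice returns the original point.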
After the fitness values are computed and sorted, the fittest individuals are selected to fill the required population size for the initial population in the ODE algorithm. Let PB denote the randomly selected population and PO denote the inverse function of the corresponding population; the overall population can then be expressed as P = PB + PO. After the crossover operation, the population is optimized again according to a random number and the inverse function learning probability. For the mutation operation, a “neighbor-neighbor” strategy is redefined, which is explicitly expressed as:
          where 
i and 
 are selected from the best-ranked populations, which are mainly used for the optimal detection of adjacent space; and 
 and 
 are randomly selected from the original population, which are used to improve the detection capability of the global scope.
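The opposition-based initialization described above (build PB, build PO, then keep the fittest individuals of P = PB + PO) can be sketched in Python as follows. The function name `opposition_init`, the bound lists, and the assumption that the objective is minimized are illustrative choices, not details taken from [53].

```python
import random

def opposition_init(n, lower, upper, objective, rng=random):
    """Return the n fittest individuals of a random population and its opposite."""
    dim = len(lower)
    # PB: uniformly random population inside the bounds.
    pb = [[rng.uniform(lower[d], upper[d]) for d in range(dim)] for _ in range(n)]
    # PO: opposite point of every individual in PB.
    po = [[lower[d] + upper[d] - x[d] for d in range(dim)] for x in pb]
    # P = PB + PO, sorted by fitness (minimization assumed), truncated to n.
    return sorted(pb + po, key=objective)[:n]
```

The cost is n extra objective evaluations at start-up, in exchange for an initial population that covers the range more symmetrically.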
(A3) CODE
The composite DE algorithm is developed by improving the mutation operation and control parameters. In most situations, different mutation operations have different effects on the results. Therefore, multiple mutation operations and control coefficients are combined in the DE algorithm.
The CODE algorithm selects three mutation strategies from the various available strategies to produce the next generation. After the population is initialized, three different new populations are generated based on the three mutation strategies. The specific mutation operations and the coefficients can be expressed as:
          where 
, and 
 are randomly selected from the group; 
i indicates the original population; and 
 represents the result of the mutation operation.
The CODE algorithm has many mutation strategies, but the control parameter is still a fixed value. Therefore, the CODE algorithm can be regarded as an optimization algorithm with deterministic parameter control.
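The idea of generating several candidates per individual and keeping the best can be sketched in Python as below. Everything specific here is an assumption made for illustration: the three strategies (rand/1, rand/2, current-to-rand/1), the fixed values F = 0.5 and CR = 0.9, the use of binomial crossover for all three candidates, and the hypothetical names `code_step` and `binomial_crossover`.

```python
import random

F, CR = 0.5, 0.9  # assumed fixed control parameters

def _others(pop, i, k, rng):
    """k distinct random indices different from i."""
    return rng.sample([j for j in range(len(pop)) if j != i], k)

def rand_1(pop, i, rng):
    a, b, c = _others(pop, i, 3, rng)
    return [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(len(pop[i]))]

def rand_2(pop, i, rng):
    a, b, c, e, g = _others(pop, i, 5, rng)
    return [pop[a][d] + F * (pop[b][d] - pop[c][d]) + F * (pop[e][d] - pop[g][d])
            for d in range(len(pop[i]))]

def current_to_rand_1(pop, i, rng):
    a, b, c = _others(pop, i, 3, rng)
    k = rng.random()
    return [pop[i][d] + k * (pop[a][d] - pop[i][d]) + F * (pop[b][d] - pop[c][d])
            for d in range(len(pop[i]))]

def binomial_crossover(target, mutant, rng):
    """Take each component from the mutant with probability CR (one is forced)."""
    jrand = rng.randrange(len(target))
    return [m if (rng.random() < CR or d == jrand) else t
            for d, (t, m) in enumerate(zip(target, mutant))]

def code_step(pop, i, objective, rng=random):
    """Return the best of the target and one trial vector per strategy."""
    candidates = [pop[i]]
    for strategy in (rand_1, rand_2, current_to_rand_1):
        candidates.append(binomial_crossover(pop[i], strategy(pop, i, rng), rng))
    return min(candidates, key=objective)
```

Because three trial vectors are evaluated for every individual, each generation costs about three times as many objective evaluations as a single-strategy DE.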
(A4) AMDE
The AMDE algorithm also selects different mutation operations to increase the group’s diversity. Researchers combined the rand and best mutations to maintain both local convergence performance and global exploration ability. When the new individual is superior to the current individual, the parent individual is directly replaced to produce the next individual. The AMDE algorithm chooses the traditional rand/1 and best/2 as the basic mutation operations: rand/1 maintains the global exploration ability of the system, and best/2 maintains the convergence performance. To choose effectively between the two operations in the mutation process, the selection is made by comparing a random number with the progress of the population iteration relative to the total number of iterations. The corresponding differential evolution method can be described as:
          where 
, and 
 are randomly selected from the original population; 
and T represents the maximum iteration number of the problem.
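One plausible reading of this selection rule is sketched below in Python. The direction of the comparison (rand/1 when the random number exceeds t/T, best/2 otherwise) is an assumption chosen so that exploration dominates early and convergence dominates late; the name `choose_amde_strategy` and the current-iteration argument `t` are hypothetical.

```python
import random

def choose_amde_strategy(t, T, rng=random):
    """Pick the mutation operation for iteration t out of T.

    Assumed rule: rand/1 is chosen with probability 1 - t/T, so it is used
    mostly at the start, and best/2 takes over as t approaches T.
    """
    return "rand/1" if rng.random() > t / T else "best/2"
```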
As in the traditional algorithm, constraints are applied to the new individuals to ensure that the mutant individuals stay within the given solution range; specifically, this can be expressed as:
          where 
 represents the maximum value and the latest value of the given population for the individual, 
 represents the random value in the range 
, 
 represents the current iteration individual, and 
 represents the new individual selected.
Generally speaking, the performance of an algorithm is influenced by its exploration and exploitation abilities. The random (rand) strategy enhances the exploration ability, helps the group maintain diversity, and prevents the group from falling into a local optimum, whereas the best strategy improves the exploitation ability of the algorithm and emphasizes the rate of convergence. Compared with the DE/best/1, DE/best/2, and DE/current-best/1 algorithms, the fitness values of the DE/rand/1, DE/rand/2, ODE, and CODE algorithms converge to a small range. Owing to the random strategy and the opposition strategy in these algorithms, the group shows better diversity, so their results are relatively stable. However, the precision of these algorithms may sometimes be low because they lack local search. The performance of DE/best/1 and DE/best/2 is strongly affected by the scale factor F. Although the fitness values of the DE/best/1, DE/best/2, DE/current-best/1, and CODE algorithms converge to a large range, some of their results are highly precise because of the best strategy; with their better exploitation ability, they can solve high-precision problems that other algorithms cannot. Because many strategies are used in the CODE algorithm, its calculation time is always long.
  3.2.3. Self-Adaptive Parameter Control
In self-adaptive parameter control optimization methods, the critical parameters are obtained by applying the concept of “evolution in evolution”. Therefore, some researchers have added self-adjustment functions to the scaling factor F and crossover probability CR of the DE algorithm. The most typical adaptive adjustment coefficient optimization methods include JDE, JADE, and SADE.
(C1) JDE
The mutation and crossover operations strongly affect the DE algorithm when solving optimization problems. Even with the same mutation operation, different coefficient values change the behavior of a DE algorithm. To find the global best solution for all kinds of problems, researchers proposed a new method that uses additional variables to control the crossover and mutation factors. This method ensures that the two key parameters vary within a specified range and that the best mutation parameters can be obtained for the next solution.
In the JDE algorithm, the mutation factor F lies between the minimum and maximum values of its range, and whether it is regenerated is determined by a random control variable $\tau_1$. The factor CR is directly determined by a random number, and whether it is regenerated is governed by the control variable $\tau_2$. These key parameters are expressed as follows:
          where $F_l$ and $F_u$ represent the minimum and maximum values of F; $\tau_1$ and $\tau_2$ represent the possible adjustment functions of the mutation and the crossover factors, respectively; and $rand_1$, $rand_2$, and $rand_3$ are three different random numbers between 0 and 1. We can control the corresponding mutation and crossover factors by adopting adaptive adjustment.
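A minimal Python sketch of this self-adaptive update is given below. It follows the form in which the jDE rule is commonly stated, which uses four independent random numbers rather than three; the constants (0.1, 0.9, and adjustment probabilities of 0.1) are widely quoted defaults and are assumptions here, as is the function name `jde_update`.

```python
import random

F_L, F_U = 0.1, 0.9      # assumed: regenerated F falls in [F_L, F_L + F_U]
TAU_1, TAU_2 = 0.1, 0.1  # assumed adjustment probabilities for F and CR

def jde_update(F, CR, rng=random):
    """Return the (F, CR) pair an individual carries into the next generation."""
    rand1, rand2, rand3, rand4 = (rng.random() for _ in range(4))
    new_F = F_L + rand1 * F_U if rand2 < TAU_1 else F   # regenerate or inherit
    new_CR = rand3 if rand4 < TAU_2 else CR             # regenerate or inherit
    return new_F, new_CR
```

Each individual stores its own F and CR, so parameter values that produce surviving offspring are inherited along with the solution vector.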
(C2) JADE
The JADE algorithm is a new type of differential evolution algorithm proposed by scholars [49]. In the DE algorithm, the mutations best/1 and best/2 may cause premature convergence because the best method mainly explores the region around a local optimum.
In the algorithm, a global population is generated by superimposing the current population on the parent population. Suppose that the archived parent individuals are expressed as A, the current population is expressed as P, and the overall population is expressed as $P \cup A$. However, this population must not keep growing until it exceeds the given population number. Therefore, once the size limit is reached, individuals must be randomly removed from $A$. The corresponding mutation operation can be expressed as:
          where $x_{best}^{p}$ is randomly chosen as one of the top $100p\%$ individuals in the current population with $p \in (0, 1]$, $\tilde{x}_{r2}$ represents the random selection from the population $P \cup A$, and 
 represents the best fitness index of the entire population.
In the JADE algorithm, a normal random distribution and a Cauchy random distribution are introduced to keep the diversity of the group. The crossover probability CR is generated from a normal distribution (the mean is $\mu_{CR}$, and the standard deviation is 0.1), and the initial value of $\mu_{CR}$ is 0.5. Similarly, the mutation factor F is generated from a Cauchy distribution (the location parameter is $\mu_F$, and the scale parameter is 0.1), and the initial value of $\mu_F$ is 0.5. Specifically, these key parameters can be expressed as follows:
          where $randn$ represents the normal random distribution, $randc$ represents the Cauchy random distribution, $mean_A$ represents the arithmetic mean, $mean_L$ represents the Lehmer mean, and $S_F$ and $S_{CR}$ represent the sets of mutation factors and crossover probabilities, respectively, of the individuals that successfully entered the new population.
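The parameter adaptation just described can be sketched in Python as follows. The learning rate `c = 0.1`, the truncation of CR to [0, 1], the regeneration of non-positive F values, and the rule of leaving a mean unchanged when no individual succeeded are assumptions based on how JADE is commonly implemented; all function names are hypothetical.

```python
import math
import random

def lehmer_mean(values):
    """Lehmer mean sum(v^2) / sum(v); it weights larger values more heavily."""
    return sum(v * v for v in values) / sum(values)

def sample_cr(mu_cr, rng=random):
    """Normal sample with mean mu_cr and standard deviation 0.1, kept in [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mu_cr, 0.1)))

def sample_f(mu_f, rng=random):
    """Cauchy sample with location mu_f and scale 0.1, kept in (0, 1]."""
    while True:
        # Inverse-CDF sampling of a Cauchy distribution.
        f = mu_f + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
        if f > 0.0:
            return min(f, 1.0)

def update_means(mu_cr, mu_f, s_cr, s_f, c=0.1):
    """Move the means toward the parameter values that produced survivors."""
    if s_cr:
        mu_cr = (1 - c) * mu_cr + c * sum(s_cr) / len(s_cr)
    if s_f:
        mu_f = (1 - c) * mu_f + c * lehmer_mean(s_f)
    return mu_cr, mu_f
```

The heavy tails of the Cauchy distribution occasionally produce large F values, which helps diversify the mutation steps, while the Lehmer mean biases the update toward those larger successful values.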
(C3) SADE
The SADE (Self-adaptive Differential Evolution) algorithm effectively combines various mutation strategies to increase the diversity of the system. The next generation of mutated individuals is generated by selecting the best individual from the individuals produced by three different mutation operations. Among the multiple mutation strategies, the best method has a high convergence rate, but it may fall into local optima, whereas the rand method has strong global detection ability. Therefore, the random mutation operation is the most basic means of avoiding this phenomenon. At the beginning, researchers used two basic mutation operations, rand/1 and best/1. With the rapid development of basic differential evolution algorithms, the set of basic mutation operations was extended to four types.
          
The selection probability of each mutation operation can be specifically described as:
          where $ns$ and $nf$ represent the number of successes and failures, respectively.
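A simple way to turn success and failure counts into selection probabilities is sketched below in Python. The normalized success-rate form and the small constant `eps`, which keeps an unsuccessful strategy from being eliminated entirely, are assumptions for illustration and may differ from the exact expression above; the name `strategy_probabilities` is hypothetical.

```python
def strategy_probabilities(ns, nf, eps=0.01):
    """Map per-strategy success/failure counts to selection probabilities.

    Assumed form: rate_k = ns_k / (ns_k + nf_k) + eps, normalized to sum to 1.
    A strategy that has not been tried yet gets the floor value eps.
    """
    rates = [s / (s + f) + eps if (s + f) > 0 else eps for s, f in zip(ns, nf)]
    total = sum(rates)
    return [r / total for r in rates]
```

With counts `ns = [3, 1]` and `nf = [1, 3]`, the first strategy receives the larger probability because three of its four trials succeeded.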
Again, the key parameter used in the next step is determined based on the success probability of various mutation operations. Then, the mutation and crossover factors are further optimized based on the NSDE algorithm to determine the key parameters. The mutation factor F is randomly selected, and the crossover factor is further determined by the success efficiency of various mutation parameters, which can be expressed as:
          where $N$ represents the symmetric function, that is, a random normal distribution.
(C4) ISAMDE
The ISAMDE algorithm combines the AMDE and adaptive scaling factor (AF) strategies. The AMDE strategy is the same as that of the AMDE (A4) algorithm mentioned above. The AF strategy improves the performance of the DE algorithm by considering the current iteration number and error value, so that it can effectively balance the exploration and exploitation abilities of the algorithm. At the beginning of the iteration, the scaling factor is assigned a value of one, so the exploration ability is strong. The error value is affected by the current iteration number of the algorithm, which is less than the set value 
. The AF can be expressed as:
          where 
 represents the error, and the value of 
 is related to the specific engineering problems.
(C5) ENMDE
The ENMDE algorithm combines the self-tuned mutation strategy with the leading group selection strategy. The self-tuned mutation strategy includes four typical DE algorithms: DE/rand/1, DE/best/1, DE/rand/2, and DE/best/2. The TDE algorithms can be divided into two groups: the rand group, which includes DE/rand/1 and DE/rand/2; and the best group, which includes DE/best/1 and DE/best/2. The rand and best groups are selected by comparing the fitness distance ratio with the average fitness distance ratio. They are expressed as follows:
          where 
 represents the average fitness distance ratio; 
 represents the fitness distance ratio; 
 represents the fitness function of the solution under consideration d; 
 represents the population size; 
 represents the fitness function of the best solution; 
 represents the mean value of all individuals’ fitnesses. The process of self-tuned mutation is described in detail in the following table. After the group is selected, the final mutation strategy is selected by considering the mutation mode factor (MMF). The specific process is expressed as 
Table 1:
The leading group selection strategy is also used in this improved DE algorithm: the individuals x and u are integrated into a new swarm, and then the Np best solutions are selected and assigned to x, where u represents the individuals obtained after crossover.
(C6) PDcDE
In the PDcDE algorithm, the swarm size is auto-controlled, the population is divided into two subpopulation–
,
, the best 
 individuals are used to exploit new solutions, and the remaining 
 individuals are used to explore new solutions. The mutation strategies are formulated as:
          where 
 represents the individuals which are selected from the first group (
). 
, 
, 
, 
, 
 and 
 are randomly generated different indices, 
, 
 and 
 are selected from the first subpopulation (
), while 
, 
, 
 are selected from the whole population. The swarm size 
N is dynamically changed according to whether the current best solution is updated. If it is updated, a parameter 
 is incremented by one; when it reaches a preset threshold value L, the worst individual is removed, until the swarm size reaches the minimum size 
. If the current best solution is not updated, a parameter 
 is also incremented by one; when it reaches a preset threshold value L, a random individual is introduced into the swarm, until the swarm size reaches the maximum size 
. The diversity-controlled parameter setting is another strategy in the PDcDE algorithm: the scale factors 
 and 
 are for the 
 subpopulation and 
 subpopulation, respectively. They are dynamically generated by considering the population diversity. The diversity at the 
jth dimension is expressed as:
          where 
 represents the diversity of the swarm at the 
jth dimension. 
 represents the 
jth element in the 
ith individual. 
 represents the arithmetic mean of the swarm at 
jth dimension. The scale factors of 
 and 
 are expressed as:
          where 
 and 
 represent the population diversity of the 
 group and 
 group, respectively.
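The per-dimension diversity measure described above depends only on each individual's jth element and the swarm's arithmetic mean in that dimension. A minimal Python sketch is given below; it assumes that diversity is measured as the population standard deviation in each dimension, which may differ from the exact expression used in PDcDE, and the name `dimension_diversity` is hypothetical.

```python
import math

def dimension_diversity(pop):
    """Return one diversity value per dimension of the swarm.

    Assumed measure: root mean squared deviation of the j-th elements
    from their arithmetic mean (the population standard deviation).
    """
    n, dim = len(pop), len(pop[0])
    means = [sum(x[j] for x in pop) / n for j in range(dim)]
    return [math.sqrt(sum((x[j] - means[j]) ** 2 for x in pop) / n)
            for j in range(dim)]
```

A dimension in which all individuals share the same value has zero diversity, which signals that the swarm has stopped exploring along that coordinate.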
The diversity determines how the swarm searches: if the diversity is large, the swarm searches and explores a large search space; conversely, if the diversity is small, the swarm searches and exploits a small search space.
Finally, a backward search strategy is also proposed. It replaces the crossover solution if the current best solution has not been updated in L successive iterations. The formulation can be expressed as:
          where 
 is a trial solution, 
 represents the mutated solution in the current generation, and rand is a random number generated between 0 and 1.
In the JADE algorithm, the parameters F and CR required for mutation and crossover are selected adaptively, without manual setting. The algorithm sometimes needs a sufficient number of iterations and a sufficiently large population to obtain the optimal solution. It performs better on high-dimensional problems but is weak on low-dimensional problems. JADE’s performance depends heavily on its parameter settings, and different settings may lead to different results. Because the parameter F or CR is adapted, the convergence accuracies of JDE, SADE, and PDcDE are high; however, the convergence rate is sometimes slow because of the computational complexity. The stability of the PDcDE algorithm is high; however, the algorithm contains many redundant calculations. Because the choice between the rand strategy and the best strategy in ENMDE is adaptive, the algorithm shows high precision. However, the calculation time may sometimes be long because of the complex judgment process. The influence of a dynamically changing swarm size on the calculation results is uncertain. ISAMDE usually shows a better convergence rate and precision because it takes advantage of AMDE; however, the scale factor F and other parameters are sometimes difficult to set, and the performance of the algorithm depends largely on these settings, so different parameter settings may lead to different results. Therefore, the scale factor and other parameters are usually set based on experience.