An Improved MOEA/D with Optimal DE Schemes for Many-Objective Optimization Problems

MOEA/D is a promising multi-objective evolutionary algorithm based on decomposition that has been used successfully on many multi-objective optimization problems. However, the original MOEA/D does not perform well on many-objective optimization problems, the class of multi-objective problems with four or more objectives. In this paper, an improved MOEA/D with optimal differential evolution (oDE) schemes, called MOEA/D-oDE, is proposed for many-objective optimization problems. MOEA/D-oDE differs from MOEA/D in two respects. First, it adopts a newly introduced decomposition approach that combines the advantages of the weighted sum approach and the Tchebycheff approach. Second, it uses a combination mechanism for DE operators that selects the best child solution for the subsequent computation. In our experimental study, MOEA/D-oDE is compared with NSGA-II (nondominated sorting genetic algorithm II) and MOEA/D on six continuous test instances with 4–6 objectives. The results indicate that MOEA/D-oDE outperforms NSGA-II and MOEA/D in almost all cases, and its advantage is most pronounced on problems with complicated Pareto shapes and higher-dimensional objective spaces.


Introduction
In daily life and work, multi-objective optimization problems (MOPs) arise in a wide range of real-world applications, such as water distribution systems [1], land use management [2] and automotive engine calibration [3]. All of these problems require two or more objectives to be optimized simultaneously. Ideally, every objective would reach its optimum, but no single solution can achieve this, because the objectives restrict each other: improving one objective typically degrades others, which decision makers do not want. Because of the importance of MOPs, scholars have long tried to solve them effectively and, to facilitate their study, have summarized a general mathematical formulation. A multi-objective optimization problem (MOP) can be defined as follows:
$$\text{minimize } F(x) = (f_1(x), f_2(x), \ldots, f_m(x))^T \quad \text{subject to } x \in \Omega \tag{1}$$
where $F : \Omega \to \mathbb{R}^m$ is made up of $m$ real-valued objective functions, $\Omega$ is the decision (variable) space and $\mathbb{R}^m$ is called the objective space.
Given two objective vectors $u, v \in \mathbb{R}^m$, $u$ is said to dominate $v$ if and only if $u_i \le v_i$ for every $i \in \{1, \ldots, m\}$ and $u_j < v_j$ for at least one $j \in \{1, \ldots, m\}$. A solution $x \in \Omega$ is said to dominate another solution $y \in \Omega$ if $F(x)$ dominates $F(y)$. A solution $x^* \in \Omega$ is Pareto-optimal if $F(x^*)$ is not dominated by the objective vector of any other solution; $F(x^*)$ is then called a Pareto-optimal objective vector. The set of all Pareto-optimal solutions is known as the Pareto set (PS), and the set of all Pareto-optimal objective vectors is the Pareto front (PF) [4,5]. Any improvement in one objective of a Pareto-optimal solution necessarily comes with a deterioration in at least one other objective.
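The dominance relation defined above can be checked mechanically. The following is a minimal Python sketch for minimization problems; the function names `dominates` and `nondominated` are illustrative and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if objective vector u Pareto-dominates v (minimization)."""
    if len(u) != len(v):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(ui <= vi for ui, vi in zip(u, v))
    strictly_better = any(ui < vi for ui, vi in zip(u, v))
    return no_worse and strictly_better


def nondominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For instance, among the vectors (1, 3), (2, 2) and (3, 3), only (3, 3) is dominated (by (2, 2)), so the other two form the non-dominated set.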
In most real-world applications, because the objectives in (1) conflict with each other, no solution minimizes all of the objectives simultaneously. Because of the complexity of MOPs, traditional mathematical methods encounter various difficulties when solving them. Evolutionary multi-objective optimization (EMO) algorithms, which are based on natural evolution, can solve MOPs well to some extent. Over the past twenty years, EMO algorithms and their applications have developed rapidly and attracted a large number of researchers [6][7][8][9][10][11][12][13].
Research on EMO has shown that evolutionary algorithms (EAs) have certain advantages in obtaining multiple Pareto-optimal solutions; for example, they can obtain more than one optimal solution in a single run. It is therefore very common to use multi-objective evolutionary algorithms (MOEAs) to solve MOPs. Many EMO algorithms assign fitness using Pareto dominance and are called domination-based algorithms. These algorithms try to find a set of solutions that approximates the true PF as closely as possible, which is known as convergence. Because fitness is assigned according to the Pareto-dominance principle, that principle drives the convergence of the algorithm. In addition, these algorithms need an explicit diversity preservation scheme to maintain a diverse set of solutions. Two of the most prevalent MOEAs of this kind are NSGA-II [14] and SPEA2 (the improved version of the strength Pareto evolutionary algorithm) [15], which have achieved remarkable results in solving MOPs. Although dominance-based algorithms are prevalent, they have limitations. One of the main drawbacks is that they are not well suited to many-objective optimization problems: with many objectives, almost all of the solutions in the population become non-dominated with respect to each other, which reduces the selection pressure and hinders the evolutionary process [16,17].
A more recent and well-known MOEA is the multi-objective evolutionary algorithm based on decomposition, MOEA/D [18], proposed by Zhang and Li in 2007, which differs from earlier algorithms. MOEA/D does not use Pareto dominance; instead, it uses scalarization for fitness assignment. It uses scalarizing functions such as the weighted sum approach and the weighted Tchebycheff approach to decompose an MOP into a number of single-objective subproblems and solves all of them in a single run (both approaches are described in Section 2.2). Each subproblem is associated with a weighted scalarizing function, and neighborhood relationships are defined among the subproblems. Algorithms based on MOEA/D have already achieved very promising results on continuous and combinatorial MOPs [19][20][21].
Differential evolution (DE), proposed by Storn and Price [22,23], is an efficient and robust heuristic for global optimization. As a prevalent EA, DE exhibits outstanding performance on a wide variety of problems in different fields. It employs crossover, mutation and selection operators to move its population toward the global optimum in each generation. DE is widely used for many optimization problems and adapts well, so it can be combined with other algorithms to solve MOPs.
In [24], a simple DE scheme is combined with MOEA/D to form MOEA/D-DE, which solves a class of MOPs with complicated Pareto sets. In MOEA/D-DE, mating parents are selected from both the neighborhood and the whole population. The authors of [24] also point out that the parameter setting of the DE scheme is very important for the performance of MOEA/D-DE. The success of MOEA/D-DE shows both the good performance of MOEA/D and the good adaptability of DE. We therefore follow this line of work to attempt to solve more difficult MOPs.
Problem (1) is a multi-objective optimization problem and usually has two or three objectives. However, many real-world applications involve four or more objectives. We therefore call MOPs with four or more objectives many-objective optimization problems. Because of their practical relevance, many-objective optimization problems have been one of the major research areas in the EMO community in recent years. To name only a few examples, in 2013, Tan et al. [25] proposed an improved MOEA/D named UMOEA/D (MOEA/D with uniform design), which uses the uniform design method to set the weight vectors of MOEA/D for many-objective problems and obtained effective results. In 2014, Qi et al. [26] proposed MOEA/D-AWA (MOEA/D with adaptive weight vector adjustment), which uses a new weight vector initialization method and adapts the weight vectors during the search. Many experiments on MOPs and a small number of experiments on many-objective optimization problems verified the effectiveness of MOEA/D-AWA. Also in 2014, Deb and Jain [27] proposed a reference point-based variant of NSGA-II for many-objective problems, called NSGA-III. NSGA-III still employs Pareto non-dominated sorting and adds the decomposition concept to solve many-objective optimization problems. In 2015, Li et al. [28] proposed a new algorithm that exploits the merits of both dominance- and decomposition-based approaches to balance the convergence and diversity of the evolutionary process on many-objective optimization problems.
These studies show that the original MOEA/D has limitations on many-objective optimization problems: the distribution of its weight vectors is not very uniform, and it uses only a single approach to decompose the MOP. These limitations motivate us to study how to improve the performance of MOEA/D on many-objective optimization problems. Therefore, an improved MOEA/D with optimal DE schemes (MOEA/D-oDE) is proposed in this paper. Our purpose is to exploit the advantages of different DE schemes. MOEA/D-oDE has two main features. First, it combines the weighted sum approach and the Tchebycheff approach into a new scalarizing function for decomposing many-objective optimization problems. Second, it uses alternative DE trial vector operators to improve the performance of MOEA/D on many-objective optimization problems.
The remainder of this paper is organized as follows. Section 2 presents the preliminaries of our study, including the basic idea of MOEA/D, the basic decomposition approaches and a brief introduction to the design of DE trial vectors. Section 3 presents the proposed algorithm, MOEA/D-oDE, focusing on how to decompose many-objective optimization problems and how to make the DE operators perform well. Section 4 presents the experimental studies, which compare MOEA/D-oDE with MOEA/D and NSGA-II on many-objective problems; the results show that MOEA/D-oDE outperforms or performs similarly to MOEA/D and NSGA-II. Finally, Section 5 concludes this paper.

Preliminaries
This section introduces the background used in this paper. Section 2.1 briefly presents the basic idea of MOEA/D and how it works, without giving its full framework and details. Section 2.2 introduces the decomposition approaches used in MOEA/D: the weighted sum approach, the Tchebycheff approach and the penalty-based boundary intersection approach. Finally, Section 2.3 explains in detail how the trial vectors of differential evolution are generated.

Basic Idea of MOEA/D
MOEA/D is a representative decomposition-based algorithm proposed by Zhang and Li, who showed in 2007 that it outperformed other state-of-the-art MOEAs [18]. Moreover, a variant of MOEA/D named MOEA/D-DRA (MOEA/D with dynamical resource allocation) won first place at CEC2009 (2009 IEEE Congress on Evolutionary Computation) [29]. The basic idea of MOEA/D is to use an aggregation function to decompose an MOP into a number of single-objective optimization subproblems and to optimize them simultaneously with an evolutionary algorithm.
In the general MOEA/D framework, let $N$ be the population size, $m$ be the number of objectives and $\lambda_1, \lambda_2, \ldots, \lambda_N$ be a set of weight vectors, where $\lambda_i = (\lambda_i^1, \lambda_i^2, \ldots, \lambda_i^m)^T$ for all $i = 1, \ldots, N$, subject to $\lambda_i^j \ge 0$ for all $j = 1, \ldots, m$ and $\sum_{j=1}^{m} \lambda_i^j = 1$. With these weight vectors, MOEA/D employs a decomposition method (discussed in Section 2.2) to decompose the MOP (1) into $N$ single-objective optimization subproblems and minimizes all of them in a single run. Each subproblem $i$, corresponding to weight vector $\lambda_i$, is optimized using information only from its neighborhood. The neighborhood of weight vector $\lambda_i$ is defined as the set of its $T$ closest weight vectors in $\{\lambda_1, \lambda_2, \ldots, \lambda_N\}$, and the neighborhood of the $i$-th subproblem consists of all subproblems whose weight vectors belong to the neighborhood of $\lambda_i$. The population consists of the best solution found so far for each subproblem and is updated continually. Only the current solutions of the neighboring subproblems are used when optimizing a subproblem. For more details about MOEA/D, readers may refer to [18].
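The neighborhood definition above (the T closest weight vectors by Euclidean distance) can be sketched in a few lines of Python. The function name `neighborhoods` is illustrative; as in MOEA/D, each weight vector counts as its own closest neighbor because its distance to itself is zero.

```python
import math
from typing import List, Sequence


def neighborhoods(weights: Sequence[Sequence[float]], T: int) -> List[List[int]]:
    """For each weight vector, return the indices of its T closest weight vectors."""
    B = []
    for wi in weights:
        # Sort all indices by Euclidean distance to wi (ties keep index order).
        order = sorted(range(len(weights)), key=lambda k: math.dist(wi, weights[k]))
        B.append(order[:T])
    return B
```

This brute-force version costs O(N^2 log N) distance comparisons, which is acceptable because the neighborhoods are computed only once during initialization.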

Decomposition Approaches Used in MOEA/D
The original MOEA/D study considers three decomposition approaches: the weighted sum (WS) approach, the Tchebycheff (TCH) approach and the penalty-based boundary intersection (PBI) approach. They are briefly introduced as follows.
(1) Weighted sum (WS) approach: The optimization problem of the WS approach is defined as:
$$\text{minimize } g^{ws}(x \mid \lambda_i) = \sum_{j=1}^{m} \lambda_i^j f_j(x) \quad \text{subject to } x \in \Omega \tag{2}$$
where $\lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_N\}$ is a set of weight vectors and, for all $i = 1, \ldots, N$, $\lambda_i = (\lambda_i^1, \lambda_i^2, \ldots, \lambda_i^m)^T$, subject to $\lambda_i^j \ge 0$ for all $j = 1, \ldots, m$ and $\sum_{j=1}^{m} \lambda_i^j = 1$. This approach works well if the PF is convex (for minimization problems). However, not every Pareto-optimal vector can be obtained through this approach if the PF is non-convex.
(2) Tchebycheff (TCH) approach: The optimization problem of the TCH approach is defined as:
$$\text{minimize } g^{tch}(x \mid \lambda_i, z^*) = \max_{1 \le j \le m} \{\lambda_i^j \, |f_j(x) - z_j^*|\} \quad \text{subject to } x \in \Omega \tag{3}$$
where $z^* = (z_1^*, \ldots, z_m^*)^T$ is the ideal reference point, with $z_j^* = \min\{f_j(x) \mid x \in \Omega\}$ for each $j = 1, \ldots, m$. For every Pareto-optimal point $x^*$, there exists a weight vector $\lambda$ such that $x^*$ is the optimal solution of (3), and every optimal solution of (3) is a Pareto-optimal solution of (1). Consequently, different Pareto-optimal solutions can be obtained by altering the weight vector.
(3) Penalty-based boundary intersection (PBI) approach: The optimization problem of the PBI approach is defined as:
$$\text{minimize } g^{pbi}(x \mid \lambda_i, z^*) = d_1 + \theta d_2 \quad \text{subject to } x \in \Omega \tag{4}$$
where $d_1$ is the length of the projection of $F(x) - z^*$ onto the direction of $\lambda_i$ and $d_2$ is the distance from $F(x)$ to the line through $z^*$ along $\lambda_i$. In (4), $\lambda_i$ is the weight vector as defined in (2), $z^*$ is the ideal reference point as defined in (3) and $\theta > 0$ is a penalty parameter. According to previous studies, with the same set of weight vectors, the PBI approach has some advantages over the TCH approach on MOPs with more than two objectives. These benefits come at a price: the penalty parameter $\theta$ must be tuned properly.
Some of these approaches have been successfully used in solving real-world problems. For example, in [8], Tan et al. used the weighted sum approach to design a pattern reconfigurable array antenna and obtained good results. In [20], Ke et al. used the Tchebycheff approach to solve the multi-objective traveling salesman problem (MTSP) and the multi-objective 0-1 knapsack problem (MOKP).
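To make the three decomposition approaches concrete, the sketch below evaluates each scalarizing function on an objective vector f = F(x) that has already been computed. It uses the standard forms from the MOEA/D literature, which may differ in notation from the equations of this paper; the default theta = 5.0 and all function names are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def weighted_sum(f: Sequence[float], lam: Sequence[float]) -> float:
    """Weighted sum of the objective values."""
    return sum(l * fj for l, fj in zip(lam, f))


def tchebycheff(f: Sequence[float], lam: Sequence[float], z: Sequence[float]) -> float:
    """Largest weighted deviation from the reference point z."""
    return max(l * abs(fj - zj) for l, fj, zj in zip(lam, f, z))


def pbi(f: Sequence[float], lam: Sequence[float], z: Sequence[float],
        theta: float = 5.0) -> float:
    """d1 + theta * d2 (theta = 5.0 is an assumed default, not from the paper)."""
    norm = math.sqrt(sum(l * l for l in lam))
    diff = [fj - zj for fj, zj in zip(f, z)]
    # d1: length of the projection of F(x) - z onto the direction lam.
    d1 = abs(sum(d * l for d, l in zip(diff, lam))) / norm
    # d2: distance from F(x) to the line through z along lam.
    d2 = math.sqrt(sum((d - d1 * l / norm) ** 2 for d, l in zip(diff, lam)))
    return d1 + theta * d2
```

With f = (1, 3), weights (0.5, 0.5) and z = (0, 0), the weighted sum is 2.0 while the Tchebycheff value is 1.5, which illustrates that the latter only looks at the worst weighted objective.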

Differential Evolution's Trial Vectors
DE is an algorithm for continuous optimization problems. To illustrate its basic concepts, suppose that the objective is to minimize $f(\vec{x})$, $\vec{x} = (x_1, x_2, \ldots, x_D) \in \mathbb{R}^D$, and that the feasible solution region is
$$\Omega = \prod_{j=1}^{D} [L_j, U_j] \tag{5}$$
At generation $G = 0$, a random initial population $\{\vec{x}_{i,0} = (x_{i,1,0}, x_{i,2,0}, \ldots, x_{i,D,0}),\ i = 1, \ldots, N\}$ is generated from the feasible solution region, where $N$ is the population size. At each generation $G$, DE creates a mutation vector $\vec{v}_{i,G} = (v_{i,1,G}, v_{i,2,G}, \ldots, v_{i,D,G})$ for each individual $\vec{x}_{i,G}$ of the current population, which is called a target vector. Five widely used DE mutation operators are as follows.
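(1) "DE/rand/1":
$$\vec{v}_{i,G} = \vec{x}_{r1,G} + F(\vec{x}_{r2,G} - \vec{x}_{r3,G}) \tag{6}$$
(2) "DE/best/1":
$$\vec{v}_{i,G} = \vec{x}_{best,G} + F(\vec{x}_{r1,G} - \vec{x}_{r2,G}) \tag{7}$$
(3) "DE/current-to-best/1":
$$\vec{v}_{i,G} = \vec{x}_{i,G} + F_1(\vec{x}_{best,G} - \vec{x}_{i,G}) + F_2(\vec{x}_{r1,G} - \vec{x}_{r2,G}) \tag{8}$$
(4) "DE/best/2":
$$\vec{v}_{i,G} = \vec{x}_{best,G} + F(\vec{x}_{r1,G} - \vec{x}_{r2,G}) + F(\vec{x}_{r3,G} - \vec{x}_{r4,G}) \tag{9}$$
(5) "DE/rand/2":
$$\vec{v}_{i,G} = \vec{x}_{r1,G} + F(\vec{x}_{r2,G} - \vec{x}_{r3,G}) + F(\vec{x}_{r4,G} - \vec{x}_{r5,G}) \tag{10}$$
In the above five equations, r1, r2, r3, r4 and r5 are mutually different indices randomly selected from $\{1, \ldots, N\}$ and different from $i$, and $\vec{x}_{best,G}$ is the best individual in the current population. The parameters $F$, $F_1$ and $F_2$ are called the scaling factors and control the degree of mutation; $F$ and $F_2$ are usually equal.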
After mutation, DE applies a crossover operator to $\vec{v}_{i,G}$ and $\vec{x}_{i,G}$ to generate a trial vector $\vec{u}_{i,G} = (u_{i,1,G}, u_{i,2,G}, \ldots, u_{i,D,G})$, described as follows:
$$u_{i,j,G} = \begin{cases} v_{i,j,G}, & \text{if } rand_j(0, 1) \le Cr \text{ or } j = j_{rand} \\ x_{i,j,G}, & \text{otherwise} \end{cases} \tag{11}$$
where $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, D$, and $rand_j(0, 1)$ is a random number uniformly distributed between zero and one that is generated anew for each $j$. $Cr \in [0, 1]$ is the control parameter for crossover, and $j_{rand}$ is an integer chosen randomly from $[1, D]$. The index $j_{rand}$ guarantees that the trial vector $\vec{u}_{i,G}$ takes at least one component from the mutation vector, so that it differs from the target vector $\vec{x}_{i,G}$. More details about DE and its operators can be found in [22,23,30].
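The mutation and crossover steps described above can be sketched as follows. This is an illustrative Python version of the standard "DE/rand/1" and "DE/current-to-best/1" operators and of binomial crossover; the function names, the list-based representation and the explicit `rng` argument are assumptions made for the example, and indices are zero-based.

```python
import random
from typing import List, Sequence


def rand_1(pop: Sequence[Sequence[float]], i: int, F: float,
           rng: random.Random) -> List[float]:
    """DE/rand/1: base vector plus one scaled difference, all indices != i."""
    r1, r2, r3 = rng.sample([k for k in range(len(pop)) if k != i], 3)
    return [a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])]


def current_to_best_1(pop: Sequence[Sequence[float]], i: int, best: int,
                      F1: float, F2: float, rng: random.Random) -> List[float]:
    """DE/current-to-best/1: move x_i toward the best individual, then perturb."""
    r1, r2 = rng.sample([k for k in range(len(pop)) if k != i], 2)
    return [x + F1 * (xb - x) + F2 * (a - b)
            for x, xb, a, b in zip(pop[i], pop[best], pop[r1], pop[r2])]


def binomial_crossover(target: Sequence[float], mutant: Sequence[float],
                       Cr: float, rng: random.Random) -> List[float]:
    """Take each component from the mutant with probability Cr; position
    j_rand always comes from the mutant."""
    D = len(target)
    j_rand = rng.randrange(D)
    return [mutant[j] if (rng.random() <= Cr or j == j_rand) else target[j]
            for j in range(D)]
```

Passing a seeded `random.Random` instance keeps runs reproducible, which is convenient when comparing operators.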

MOEA/D with Optimal DE Schemes
To solve many-objective problems well, this paper proposes an improved MOEA/D with optimal DE schemes, named MOEA/D-oDE. Our main purpose is to explore and exploit the advantages of different DE schemes. In addition, a combined decomposition approach is introduced and adopted in MOEA/D-oDE. The combined decomposition method is given in Section 3.1, and the optimal DE schemes are given in Section 3.2. The final subsection gives the framework of the proposed algorithm, MOEA/D-oDE, and explains it briefly.

The Combined Decomposition Method
Section 2.2 introduced three decomposition approaches. Because the weighted sum approach and the Tchebycheff approach are easy to extend, we combine them into a single decomposition method, named the weighted sum & Tchebycheff (WST) approach. Here, we use a concept from [31]: for subproblem $i$, let $x_i$ be the current solution in the decision space; the improvement region of $x_i$ is defined as $\{F(x) \mid x \text{ is better than } x_i \text{ for subproblem } i\}$. Figure 1 illustrates, for a two-objective problem with $\lambda_i = (0.5, 0.5)$, the improvement regions of two commonly used decomposition approaches: (a) the weighted sum approach and (b) the weighted Tchebycheff approach. In each sub-figure, the grey area is the improvement region, the square point is the current solution of subproblem $i$ with its direction vector, and the triangle point is its optimal solution, so the goal for the square point is to reach its corresponding triangle point. Equations (2) and (3) and Figure 1 show that the weighted Tchebycheff approach gives priority only to the salient objective, whereas the weighted sum approach attaches importance to all of the objectives. We therefore combine the two approaches so that both the salient objective and the objectives as a whole are considered. The optimization problem of the WST approach is defined as: where $m$ is the number of objectives, $\lambda$ is the weight vector as defined in (2) and $z^* = (z_1^*, \ldots, z_m^*)$ is the reference point as defined in (3). For each $i = 1, \ldots, m$, $z_i^* = \min\{f_i(x) \mid x \in \Omega\}$. The WST method combines the advantages of the weighted sum and the weighted Tchebycheff approach: it not only emphasizes a single objective, but also takes all of the objectives into account. The experimental results in Section 4 support its validity.

The Optimal DE Schemes
As presented in Section 2.3, there are five commonly used DE mutation operators. To make full use of the DE operators and improve the search ability, and because, according to the previous literature [22,32], the best value of $F$ differs from problem to problem, we choose the operators in Equations (6) and (8) together with the scaling factor set {0.3, 0.5} to generate the trial vectors. This yields four trial vectors at a time (Operator (6) with F = 0.3, Operator (6) with F = 0.5, Operator (8) with F = 0.3 and Operator (8) with F = 0.5), from which the best child solution is selected for the subsequent evolution. In addition, for Equation (8), $F_1$ is suggested to be a random number uniformly distributed between zero and 0.5, which benefits convergence. Figure 2 shows a rough flow chart of the method. The experimental results confirm its effectiveness.
Figure 2. Flow chart of the optimal DE schemes: Equations (6) and (8) are combined with the scaling factors F = 0.3 and F = 0.5 to give four candidates (Equation (6) with F = 0.3; Equation (6) with F = 0.5; Equation (8) with F = 0.3; Equation (8) with F = 0.5).
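The selection among four trial vectors can be sketched as below. Several points are assumptions rather than the authors' implementation: Equations (6) and (8) are taken to be "DE/rand/1" and "DE/current-to-best/1"; the children are ranked with a Tchebycheff function as a stand-in, because the WST function is not reproduced here; the crossover rate Cr = 0.9 is a hypothetical default; and all function names are illustrative.

```python
import random
from typing import Callable, List, Sequence


def tchebycheff(f, lam, z):
    """Stand-in scalarizing function used only to rank the children."""
    return max(l * abs(fj - zj) for l, fj, zj in zip(lam, f, z))


def crossover(target, mutant, Cr, rng):
    """Binomial crossover; position j_rand always comes from the mutant."""
    j_rand = rng.randrange(len(target))
    return [m if (rng.random() <= Cr or j == j_rand) else t
            for j, (t, m) in enumerate(zip(target, mutant))]


def candidate_children(pop: Sequence[Sequence[float]], i: int, best: int,
                       rng: random.Random, factors=(0.3, 0.5),
                       Cr: float = 0.9) -> List[List[float]]:
    """Return four trial vectors: two operators times two scaling factors."""
    others = [k for k in range(len(pop)) if k != i]
    children = []
    for F in factors:  # assumed Equation (6): DE/rand/1
        r1, r2, r3 = rng.sample(others, 3)
        mutant = [a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])]
        children.append(crossover(pop[i], mutant, Cr, rng))
    for F in factors:  # assumed Equation (8): DE/current-to-best/1
        F1 = rng.uniform(0.0, 0.5)  # F1 drawn uniformly from [0, 0.5]
        r1, r2 = rng.sample(others, 2)
        mutant = [x + F1 * (xb - x) + F * (a - b)
                  for x, xb, a, b in zip(pop[i], pop[best], pop[r1], pop[r2])]
        children.append(crossover(pop[i], mutant, Cr, rng))
    return children


def best_child(children: Sequence[Sequence[float]],
               objective: Callable[[Sequence[float]], Sequence[float]],
               lam: Sequence[float], z: Sequence[float]) -> Sequence[float]:
    """Pick the child with the smallest scalarized value for this subproblem."""
    return min(children, key=lambda u: tchebycheff(objective(u), lam, z))
```

Note that this scheme evaluates the objective function four times per reproduction step, so the evaluation budget should be counted accordingly.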

Framework of MOEA/D-oDE
MOEA/D-oDE decomposes a many-objective problem into a set of single-objective problems and solves them in parallel. Each decomposed single-objective problem is handled with the optimal DE schemes. The framework of MOEA/D-oDE is based on MOEA/D, with two differences: (1) the WST decomposition approach is used to decompose the many-objective optimization problem; (2) the optimal DE schemes are employed instead of a single DE operator.
At each generation, MOEA/D-oDE with the WST approach maintains the following: (1) a population of $N$ points $x_1, x_2, \ldots, x_N$, where $x_i$ is the current solution to the $i$-th subproblem; (2) the ideal reference point $z = (z_1, z_2, \ldots, z_m)^T$, where $z_j$ is the best value found so far for objective $f_j$, $j = 1, 2, \ldots, m$; (3) an external population (EP), which stores the non-dominated solutions found during the search.
The pseudocode of MOEA/D-oDE is shown in Algorithm 1. In the initialization procedure (Lines 8-17), we first build the neighborhood structure $B(i)$ by computing the Euclidean distances between all pairs of weight vectors and finding the $T$ closest weight vectors to each weight vector. We then generate the initial population, evaluate $F(x_i)$ for all $i = 1, \ldots, N$, and initialize $z$ by $z_j = \min_{1 \le i \le N} f_j(x_i)$ for all $j = 1, \ldots, m$. In the update procedure (Lines 19-27), while the stopping criteria are not satisfied, Step 2.1 performs reproduction using the optimal DE schemes, and Step 2.2 repairs the values that violate the boundary. Next, the new solution $x'$ is evaluated and $z$ is updated with any new minimum. Finally, solutions are replaced whenever $g^{wst}(x' \mid \lambda_j, z) \le g^{wst}(x_j \mid \lambda_j, z)$, and the final results are obtained.

Algorithm 1. MOEA/D-oDE (Steps 2.4-2.6)
Step 2.4) Update of $z$: For each $j = 1, \ldots, m$, if $z_j > f_j(x')$, then set $z_j = f_j(x')$;
Step 2.5) Replacement/update of solutions: For each index $j \in B(i)$, if $g^{wst}(x' \mid \lambda_j, z) \le g^{wst}(x_j \mid \lambda_j, z)$, then set $x_j = x'$ and $F(x_j) = F(x')$;
Step 2.6) Remove from EP all of the vectors dominated by $F(x')$. Add $F(x')$ to EP if no vector in EP dominates $F(x')$.
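Steps 2.4 to 2.6 can be sketched as a single update function. This is an illustrative Python version, not the authors' C++ code: the scalarizing function `g` is passed in as a parameter (the test uses a Tchebycheff function as a stand-in, since the WST function is not reproduced here), and the names `update_step`, `F_vals` and `EP` are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v (minimization)."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def update_step(child: Sequence[float], f_child: Sequence[float], i: int,
                pop: List[List[float]], F_vals: List[Tuple[float, ...]],
                z: List[float], weights: Sequence[Sequence[float]],
                B: Sequence[Sequence[int]], EP: List[Tuple[float, ...]],
                g: Callable[..., float]) -> None:
    """Apply Steps 2.4-2.6 in place for the child of subproblem i."""
    # Step 2.4: update the reference point with any new per-objective minimum.
    for j in range(len(z)):
        if z[j] > f_child[j]:
            z[j] = f_child[j]
    # Step 2.5: replace neighboring solutions that the child improves on.
    for j in B[i]:
        if g(f_child, weights[j], z) <= g(F_vals[j], weights[j], z):
            pop[j] = list(child)
            F_vals[j] = tuple(f_child)
    # Step 2.6: drop EP members dominated by the child, then add the child
    # if no remaining EP member dominates it.
    EP[:] = [e for e in EP if not dominates(f_child, e)]
    if not any(dominates(e, f_child) for e in EP):
        EP.append(tuple(f_child))
```

All containers are modified in place, so the caller keeps a single population, reference point and external population across generations.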

Experimental Study
This section presents the experimental studies that validate the effectiveness of MOEA/D-oDE. We first describe the test instances and the performance metrics used to assess the algorithms. The experiments consist of three parts. The first compares MOEA/D with the WST approach against MOEA/D with the other two approaches (the WS and the Tchebycheff approach). The second compares MOEA/D-oDE with MOEA/D [18] (MOEA/D with the Tchebycheff approach) and NSGA-II [14]. The third compares MOEA/D-oDE with MOEA/DD (a recent algorithm proposed by Li et al. that solves many-objective optimization problems well) [28] and MOEA/D-AWA [26] as a further assessment of MOEA/D-oDE.

Test Instance
In our experiments, we choose four functions from the DTLZ family, whose number of objectives is scalable [33], as benchmark test instances. Two other scalable continuous test instances (F1 and F2), which have complicated PS shapes, are also employed. Because of limited space, the mathematical definitions of all test functions are given in the Appendix, where it can be seen that the DTLZ family differs from F1 and F2. According to the previous literature [28,33], DTLZ1 is linear and multi-modal, DTLZ2 is concave, DTLZ3 is concave and multi-modal, and DTLZ4 is concave and biased. F1 and F2 have complicated PS shapes, which have been shown to cause difficulties for the algorithms. More details of the six continuous test instances can be found in [25]. All of these test instances are minimization problems.

Comparison Algorithms and Performance Metrics
We first compare MOEA/D with the WST approach against MOEA/D with the WS approach and with the Tchebycheff approach, named MOEA/D-WST, MOEA/D-WS and MOEA/D-TCH, respectively, to test the performance of the WST approach. For this part, we use DTLZ1-DTLZ4 with 4-6 objectives. After that, we compare MOEA/D-oDE with two other well-known algorithms, MOEA/D [18] and NSGA-II [14], on all six test instances. Lastly, MOEA/D-oDE is compared with MOEA/DD [28] with DE operators and MOEA/D-AWA [26] on DTLZ1-DTLZ4 with four objectives.
MOEA/D is a well-known decomposition-based MOEA, and MOEA/D-oDE is developed from it, so we run MOEA/D on all six test instances; the Tchebycheff approach is used in MOEA/D for this comparison. NSGA-II, which is based on the Pareto-dominance principle, is the most widely cited algorithm in the MOEA field. Although NSGA-II was proposed a relatively long time ago, it remains highly influential. MOEA/DD and MOEA/D-AWA are both recently proposed, promising algorithms. All four algorithms are thus influential and worth comparing against. The results of MOEA/DD were obtained with "PlatEMO" [34], and we acknowledge the team that developed it. The source code of MOEA/D-AWA can be found on the personal website of its author Qi.
In the experiments, the inverted generational distance (IGD) [35][36][37] metric and the generational distance (GD) [38] are used to assess the quality of an algorithm. The IGD metric reflects both the convergence and the diversity of the results obtained by an algorithm. Let $P^*$ be a set of uniformly distributed Pareto-optimal points on the PF and $P$ be an approximation of the PF; then, the IGD metric is defined as follows [10,37]:
$$IGD(P^*, P) = \frac{\sum_{v \in P^*} d(v, P)}{|P^*|}$$
where $d(v, P)$ is the minimum Euclidean distance between $v$ and the points in $P$ and $|P^*|$ is the cardinality of $P^*$. If $P^*$ is large enough to represent the PF well, $IGD(P^*, P)$ measures both the diversity and the convergence of $P$: to obtain a small value of $IGD(P^*, P)$, $P$ must be very close to the true PF and must not miss any part of it. Thus, the smaller the IGD metric value is, the better the algorithm performs.
Generational distance (GD) [38]: Let $P^*$ be a set of uniformly distributed points in the objective space along the PF and $P$ be an approximation set to the PF; the generational distance from $P$ to $P^*$ is defined as: where $m$ is the number of objectives and $d(v, P^*)$ is the minimum Euclidean distance between $v$ and the points in $P^*$. GD indicates how far the approximate Pareto-optimal front obtained by an algorithm is from the ideal Pareto-optimal front. The smaller the value of $GD(P, P^*)$, the better the convergence of the approximate front, that is, the closer it is to the ideal Pareto-optimal front. In the best case, $GD(P, P^*) = 0$, which means that all solutions in $P$ are Pareto-optimal. In our experiments, the set $P^*$ used to calculate the IGD metric contains 1000 uniformly distributed points for four-objective problems, 2000 uniformly distributed points for five-objective problems and 3003 uniformly distributed points for six-objective problems.
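Both metrics reduce to nearest-neighbor distances between two point sets, as the following Python sketch shows. The IGD function averages the distances from the reference points to the approximation set. For GD, the sketch assumes the commonly used form (the square root of the sum of squared distances, divided by the size of the approximation set), which may differ from the exact definition used in this paper; the function names are illustrative.

```python
import math
from typing import Sequence

Point = Sequence[float]


def _min_dist(v: Point, points: Sequence[Point]) -> float:
    """Minimum Euclidean distance from v to any point in the set."""
    return min(math.dist(v, p) for p in points)


def igd(ref: Sequence[Point], approx: Sequence[Point]) -> float:
    """Average distance from each reference point to the approximation set."""
    return sum(_min_dist(v, approx) for v in ref) / len(ref)


def gd(approx: Sequence[Point], ref: Sequence[Point]) -> float:
    """Assumed common form: sqrt(sum of squared distances) / |approx|."""
    return math.sqrt(sum(_min_dist(v, ref) ** 2 for v in approx)) / len(approx)
```

Because IGD iterates over the reference set, an approximation that misses part of the front is penalized, whereas GD only measures how close the obtained points are to the front.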

Parameter Setting
Almost all of these algorithms are implemented in C++, and the implementation environment is Visual C++ 6.0 on a personal computer (Intel Core 3.8 GHz CPU and 8.0 GB RAM). In MOEA/D-oDE, we apply the DE trial vector operators and polynomial mutation on all of the test instances. For the compared MOEA/D [18,24], the simulated binary crossover (SBX) [4] operator is employed for the DTLZ problems and the DE crossover operator for F1 and F2; polynomial mutation is used in MOEA/D for all of the test problems. The evolution operators used in NSGA-II are the same as in MOEA/D. The other parameters used in MOEA/D and NSGA-II are shown in Table 1, where $n$ is the number of variables. The parameters of MOEA/DD and MOEA/D-AWA are the same as those used in MOEA/D, and the probability of choosing parents locally in MOEA/DD is set to 0.9. In MOEA/D, the number of objectives $m$ and the parameter $H$ together determine the size of the set $\{\lambda_1, \ldots, \lambda_N\}$, where the population size $N$ equals the number of subproblems. Each component of a weight vector takes a value from $\{0/H, 1/H, \ldots, H/H\}$. Therefore, the population size, that is, the number of weight vectors, is $N = C_{H+m-1}^{m-1}$, where $m$ is the number of objectives. For NSGA-II, the number of non-dominated solutions increases as the number of objectives increases [39]. For fair comparison, we set $H$ to 17, 13 and 10 for the test instances with 4, 5 and 6 objectives, respectively, in MOEA/D. The population size $N$ is therefore 1140, 2380 and 3003 for the 4-objective, 5-objective and 6-objective continuous test problems, respectively, in all three algorithms. The neighborhood size is set to $T = 20$ for all of the test instances in MOEA/D-oDE and MOEA/D. Each compared algorithm is run 20 times independently on each instance. The three algorithms stop when the number of function evaluations reaches the given upper limit, which is set to 250 × N for four- and five-objective problems and 500 × N for six-objective problems.
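The population sizes quoted above follow from enumerating all weight vectors whose components are multiples of 1/H and sum to one. The Python sketch below generates this set and can be used to check the counts; the function name `simplex_lattice` is illustrative.

```python
from math import comb
from typing import Iterator, List, Tuple


def simplex_lattice(m: int, H: int) -> List[Tuple[float, ...]]:
    """All weight vectors with components in {0/H, ..., H/H} that sum to one."""

    def compositions(parts: int, total: int) -> Iterator[Tuple[int, ...]]:
        # Enumerate non-negative integer tuples of length `parts` summing to `total`.
        if parts == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in compositions(parts - 1, total - first):
                yield (first,) + rest

    return [tuple(k / H for k in c) for c in compositions(m, H)]


def lattice_size(m: int, H: int) -> int:
    """Closed-form count of the vectors: C(H + m - 1, m - 1)."""
    return comb(H + m - 1, m - 1)
```

For (m, H) = (4, 17), (5, 13) and (6, 10), the enumeration yields 1140, 2380 and 3003 vectors, matching the population sizes stated in the text.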

Experimental Results and Analysis
This subsection presents the experimental results and analyses in three parts.

MOEA/D-WST Compared with MOEA/D-WS and MOEA/D-TCH
Since the proposed WST approach is a combination of the weighted sum and the Tchebycheff approaches, we have tested MOEA/D-WST, MOEA/D-WS and MOEA/D-TCH on DTLZ1-DTLZ4 with 4-6 objectives. These three algorithms run under the same conditions and differ only in the decomposition approach. Table 2 shows the IGD metric values obtained by the three algorithms, including the mean, the best and the standard deviation of the 20 IGD metric results. Table 3 shows the corresponding GD metric values, again including the mean, the best and the standard deviation over the 20 runs. These two tables show that MOEA/D-WST performs better on both the IGD and the GD metrics for most of the instances. This is because the WST approach has the features of both the weighted sum approach and the Tchebycheff approach. Therefore, MOEA/D with the WST approach has certain advantages over MOEA/D with the weighted sum approach and MOEA/D with the Tchebycheff approach.
All compared algorithms are run 20 times independently on each test problem in our experiments. Table 4 shows the IGD metric values obtained by MOEA/D-oDE, MOEA/D and NSGA-II, including the mean, the best and the standard deviation of the 20 IGD metric results. From the table, we can see that MOEA/D-oDE performs better than the other two algorithms except on the DTLZ4 problem with six objectives, where NSGA-II performs the best. This shows that decomposition-based methods are not necessarily better than dominance-based methods. On the other test instances, MOEA/D-oDE performs better or much better than MOEA/D and NSGA-II. Meanwhile, NSGA-II performs better than MOEA/D on the DTLZ function family with 4-5 objectives. However, on F1 and F2, MOEA/D always performs better than NSGA-II, regardless of the number of objectives. As the number of objectives increases, MOEA/D overtakes NSGA-II on DTLZ1 and DTLZ3 with six objectives. From this, we can see that MOEA/D with DE operators has unique advantages on problems with complicated PS shapes, but on problems such as the DTLZ function family, its superiority is not very obvious. Nevertheless, we have used the DE operators for crossover in MOEA/D-oDE on all of the test problems. In terms of the IGD metric values, the results obtained by MOEA/D-oDE are much better than those obtained by MOEA/D and NSGA-II for almost all of these test problems, and its advantages are especially clear on F1 and F2. We can therefore conclude that MOEA/D-oDE can handle many-objective optimization problems well, whether or not the problems have complicated PS shapes. This also indicates that MOEA/D-oDE is robust in solving these problems.
On the other hand, Table 5 shows the GD metric values obtained by the three algorithms, including the mean, the best and the standard deviation over the 20 runs. From the table, we can see that the mean GD values of MOEA/D-oDE are superior to those of MOEA/D and NSGA-II on most test instances, and almost all of the best GD values belong to MOEA/D-oDE. However, the stability of MOEA/D-oDE on this metric is not very good. Overall, this table leads to the same conclusions as Table 4.
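To make the difference between the two metrics concrete, the following is a minimal sketch of IGD and GD, assuming the common mean-nearest-distance forms with Euclidean distance. The paper's own definitions appear where the metrics are introduced and may differ in normalization, and the function names here are hypothetical. IGD averages over the reference points, so it penalizes poor coverage of the Pareto front, whereas GD averages over the obtained points and only measures closeness to the front.

```python
import math


def _min_distance(point, others):
    """Euclidean distance from `point` to its nearest neighbour in `others`."""
    return min(math.dist(point, q) for q in others)


def igd(reference_front, approximation):
    """Mean distance from each reference point to the nearest obtained point."""
    total = sum(_min_distance(r, approximation) for r in reference_front)
    return total / len(reference_front)


def gd(reference_front, approximation):
    """Mean distance from each obtained point to the nearest reference point.

    Assumption: this is one common GD variant; others use a root-mean-square.
    """
    total = sum(_min_distance(a, reference_front) for a in approximation)
    return total / len(approximation)


if __name__ == "__main__":
    reference = [(0.0, 1.0), (1.0, 0.0)]
    obtained = [(0.0, 1.0)]  # lies on the front but covers only one end
    print(gd(reference, obtained))   # 0.0: every obtained point is on the front
    print(igd(reference, obtained))  # > 0: the point (1, 0) is not covered
```

In this toy case GD is zero while IGD is not, which illustrates why an algorithm can rank differently on the two metrics.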
The evolution of the mean IGD metric values over 20 independent runs for the three algorithms is plotted in Figures 3, 4 and 5 for the four-, five- and six-objective test problems, respectively. As shown in these figures, MOEA/D-oDE converges faster than the other two algorithms. However, many final IGD metric values obtained by NSGA-II are lower than those obtained by MOEA/D. NSGA-II does not converge faster than the other two algorithms in most instances, and its runtime is the longest of the three. For NSGA-II on F1 and F2 with 5-6 objectives, the IGD metric values increase rather than decrease in the later stage of evolution. This is further evidence that NSGA-II is not good at solving many-objective problems with complicated PS shapes.
Figures 6, 7 and 8 show box plots of the IGD metric values of the three algorithms on the continuous problems with 4-6 objectives, based on 20 independent runs. A box plot reflects the stability of a set of data. From Table 4 and these three box plots, we can conclude that MOEA/D-oDE performs more stably, especially on the F1 and F2 test instances. In contrast, although NSGA-II outperforms MOEA/D on some instances, its stability is usually not very good. The reason may be that the original NSGA-II cannot handle problems with multi-modality and bias very well. MOEA/D is stable on some test instances. In some of the box plots, such as Figures 6 and 8, the '+' marks on some test instances denote outliers. MOEA/D and NSGA-II have some larger outliers, which means that these two algorithms may perform much worse in one of the 20 runs. In other words, to some extent, MOEA/D and NSGA-II do not behave stably on some instances, such as DTLZ4 with four objectives. Compared to them, the performance of MOEA/D-oDE is more stable.
Tables 6 and 7 show the IGD metric and GD metric values obtained by MOEA/D-oDE, MOEA/DD and MOEA/D-AWA, including the mean, the best and the standard deviation of the 20 results. In Table 6, the IGD values of MOEA/D-oDE show some advantages over those of MOEA/DD and MOEA/D-AWA, but in Table 7, MOEA/D-oDE achieves the best GD values only on DTLZ1 and DTLZ3. We can conclude that MOEA/D-oDE has certain advantages in solving many-objective optimization problems, and that MOEA/DD is also good at solving these kinds of problems. In contrast, MOEA/D-AWA could not obtain very good solutions.

Parameters' Sensitivity Analysis
Since the scaling factor F in DE is an important parameter, many previous attempts have been made to improve the convergence speed of DE by tuning F [22,32]. Because we selected the parameter set {0.3, 0.5} for the experiments, we test other parameter combinations for comparison. Here, we use the test instance F1 with four objectives for the comparative experiments.
We choose six further parameter combinations for the comparison study: {0.3, 0.4}, {0.4, 0.5}, {0.4, 0.6}, {0.5, 0.6}, {0.6, 0.9} and {0.9, 1}. Table 8 shows the IGD metric results of all combinations on F1 with four objectives. Among the seven groups (these six plus the original {0.3, 0.5}), the best IGD metric value is obtained by {0.3, 0.4} and the worst by {0.9, 1}. Meanwhile, all of these IGD metric values are stable. These experiments suggest that, when the value of F changes gradually, it has no great influence on the results of the algorithm on F1 with four objectives. This aspect needs more attention in future work.
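To illustrate what a pair of scaling factors does, here is a minimal sketch of building one DE mutant per value of F and keeping the best one. It rests on several assumptions that are not taken from the paper: the DE/rand/1 mutation form stands in for the paper's DE operators, `scalar_fitness` is a hypothetical placeholder for the subproblem's decomposition function, and crossover, polynomial mutation and boundary repair are omitted.

```python
def de_rand_1(base, a, b, F):
    """DE/rand/1 mutant (assumed operator): base + F * (a - b), component-wise."""
    return [x + F * (y - z) for x, y, z in zip(base, a, b)]


def best_child(base, a, b, scaling_factors, scalar_fitness):
    """Build one mutant per scaling factor and keep the one with the lowest fitness.

    `scalar_fitness` is a hypothetical callable mapping a decision vector to the
    scalar value being minimized for the current subproblem.
    """
    candidates = [de_rand_1(base, a, b, F) for F in scaling_factors]
    return min(candidates, key=scalar_fitness)


if __name__ == "__main__":
    def sphere(x):
        # Toy single-objective stand-in for a subproblem's scalarized objective.
        return sum(v * v for v in x)

    base, a, b = [0.0, 0.0], [1.0, 1.0], [0.0, 0.0]
    print(best_child(base, a, b, (0.3, 0.5), sphere))  # [0.3, 0.3]
```

A larger F takes a longer step along the difference vector a - b, so nearby values of F yield nearby candidates. This is consistent with the observation above that gradual changes in F have little effect on the final IGD values.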

Conclusions
In order to solve many-objective optimization problems well, an improved MOEA/D with optimal DE schemes, called MOEA/D-oDE, is proposed in this paper. Under the framework of MOEA/D, MOEA/D-oDE decomposes a many-objective problem into a set of single-objective subproblems, and each subproblem is handled with evolutionary operators. All of these subproblems can be solved in parallel.
MOEA/D-oDE differs from MOEA/D in two respects. First, a new combined decomposition method, named the WST approach, is introduced and employed in MOEA/D-oDE. The WST approach combines the advantages of the weighted sum and the weighted Tchebycheff approaches. It considers not only a single aspect, but also the overall direction of the objectives. Second, we design an optimal DE scheme, which combines two DE crossover operators and two scaling factors F to generate and select the best child solution for the a posteriori computing in each subproblem. In the experimental studies, MOEA/D with the WST method is first compared with MOEA/D with the weighted sum approach and with the Tchebycheff approach on DTLZ1-DTLZ4 with 4-6 objectives, which demonstrates the effectiveness of the new decomposition method. Then, MOEA/D-oDE is tested on three sets of six continuous test instances with 4-6 objectives, and the results obtained by MOEA/D-oDE are compared with those obtained by MOEA/D and NSGA-II. The experimental results indicate that MOEA/D-oDE outperforms MOEA/D and NSGA-II in most situations on problems with a higher number of objectives. Finally, a comparison with MOEA/DD and MOEA/D-AWA on DTLZ1-DTLZ4 with four objectives is made to assess the performance of MOEA/D-oDE from another point of view. Additionally, several sets of parameter sensitivity experiments were carried out. In short, to some extent, MOEA/D-oDE can solve many-objective optimization problems well.
In the future, we plan to study how to make the DE trial vectors obtain the best child solutions dynamically. Additionally, multi-objective discrete problems, such as the multi-objective travelling salesman problem (TSP), are also among our research targets. It is also worthwhile to investigate new methods for solving many-objective optimization problems.

Figure 1. Illustration of the improvement regions for the (a) weighted sum approach and the (b) weighted Tchebycheff approach.

Step 3: Stopping criteria
If the stopping criteria are satisfied, then stop and output EP. Otherwise, go to Step 2.

Figure 3. Evolution of the mean of IGD metric values for four-objective test problems.

Figure 4. Evolution of the mean of IGD metric values for five-objective test problems.

Figure 5. Evolution of the mean of IGD metric values for six-objective test problems.

Figure 8. Box plots of the IGD metric values based on 20 independent runs for six-objective test problems.

Table 4. Statistics of the IGD metric obtained by MOEA/D-oDE, MOEA/D and NSGA-II on the test problems.

Table 5. Statistics of the GD metric obtained by MOEA/D-oDE, MOEA/D and NSGA-II on the test problems.

Table 8. IGD metric value statistics for F1 with 4 objectives on different parameters.