A New Hybrid Evolutionary Algorithm for the Treatment of Equality Constrained MOPs

Multi-objective evolutionary algorithms are widely used by researchers and practitioners to solve multi-objective optimization problems (MOPs), since they require minimal assumptions and are capable of computing a finite-size approximation of the entire solution set in one run of the algorithm. So far, however, the adequate treatment of equality constraints has played a minor role. Equality constraints are particular since they typically reduce the dimension of the search space, which causes problems for stochastic search algorithms such as evolutionary strategies. In this paper, we show that multi-objective evolutionary algorithms hybridized with continuation-like techniques lead to fast and reliable numerical solvers. For this, we first propose three new problems with different characteristics that are indeed hard to solve by evolutionary algorithms. Next, we develop a variant of NSGA-II coupled with a continuation method. We present numerical results on several equality-constrained MOPs to show that the resulting method is highly competitive with state-of-the-art evolutionary algorithms.


Introduction
In many applications, one is faced with the problem that several objectives have to be optimized concurrently. One main characteristic of such multi-objective optimization problems (MOPs) is that their solution sets typically form objects of dimension k − 1, k being the number of objectives in the problem. Multi-objective evolutionary algorithms (MOEAs, e.g., [1,2]) have caught the interest of many researchers for the treatment of such problems, as they have shown their efficiency both on academic and real-world problems. The population-based approach of MOEAs allows computing finite-size approximations of the entire solution set in one single run of the algorithm. Further, these methods are of global nature and require just minimal assumptions on the model (e.g., they can work without gradient information).
In this work, we argue that a hybridization of the recently proposed Pareto Tracer (PT, [11]) with MOEAs leads to a new solver for continuous constrained MOPs that is fast and reliable. To this end, we first propose and discuss three equality constrained MOPs to complement the comparisons. Next, we propose ε-NSGA-II/PT, a hybrid of a variant of the well-known and widely used NSGA-II and PT. PT is currently one of the fastest multi-objective continuation methods: it allows us to trace solutions along the Pareto set/front, can be applied in higher dimensions, and reliably handles constraints. However, as all continuation methods, it is of local nature. We demonstrate the strength of the novel approach by showing results on selected benchmark problems and comparisons against some state-of-the-art MOEAs. A preliminary study of this work can be found in [12], which is restricted to the treatment of bi-objective problems and in which the discussion and comparison of the novel method are less extensive.
This paper is structured as follows: in Section 2, we shortly recall some required background. In Section 3, we propose three new equality constrained MOPs that we will use for our comparisons. In Section 4, we propose the hybrid evolutionary algorithm ε-NSGA-II/PT. In Section 5, we present results on some benchmark problems, including a comparison to the performance of related algorithms. In Section 6, we finally conclude our work and discuss possible further steps that can be made in this line of research.

Multi-Objective Optimization Problem (MOP)
In the sequel, we will consider continuous MOPs where F(x) = (f_1(x), …, f_k(x))^T defines the objective functions. Q denotes the domain of (1), Q := {x ∈ R^n : h_i(x) = 0, i = 1, …, p, and g_j(x) ≤ 0, j = 1, …, q}. We say that y ∈ R^n is dominated by x ∈ R^n (in short, x ≺ y) if f_i(x) ≤ f_i(y) for i = 1, …, k, and f_j(x) < f_j(y) for at least one j ∈ {1, …, k}. Otherwise, we say that y is non-dominated by x. A point x ∈ Q is said to be (Pareto) optimal, or simply optimal, if there exists no y ∈ Q with y ≺ x. The Pareto set P_Q is the set of all optimal points, and its image F(P_Q) is called the Pareto front. In order to apply PT, we have to assume that the gradients of all functions (objectives and constraints) can at least be approximated.
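The dominance relation above translates directly into code. The following sketch (the function names are ours, not from the paper) checks Pareto dominance between two objective vectors and filters the non-dominated rows of a finite set:

```python
import numpy as np

def dominates(fx, fy):
    """True if objective vector fx Pareto-dominates fy, i.e., fx <= fy
    componentwise with at least one strict inequality."""
    fx, fy = np.asarray(fx, float), np.asarray(fy, float)
    return bool(np.all(fx <= fy) and np.any(fx < fy))

def nondominated(F):
    """Indices of the non-dominated rows of an (N, k) objective matrix."""
    F = np.asarray(F, float)
    return [i for i in range(len(F))
            if not any(dominates(F[j], F[i]) for j in range(len(F)) if j != i)]
```

For example, in a bi-objective setting the vector (1, 2) dominates (2, 2) but is non-dominated with respect to (2, 1).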

Related Work
The development of solution tools for the constrained MOPs described in the previous subsection has recently generated interest, mainly within the evolutionary computation community. This subsection reviews previous works related to this field, focusing on pure evolutionary techniques as well as on hybrid strategies.
Regarding MOEAs, two constraint-handling mechanisms, classically used in the single-objective optimization framework, are adapted in [13] to MOEA/D [14]. The first one is stochastic ranking (SR, [15]), which accounts for the need of directing the search according to both feasibility and objective value. Therefore, when comparing two solutions where at least one of them is infeasible, the comparison criterion is the objective function value with probability p_f, while, with probability 1 − p_f, individuals are distinguished through the overall constraint violation φ (where, for any solution x, φ(x) = Σ_{i=1}^{p} |h_i(x)| + Σ_{j=1}^{q} max{0, g_j(x)}). On the other hand, the constraint-domination principle (CDP) is a multi-objective extension [16] of Deb's feasibility rules, where feasible solutions are compared according to dominance. The computational experiments, performed on the CTP series [17] and CF series [18], demonstrate that MOEA/D-CDP consistently outperforms MOEA/D-SR concerning hypervolume, inverted generational distance (IGD) and set coverage. It also performs reasonably well when compared with IDEA [19] and DMOEA-DD [20].
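The overall constraint violation φ and the CDP comparison can be sketched as follows (a minimal illustration; the helper names are ours):

```python
def violation(h_vals, g_vals):
    """Overall constraint violation phi(x): sum of |h_i(x)| over the
    equality constraints plus sum of max{0, g_j(x)} over the
    inequality constraints."""
    return sum(abs(h) for h in h_vals) + sum(max(0.0, g) for g in g_vals)

def cdp_prefers(fx, phix, fy, phiy):
    """Constraint-domination principle: x is preferred over y if x is
    feasible and y is not, if both are infeasible and x violates less,
    or if both are feasible and x Pareto-dominates y."""
    if phix == 0.0 and phiy > 0.0:
        return True
    if phix > 0.0 and phiy > 0.0:
        return phix < phiy
    if phix == 0.0 and phiy == 0.0:
        return (all(a <= b for a, b in zip(fx, fy))
                and any(a < b for a, b in zip(fx, fy)))
    return False
```

Note how, under CDP, a feasible solution always wins against an infeasible one regardless of its objective values; this is precisely the behavior that causes the diversity issues on equality constrained problems discussed later.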
Besides, the Infeasibility Driven Evolutionary Algorithm (IDEA) [19] is a modification of the classical CDP-based NSGA-II [16] that enforces the participation of infeasible solutions through a parameter α, which represents the fraction of the current population allocated for those solutions. During the selection step, the pool combining parent and offspring populations is first divided into two (feasible and infeasible) sets. Then, non-dominated ranking is applied to both these sets, considering a function of the constraint violations as an additional objective for the infeasible set. Then, if N is the population size, the α · N infeasible solutions with the best-ranking values, as well as the (1 − α) · N feasible solutions with the best ranks, are included into the next population. Note that the constraint violation function used as an additional objective for infeasible solutions is computed as the sum of the ranking of the solutions, sorted in increasing order of the magnitude of constraint violation for each constraint (instead of the number of violated constraints formerly used [21]). The experimental results conducted over some test functions of the CTP-series [17] demonstrate that IDEA consistently outperforms CDP-NSGA-II.
Another adaptation of MOEA/D for handling constrained MOPs is introduced in [22] where, for each solution, a modified function for the overall constraint violation accounts for the number of active constraints, in addition to the simple constraint violation. The mean value of this modified function over the population is weighted by the ratio of feasible individuals in the population, in order to produce a threshold on the allowed amount of constraint violation. Solutions that are within this threshold are considered as feasible and compared in terms of their objective values. Furthermore, a gradient-based local search is periodically invoked in order to repair infeasible solutions. The resulting algorithm, tested on some of the CTP-series test functions [17], performs similarly to, or better than, NSGA-II with CDP.
In [23], the ε-constraint technique originally developed for single-objective optimization is extended to the solution of MOPs. This strategy, proposed in [24], consists in relaxing the tolerance level on constraints up to a value ε. Thus, when two solutions have an overall constraint violation lower than ε, they are both considered as feasible and compared in terms of their objective values. In [24], the value of ε is monotonically decreased according to a polynomial function, until some generation T_c. From then on, ε is set to 0 in order to narrow the search to the feasible space. In [23], the authors allow increasing the ε level when the ratio of feasible solutions is greater than a threshold value, to promote exploration. This strategy, embedded in MOEA/D, is compared with MOEA/D-CDP [13] and with the original ε-constraint mechanism (decreasing pattern) over a set of nine constrained MOPs introduced earlier by the same authors [25]. Furthermore, this strategy is also compared with classical MOEAs (either dominance- or decomposition-based) in a later work [26], over the CTP [17] and CF series [18], using IGD as a performance indicator. In every case, the MOEA/D-IEpsilon algorithm outperforms all its contenders, except IDEA [19], which obtains similar performance levels.
More recently, the same authors developed a two-stage procedure (Push and Pull Search, PPS), which first solves the unconstrained MOP (solutions are pushed towards the unconstrained Pareto front) and, in a second stage, includes the constraints to modify the first (unconstrained) approximation and identify the constrained Pareto front (solutions are pulled from infeasible regions towards the feasible space). The switching criterion between both phases is based on the stagnation of the identified ideal and nadir points. During the "push stage", the canonical MOEA/D (with the Tchebycheff scalarizing function) is employed, while a modified ε-constraint technique is applied in the second one (still with MOEA/D) to find feasible solutions. A specific decreasing scheme for the ε level is adopted, where exponential or polynomial (as in [24]) decrease can be used, depending on the feasibility ratio. Tested over a 14-function benchmark earlier proposed by the same authors [25], the resulting PPS-MOEA/D outperforms some classical MOEAs quoted in this section, but requires tuning many parameters.
As the last example of MOEA-based solution procedures, an innovative idea is introduced in [27], where MOEA/D is modified in such a way that two solutions are assigned to each weight vector. The aim is to have one individual on each side of the feasibility boundary (one feasible and one infeasible), in order to focus the search on this region, where the Pareto-optimal solutions might lie. The consequences of this working mode are: (i) the doubled size of the neighborhood of each weight vector for offspring generation (since each neighboring weight vector has two associated solutions); (ii) for solution replacement, the created offspring is now compared to two individuals, in terms of both the scalarizing function used within MOEA/D and the overall constraint violation. In this bi-objective space, dominance is used to select two surviving individuals among the three contenders. If all three are non-dominated, the one with the larger constraint violation is discarded, while if one solution dominates the two others, the former is the only one to survive. The algorithm is successfully compared with MOEA/D-CDP over several functions of the C-DTLZ test suite [28]. However, for the CTP and CF series [17,18], this Dual-grid MOEA/D is outperformed by IDEA [19] or MOEA/D-IEpsilon [26].
Besides, as for single-objective optimization, hybrid strategies have been adapted for solving MOPs in recent years. In general, these so-called memetic algorithms combine a global search engine (a MOEA) with some local search technique based on exact algorithms. In this framework, proposals differ from one another according to:

• The kind of local search technique used, which may be gradient-based (quasi-Newton in [29,30], or sequential quadratic programming in [31]) or direct search for nonlinear problems (Nelder and Mead's algorithm in [32]).
• The problem reformulation on which local search is applied, which may be based on the ε-constraint [33] or a scalarization of the MOP [31].
• The hybridization scheme, which can consist of seeding the initial population of the MOEA [34], interleaving global and local search steps by applying local search to some selected individuals of the population [33] or periodically (every t generations) [35], or using the non-dominated solutions obtained by the exact algorithm to reconstruct the whole Pareto front [36].
However, to the best of our knowledge, most of these hybrid strategies were applied to unconstrained MOPs through classical test suites (ZDT, DTLZ, WFG) and there is almost no proposal for dealing with constraints, particularly equality constraints.
There already exist some strategies for dealing with equality constraints: for instance, if all equalities are linear, one can use orthogonal projections in order to obtain feasible points near a given candidate solution [8]. For vehicle routing problems, several repair mechanisms can be found in [37-39], and in [4,40] such mechanisms can be found for portfolio selection problems. Finally, [41] proposes a repair mechanism that makes use of first-order Taylor approximations of the constraints.

Test Suites for Constrained MOPs
Benchmark problems are important to assess the performance of solution techniques. In the literature, some such test suites can be found for the case of constrained MOPs. Most of these problems, however, contain only box constraints. Established test suites whose domains are more complex are the CTP problems [17] and the CF problems [18]. More recently, the C-DTLZ [28] and the LIRCMOP problems [25] have been proposed. Remarkably, none of these test problems contains an equality constraint. Thus, the common operating mode proposed by most authors to handle equalities (i.e., splitting the equality into two inequalities) has never been seriously tested. Indeed, this methodology does not work in practice, since it is very unlikely that a solution respects both inequalities. As a consequence, the first feasible solution identified attracts the rest of the population, causing drastic diversity losses. This is why the specific treatment of equality constraints represents a relevant issue in its own right, as shown by the present study.
In [42], the CZDT functions are proposed, which contain equality constraints and are derived from the well-known ZDT functions [43]. We will use this test suite for our comparisons. The particularity of this suite, however, is that the Pareto sets of the constrained problems are identical to the Pareto sets of the respective unconstrained problems. To further test the ability of MOEAs to handle constraints, we propose in this paper three new equality constrained MOPs in which the inclusion of the equality constraint(s) has an influence on the location of the Pareto sets.

Pareto Tracer (PT)
PT is a continuation method that is capable of performing a movement along the set of Karush-Kuhn-Tucker (KKT) points of a given MOP, starting from an initial KKT point [11]. The algorithm can handle both equality and inequality constraints and is in principle applicable to any number of objectives. The main steps of PT for equality constrained MOPs are briefly described in Appendix A for the convenience of the reader.

Proposed Test Problems
Here we propose two bi-objective test problems, Eq1-ZDT1 and Eq2-ZDT1, and one three-objective problem, Eq-Quad. All problems are scalable in the number of decision variables, and for all problems the inclusion of the equality constraint(s) has an influence on the location of the Pareto set.

Eq1-ZDT1
The original ZDT1 is a bi-objective problem with box constraints that can be defined for an arbitrary number n of decision variables. The proposed Eq1-ZDT1 problem (2) is also a bi-objective problem that is scalable in the number of variables; it replaces the box constraints by an equality constraint (together with the implicit inequality constraint x_1 ≥ 0 so that f_2 is defined). The constraint (3) of this problem defines a kind of "hyper-cylinder": the variables x_1 and x_2 are placed on a circle, while the remaining variables x_i, i = 3, …, n, can take any value (see Figure 1).
In the following, we provide the Pareto set of Eq1-ZDT1. For this, we need the set P_h ⊂ Q defined in (5).

Theorem 1. Let P_h be defined as in (5) and n = 30. Then the subset P_Eq1 ⊂ P_h given by (6), where γ ≈ 0.977336, is the Pareto set of Eq1-ZDT1.

Proof.
(a) First, we prove that there exists no y ∈ Q \ P_h such that y ≺ x, where x ∈ P_h.
1. For the first objective, we have that f_1(y) = f_1(x) = y_1.
2. For the second objective, we obtain f_2(y) > f_2(x), which contradicts the hypothesis. Thus, for every y ∈ Q \ P_h, there exists x ∈ P_h with x ≺ y.
(b) Now we show that the points of P_Eq1 are not dominated by each other and that they dominate all the points in the set P_h \ P_Eq1.
Let x, x′ ∈ P_h such that x_1 = x′_1 and |x_2| < |x′_2|. Notice that we can express x_2 in terms of x_1 as in Equation (10). Computing the derivative of Equation (10), we find that the derivative of f_2 has only one root; to the left of this root, f_2 is monotonically decreasing, and consequently, the points in P_Eq1 are not dominated by each other.
Finally, by (a) and (b), we have that P_Eq1 is the Pareto set of Eq1-ZDT1.
To obtain γ for the formulation of the Pareto set, we needed to consider f_2, which depends on n. The proof is analogous for other values of n, with a changing value of γ. Some of these values can be found in Table A1 of Appendix D.

Eq2-ZDT1
By adding box constraints to Eq1-ZDT1, we can define the Eq2-ZDT1 problem as follows, where h(x) and g(x) are defined as in (3) and (4), respectively. Next, we provide the analytical Pareto set of Eq2-ZDT1.
Theorem 2. For n = 30 and x ∈ R^n, the Pareto set of the Eq2-ZDT1 problem (see Figure 2) is given as stated below.

Proof.
(a) Let P_h be defined as in (5) and P_Eq1 as in (6); then, the first part of this proof is analogous to the previous analysis for Eq1-ZDT1.
As a second step, we need to remove all the points in P_Eq1 that do not satisfy the box constraints. In particular, since x_i = 0, i = 3, …, n, we focus on x_1 and x_2. For x_1, x_2 ∈ P_h, some values of x_2 do not satisfy the lower bound.
We can express x_1 in terms of x_2; thus, for x_2 = 0, we can find the values of x_1 that define I_1 and I_3. After removing the non-feasible points from P_Eq1, we have a gap in the Pareto set/front. Now, notice that some points x ∈ P_h fill this gap; the corresponding computation yields the value η. Finally, by (a) and (b), we have that P_Eq2 is the Pareto set of Eq2-ZDT1.
As we can observe, the values of γ and η depend on n (see Theorems 1 and 2). We refer to Appendix D for these values for other dimensions of the decision variable space. See Figure 2 for the Pareto set and front of Eq2-ZDT1.

Eq-Quad
Finally, we propose a modification of a problem taken from [11], which has three quadratic objectives and two equality constraints, where x ∈ R^n and k = 3. Figure 3 shows the constraints and the Pareto set for n = 3. As can be seen, the Pareto set consists of two connected components that can both be expressed by curves (and which are hence one-dimensional).

Proposed Algorithm (ε-NSGA-II/PT)
This section is devoted to the description of the hybrid algorithm proposed in this work. As mentioned above, the Pareto Tracer is a continuation strategy that is able to efficiently perform movements along the Pareto front of a MOP. However, this reconstruction process is carried out locally, which means that PT must be provided with a reduced set of relevant approximate solutions, i.e., solutions with reasonably good convergence and dispersion over the front. Therefore, a multi-objective evolutionary algorithm based on a modified implementation of NSGA-II [16] is developed here as the first stage of the hybrid algorithm, with the aim of producing this first set of promising solutions. The latter subsequently serves as the input of PT, which refines it to produce the final approximation of the Pareto front/set. The two stages of the resulting ε-NSGA-II/PT are presented in detail in what follows.

First Stage: Rough Approximation via ε-NSGA-II
The first stage of the hybrid algorithm presented in this work consists in determining a rough approximation of the Pareto front via a MOEA. Each approximate solution will subsequently feed the Pareto Tracer, which generates local reconstructions of the real Pareto front and combines them to finally obtain the whole Pareto set. Therefore, the MOEA used in the first stage should have the following characteristics:

• The number of solutions in the rough approximation should be small, in order to reduce the computational burden of the local search (PT). Indeed, in the case of a completely connected front, one single approximate solution might allow us to build the entire Pareto front.
• The MOEA should promote diversity, since the rough approximation produced should cover the whole extent of the Pareto front and identify all the different components, in case of a disconnected front.
• The MOEA must be able to handle equality constraints. As mentioned before, a severely constrained problem might cause diversity issues that should be overcome by the MOEA.
Note that the first two features are conflicting, since the number of elements of the approximated front should be small enough to avoid unnecessary computations. Nonetheless, there must be enough approximate solutions to ensure the identification of all the components, in case of a disconnected Pareto front. To deal with two- or three-objective problems such as those treated in this work, we observed that 20 points in the rough approximation represent a good trade-off between these two requirements. This means that the MOEA should work either with small populations or by maintaining a small external archive.
Regarding the search engine and the diversity preservation issue, two classical dominance- and decomposition-based algorithms (NSGA-II and MOEA/D, respectively) were initially investigated. However, preliminary computations led to the observation that, first, the use of small populations deteriorated the performance of MOEA/D and, even with larger populations (and a small archive to be returned), diversity remained an issue. Therefore, NSGA-II was used in this study as the baseline algorithm.
In the proposed implementation, NSGA-II does not necessarily use small populations, but the final population is reduced to 20 elements at the end of the search (according to crowding distance). Regarding constraint handling, the constraint-domination principle (CDP) generally used within NSGA-II does not obtain good results on the equality constrained problems tackled here. Indeed, any technique establishing the superiority of feasible solutions over infeasible ones will face severe diversity issues, due to a premature convergence towards the first feasible solution found.
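The reduction of the final population according to crowding distance can be sketched as follows. This is a minimal sketch assuming a one-shot truncation of the sorted distances (the paper does not state whether points are removed iteratively); the function names are ours:

```python
import numpy as np

def crowding_distance(F):
    """NSGA-II crowding distance for an (N, k) objective matrix F:
    boundary points get infinite distance, interior points accumulate
    the normalized span of their neighbors in each objective."""
    F = np.asarray(F, dtype=float)
    N, k = F.shape
    d = np.zeros(N)
    for m in range(k):
        order = np.argsort(F[:, m])
        d[order[0]] = d[order[-1]] = np.inf
        span = F[order[-1], m] - F[order[0], m]
        if span == 0:
            continue
        for i in range(1, N - 1):
            d[order[i]] += (F[order[i + 1], m] - F[order[i - 1], m]) / span
    return d

def reduce_archive(F, size=20):
    """Indices of the `size` most widely spread points."""
    return np.argsort(-crowding_distance(F))[:size]
```

On a front of four points, reducing to two keeps the boundary points, which carry infinite crowding distance.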
As a simple way to modify this operating mode, the ε-constraint strategy proposed in [24] is adopted here. Note that, although this method was previously integrated within MOEA/D in [23], to the best of our knowledge, there is no reference to NSGA-II working with the ε-constraint for solving constrained MOPs. As explained earlier, the ε-constraint strategy relaxes the tolerance level on constraint violation up to a value ε, i.e., a solution x with an overall constraint violation φ(x) ≤ ε is viewed as feasible and, thus, evaluated and compared in terms of its objective values. Then, ε is smoothly reduced according to a specific decreasing scheme. This technique allows solutions with a small (but positive) overall constraint violation to be compared with feasible individuals in terms of convergence (through dominance-based sorting) and spread uniformity (thanks to the crowding distance). Accordingly, the operator stated in (19) is used to compare two solutions x and y, where φ denotes the constraint violation. At the beginning of the run, the first value of ε, denoted as ε(0), is set in such a way that at least some solutions are "ε-feasible". More precisely, ε(0) is the overall constraint violation of the individual x_θ, which is the θ-th solution in the first population sorted in decreasing order of φ: ε(0) = φ(x_θ). For subsequent generations, the decreasing schedule originally proposed in [24] is adopted here: ε is a decreasing function of the generation number t, until a critical generation T_c is reached; from then on, ε is set to ε_min (see Equation (21)), where cp is a parameter controlling the decreasing speed and ε(0) is the constraint relaxation level at the first generation. Note that other decreasing schedules, such as that embedded within MOEA/D-IEpsilon [23], have been tested, but better results were obtained using Equation (21).
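The ε-comparison of Equation (19) and a decreasing ε schedule can be sketched as follows. Since Equation (21) is not reproduced here, the schedule below uses the power-law decrease of [24] as a stand-in assumption; the function names and the exact form of `eps_schedule` are ours:

```python
def eps_better(fx, phix, fy, phiy, eps):
    """Epsilon-constraint comparison: if both violations are within eps
    (or equal), compare by Pareto dominance on the objectives;
    otherwise the smaller violation wins."""
    if (phix <= eps and phiy <= eps) or (phix == phiy):
        return (all(a <= b for a, b in zip(fx, fy))
                and any(a < b for a, b in zip(fx, fy)))
    return phix < phiy

def eps_schedule(t, eps0, Tc, cp, eps_min=0.0):
    """Decreasing epsilon level: shrink with generation t until Tc,
    then stay at eps_min. The power-law form eps0 * (1 - t/Tc)**cp is
    an assumption borrowed from [24]."""
    if t >= Tc:
        return eps_min
    return eps0 * (1.0 - t / Tc) ** cp
```

With a large early ε, slightly infeasible but well-converged individuals survive the tournament; as ε shrinks, the selection pressure gradually shifts towards feasibility.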
Finally, in order to promote diversity, another parameter is introduced to bias the parent selection operator, which is based on a tournament implementing the ε-constraint strategy described in Equation (19). However, as suggested in [44], the result of the tournament is respected only with probability p_f; otherwise (i.e., with probability 1 − p_f), the winner is randomly chosen. Empirical results showed that this small modification sometimes yields significant improvements in the algorithm's performance. The whole process is briefly described in Algorithm 1.

Second Stage: Refinement via PT
The main task of this stage is to process the resulting archive P provided by ε-NSGA-II. The main challenge here is to avoid unnecessary effort, e.g., computing non-optimal KKT points along local Pareto fronts that are already dominated by previously computed solutions. While this is relatively easy for k = 2 objectives (see [12]), the task becomes more complicated with increasing k. The following procedure works for general k.
Before PT can be executed, the following post-processing has to be done on P:
1. Let τ be the desired minimal distance between two solutions in objective space. In this first step, go over P and eliminate elements that are too close to each other (if needed). This leads to the new archive P̃.
2. Apply the Newton method (A11) to all elements of P̃. Remove all dominated solutions, as well as elements that are too close to each other, as in the first step. This leads to the archive P̂.
3. To obtain a "global picture" of the part of the Pareto front that will be computed by PT, construct a partition of a potentially interesting subset S of the image space into a set of hyper-cubes (or k-dimensional boxes) with radius ≈ τ. This partition can easily be constructed using a binary tree whose root represents S (see [45] for details, where, however, the partition is used in decision variable space). S is a box constructed out of P̂ as follows: denote by m_i and M_i the minimal and maximal value of the i-th objective over all elements of P̂, respectively. Then the i-th element of the center of S is set to (m_i + M_i)/2 and the i-th element of its radius to (M_i − m_i)/2. In the computations, we will only allow storing one candidate solution within each of these boxes in the archive A, in order to guarantee a spread of the solutions.
Algorithm 1: ε-NSGA-II
  P ← pop_init()
  Evaluate each individual x_i ∈ P to obtain F(x_i) and φ(x_i)
  Compute ε(0) and set ε = ε(0)
  for t ← 1 to MaxGen do
    P′ ← crossover(P)   ▷ parent selection through tournament and ε-constraint
    P″ ← mutation(P′)

Then, in the first step, the element p^(1) ∈ P̂ is chosen as the starting point for PT, where f_1(p^(1)) = m_1. An external archive A is created that serves as the reference for PT and constitutes the set of solutions returned after the application of ε-NSGA-II/PT. At the beginning, A := {p^(1)}. In parallel, a box collection C is created that contains all the (unique) boxes of the above partition that contain elements of A; initially, C is set to the box that contains p^(1). The application of PT leads to a sequence of candidate solutions x_i, i = 1, …, s. For each x_i, the following steps are performed:
1. If x_i is dominated by any element of A, then the current application of PT is stopped.

2. Otherwise, it is checked whether the unique box that contains x_i is already contained in C. If this is not the case, add this box to C and add x_i to A. Otherwise, discard x_i and proceed with x_{i+1}.
After this, take the element from P̂ \ {p^(1)} with the smallest value of f_2 as the starting point for PT and proceed as above. Continue in this way, cycling through the objectives, until all elements p ∈ P̂ have been chosen as starting points for PT.
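Putting the two acceptance steps together, the filtering of one PT run against the archive A can be sketched as follows (a self-contained illustration; the function names are ours):

```python
import math

def dominates(a, b):
    """Pareto dominance on objective vectors."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def box_of(f, lower, tau):
    """Integer coordinates of the tau-box containing objective vector f."""
    return tuple(math.floor((fi - lo) / tau) for fi, lo in zip(f, lower))

def filter_pt_run(candidates, A, boxes, lower, tau):
    """Acceptance test for the sequence of points produced by one PT run:
    stop as soon as a candidate is dominated by the archive A; otherwise
    accept the candidate only if its tau-box is not yet occupied."""
    for f in candidates:
        if any(dominates(a, f) for a in A):
            break                      # PT entered a dominated region: stop
        b = box_of(f, lower, tau)
        if b not in boxes:             # one solution per box
            boxes.add(b)
            A.append(f)
    return A, boxes
```

Stopping at the first dominated candidate is what prevents PT from wastefully tracing local fronts that are already covered by previously computed solutions.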

Numerical Results
Here we present numerical results that compare the behavior of a well-known mathematical programming technique and some state-of-the-art MOEAs on the selected test problems. As test functions, we have chosen the three test problems proposed above, Eq1-Quad from [11], the CZDT test suite, as well as a modification of a problem from Das and Dennis (D&D) stated in [11]. First, we test the normal boundary intersection (NBI) method on the selected suite. With this experiment, we want to show why the use of MOEAs is convenient and, moreover, why a special hybrid algorithm capable of solving this type of MOPs is needed. The second experiment is the comparison of state-of-the-art MOEAs against the proposed ε-NSGA-II/PT. Here, a point x is considered to be feasible if |h(x)| < ε, where we have taken ε = 1 × 10^−4, which is common in evolutionary computation.

Performance Assessment
The multi-objective solvers were evaluated by adopting two performance indicators taken from the literature.
The ∆ p indicator [46,47] can be viewed as an averaged Hausdorff distance between an approximation set and the real Pareto front of a MOP. This indicator is defined by slight modifications of the indicators Generational Distance (GD) [48] and Inverted Generational Distance (IGD) [49]. Formally, the ∆ p indicator can be written as follows.
Let P = {x_1, …, x_|P|} be an approximation and R = {r_1, …, r_|R|} be a discretization of the real Pareto front of a MOP. Then

∆_p(P, R) = max{GD_p(P, R), IGD_p(P, R)},

where

GD_p(P, R) = ( (1/|P|) Σ_{i=1}^{|P|} d_i^p )^(1/p),  IGD_p(P, R) = ( (1/|R|) Σ_{j=1}^{|R|} d̃_j^p )^(1/p),

and where d_i and d̃_j denote the Euclidean distance from x_i to its closest member r ∈ R and the Euclidean distance from r_j to its closest member x ∈ P, respectively. Here we have chosen p = 2. The ideal indicator value is 0, and a low ∆_p value indicates a good approximation of the real Pareto front.
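A direct vectorized computation of ∆_p over finite point sets in objective space might look as follows (a sketch; the function name is ours):

```python
import numpy as np

def delta_p(P, R, p=2):
    """Averaged Hausdorff distance Delta_p = max(GD_p, IGD_p).
    P: (N, k) approximation in objective space, R: (M, k) reference
    discretization of the Pareto front."""
    P, R = np.asarray(P, float), np.asarray(R, float)
    # Pairwise Euclidean distance matrix: D[i, j] = ||P[i] - R[j]||
    D = np.linalg.norm(P[:, None, :] - R[None, :, :], axis=2)
    gd = np.mean(D.min(axis=1) ** p) ** (1.0 / p)   # P -> R distances
    igd = np.mean(D.min(axis=0) ** p) ** (1.0 / p)  # R -> P distances
    return max(gd, igd)
```

A perfect approximation (P = R) yields 0, while a single point at distance 1 from its reference yields 1.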

Feasibility Ratio
The feasibility ratio (I_F) indicator refers to the ratio of the number of feasible solutions found in the final approximation P given by a MOEA. Mathematically, this indicator can be stated as

I_F(P) = |P_f| / |P|,

where |P_f| denotes the number of feasible solutions in P and |P| represents the cardinality of the population P.
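As a minimal sketch, reusing the feasibility tolerance ε from above (the function name is ours):

```python
def feasibility_ratio(phis, eps=1e-4):
    """I_F: fraction of the final population whose overall constraint
    violation phi is within the feasibility tolerance eps."""
    feasible = sum(1 for phi in phis if phi <= eps)
    return feasible / len(phis)
```

For instance, a population with violations (0, 5e-5, 0.5) has a feasibility ratio of 2/3.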

Solving Equality Constrained MOPs with Mathematical Programming Techniques
In this section, we test the NBI technique and apply it to the selected test problems. First, we compute the extreme points of the CHIM by separately optimizing each objective. We take the center of the box defined by each MOP as the initial solution of the optimization process. Then, we compute the CHIM using the previously obtained solutions; here, we set a partition of 100 well-distributed points. Finally, we solve the NBI subproblem taking into account each NBI weight. We compute the extreme points of the MOP and solve the NBI subproblems via the Matlab function fmincon. Table 1 shows the ∆_2 value and the computational cost, in terms of function evaluations, for each problem. The column Feval contains the total number of function evaluations required by the NBI technique, without the cost of each single-objective optimization; the column Feval_fi contains the number of function evaluations that were required to optimize each objective. Figure A1 of Appendix B shows the NBI results for each test problem; each figure shows the real Pareto front, the computed CHIM, and the NBI solution.
From Table 1 we observe that for CZDT1, CZDT2, CZDT4, and Das and Dennis the ∆_2 value is close to zero, which means that NBI adequately solves these problems; on the other hand, the number of function evaluations needed is very high. For the remaining test functions, neither the performance indicator value nor the number of function evaluations is good. Thus, NBI is not capable of solving these MOPs.

Solving Equality Constrained MOPs with MOEAs
In this section, the proposed approach is compared against four state-of-the-art MOEAs that incorporate different constraint-handling strategies in their environmental selection procedures. We also include a comparison with a two-stage algorithm that combines the above-mentioned ε-NSGA-II (see Section 4.1) with the NBI technique.
• NSGA-II. The popular non-dominated sorting genetic algorithm II [50] was adopted in our comparative study. NSGA-II employs a binary tournament based on feasibility in the mating selection procedure. In order to determine the next generation, the crowding comparison operator considers the feasibility of solutions. In our study, NSGA-II was run using the standard parameters given by its authors, i.e., P_c = 1, P_m = 1/n, η_c = 20, η_m = 20.
• GDE3. The third evolution step of generalized differential evolution [51] was also adopted in our experimental analysis. GDE3 introduces the concept of constraint-domination explained before to discriminate between solutions. GDE3 was employed using CR = 1 and F = 0.5.
• cMOEA/D-DE. We also adopted the first version of the multi-objective evolutionary algorithm based on decomposition for constrained multi-objective optimization [52]. cMOEA/D-DE utilizes a penalty function in order to satisfy the constraints of the problem. The penalty function is directly added to the scalarizing function employed by MOEA/D-DE [53] to approximate the PF of a constrained MOP. cMOEA/D-DE was employed using CR = 1, F = 0.5, T = 0.1 × N, n_r = 0.01 × N, P_m = 1/n, η = 20, s_1 = 0.01, and s_2 = 20.
In the above description, N represents the population size, which is implicitly defined by the number of subproblems used by the decomposition-based MOEAs (i.e., cMOEA/D-DE and εMOEA/D-DE). Such subproblems were generated by employing the simplex-lattice design [55] and using the penalty boundary intersection (PBI) approach with a penalty value θ = 5, as suggested in [14]. Therefore, the number of weight vectors is given by N = C(H + M − 1, M − 1), where M is the number of objective functions. Consequently, the setting of N is controlled by the parameter H. Here, we use H = 99 (for two-objective problems) and H = 23 (for three-objective problems), i.e., 100 and 300 weight vectors for problems having two and three objectives, respectively.
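The simplex-lattice design is easy to reproduce; the following sketch (hypothetical helper name) enumerates all weight vectors for given H and M and confirms the population sizes N = C(H + M − 1, M − 1):

```python
from math import comb

def simplex_lattice(H, M):
    # all weight vectors (i_1/H, ..., i_M/H) with non-negative integers i_1 + ... + i_M = H
    def parts(total, m):
        if m == 1:
            yield (total,)
            return
        for i in range(total + 1):
            for rest in parts(total - i, m - 1):
                yield (i,) + rest
    return [tuple(i / H for i in v) for v in parts(H, M)]
```

With H = 99 and M = 2 this yields 100 weight vectors, and with H = 23 and M = 3 it yields 300, matching the settings above.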
For each MOP, 30 independent executions were performed with each MOEA.

Analysis
Our experiments have shown that the new hybrid needs between 15,000 and 17,000 function evaluations (FEs) to obtain good results for the bi-objective problems and 110,000 to 150,000 FEs for the three-objective problems. In order to make the comparison fair, we have set a final budget of 20,000 FEs for all the selected MOEAs on the bi-objective problems and 150,000 FEs for the three-objective problems. For ε-NSGA-II/PT we have split the budget for the bi-objective problems into 15,000 FEs for ε-NSGA-II and the remaining 5000 FEs for PT (and 100,000 plus 50,000 FEs for the three-objective problems). For the realization of PT, we have used automatic differentiation to compute the gradients, which allows us to express the cost of the continuation method in terms of FEs. For all experiments, we have executed 30 independent runs. We performed the Wilcoxon test as a statistical significance test to validate the results; for this, we consider the value α = 0.05. Based on the test results, for the comparison between ε-NSGA-II/PT and any of the chosen MOEAs the symbol "↑" means that the obtained results are statistically significant. The symbol "−−" means that no ∆_2 value could be computed for the algorithm; this was the case if a MOEA detected no feasible solutions in at least 25 runs. Figures 4 and 5 show the results for Eq1-Quad and Eq2-Quad, respectively; the reader is referred to Appendix C for more figures. The theoretical PF is marked with dots, while the approximations from the algorithms are marked with triangles. The Pareto fronts for the other test problems have been omitted due to space limitations; however, Table 2 shows the indicator values, ∆_2 and I_F, for all test problems. Smaller values of ∆_2 correspond to better qualities of the approximated solution, while larger values of I_F indicate more feasible solutions in the approximation. The best values are displayed in bold for each problem and each indicator.
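The significance test used above can be sketched as follows. This is a plain two-sided Wilcoxon rank-sum (Mann–Whitney) test with a normal approximation and without tie correction, which is an assumption of this sketch, since the exact flavor of the Wilcoxon test is tool-dependent:

```python
from statistics import NormalDist

def wilcoxon_rank_sum(a, b, alpha=0.05):
    # two-sided rank-sum test via the normal approximation (no tie correction)
    n, m = len(a), len(b)
    ranked = sorted((v, g) for g, xs in ((0, a), (1, b)) for v in xs)
    W = sum(r + 1 for r, (v, g) in enumerate(ranked) if g == 0)  # rank sum of sample a
    mu = n * (n + m + 1) / 2.0
    sigma = (n * m * (n + m + 1) / 12.0) ** 0.5
    z = (W - mu) / sigma
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p, p < alpha          # p-value and significance at level alpha
```

Applied to two sets of 30 indicator values (one per independent run and algorithm), the second return value corresponds to whether an "↑" would be reported.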
We can see that the new hybrid algorithm significantly outperforms the other algorithms in nine out of the ten test functions, considering a compromise between the ∆_2 value and the number of needed function evaluations. Note that on CZDT1, CZDT2, and CZDT6, ε-NSGA-II/NBI has better performance than our proposal, but it needs more function evaluations in order to achieve these results. In some cases, the MOEAs are not able to find any feasible solution within the given FE budget. ε-NSGA-II/PT, however, loses against cMOEA/D-DE on Eq2-Quad. This is due to the fact that the real Pareto front is disconnected, and in most of the runs the first stage of our proposal was not able to find adequate solutions near both components. Therefore, in the second stage, we were in most cases only able to compute one of the two components. Note, however, that all the solutions of the final approximation are feasible in all the independent runs.

Conclusions
In this paper, we have addressed the treatment of equality constrained MOPs. To this end, we first proposed three such problems that have different characteristics and that can serve as future benchmark problems. Next, we proposed a two-phase hybrid evolutionary algorithm, ε-NSGA-II/PT, which combines a variant of the well-known and widely used MOEA NSGA-II with the recently proposed continuation method, the Pareto Tracer (PT). More precisely, the evolutionary algorithm ε-NSGA-II computes in a first step a small but ideally well-spread set of solutions around the Pareto front of a given MOP. In the next step, PT takes over to refine this rough approximation. Empirical results on several benchmark functions, including a comparison against the state of the art, have demonstrated that this new hybrid evolutionary algorithm is highly competitive and yields highly satisfying results using only a moderate budget of function evaluations.
For future work, it might be interesting to reduce the requirement of directly computing the gradient information, e.g., by utilizing approximation strategies like the one proposed in [41], and to apply the new hybrid to real-world applications. For instance, our proposal could improve the results presented in [56], where the authors showed that some classic MOEAs have unsatisfactory performance on the portfolio problem with complex equality constraints. On the other hand, equality constraints also appear in some single-objective optimization problems (for instance, the development of computer-controlled materials [57], predictive scheduling [58], or production system design [59]), and more applications of our proposal could emerge from the extension of such problems to the multi-objective case.
If the ranks of W_α and H are maximal, then ν_µ is determined uniquely. For a given vector d ∈ R^k, the direction ν_d with J ν_{µ_d} = d can be computed using (A5). Since (A4) also has to hold, µ_d is determined via solving (A7). The direction vector d can now be chosen to steer along the Pareto front. For this, we have to choose d orthogonal to α: let α = QR = (q_1, . . . , q_k)R be a QR decomposition of α; then define the d_i's accordingly and select the µ_{d_i}'s via solving (A7). Then, the vectors ν_{µ_{d_i}} are tangential to P_Q, and the d_i's are the respective directions in objective space.
Using ν_µ, we can now compute the predictor p_i with step size t_i as follows: τ > 0 is a user-specified value that represents the desired distance between the images of two consecutive solutions in objective space. In the next step, a Newton method is used to project p_i onto the KKT set. The Newton direction is hereby chosen as the solution of

    min_{(ν,δ) ∈ R^n × R}  δ
    s.t.  ∇f_i(x)^T ν + (1/2) ν^T ∇²f_i(x) ν ≤ δ,   i = 1, . . . , k,
          h_i(x) + ∇h_i(x)^T ν = 0,                  i = 1, . . . , p.        (A11)
Quasi-Newton techniques can be used to realize the algorithm without using Hessians; we will use this implementation here. Figure A2 shows the numerical results obtained by ε-NSGA-II/PT on all ten considered test functions.
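As a sketch of how the Hessians can be avoided, the classical BFGS update of a Hessian approximation B from a step s and gradient difference y can be written as follows (hypothetical helper in pure Python, not the implementation used in the experiments):

```python
def bfgs_update(B, s, y):
    # BFGS: B' = B - (B s s^T B) / (s^T B s) + (y y^T) / (y^T s)
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sBs = sum(s[i] * Bs[i] for i in range(n))
    ys = sum(y[i] * s[i] for i in range(n))
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]
```

The update preserves symmetry and enforces the secant condition B's = y, so second-order information is accumulated from gradient evaluations alone.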