A Comparison of Archiving Strategies for Characterization of Nearly Optimal Solutions under Multi-Objective Optimization

In a multi-objective optimization problem, in addition to optimal solutions, multimodal and/or nearly optimal alternatives can also provide useful information for the decision maker. However, obtaining all nearly optimal solutions entails an excessive number of alternatives. Therefore, to consider the nearly optimal solutions, it is convenient to obtain a reduced set that focuses on the potentially useful alternatives: those solutions that are close to the optimal solutions in the objective space but differ significantly in the decision space. To characterize this set, it is essential to analyze the decision and objective spaces simultaneously. One of the crucial points in an evolutionary multi-objective optimization algorithm is the archiving strategy, which is in charge of keeping the solution set, called the archive, updated during the optimization process. The motivation of this work is to analyze the three archiving strategies proposed in the literature (ArchiveUpdatePQ,εDxy, Archive_nevMOGA, and targetSelect) that aim to characterize the potentially useful solutions. The archivers are evaluated on two benchmarks and on a real engineering example. The contribution clearly shows the main differences between the three archivers. This analysis is useful for the design of evolutionary algorithms that consider nearly optimal solutions.


Introduction
Many real-world applications pose several objectives (usually in conflict) to optimize [1][2][3]. This leads to the formulation of a multi-objective optimization problem (MOOP) [4][5][6][7]. In an a posteriori multi-objective approach [8], after the MOOP definition and the optimization stage, a set of Pareto optimal solutions [9] is generated. The decision maker (DM) can then analyze, at the decision-making stage, the trade-off of the optimal alternatives for each design objective. This enables a better understanding of the problem and a better-informed final decision.
For the DM, it is useful to have a diverse set of solutions. Traditionally, diversity is sought in the objective space. However, obtaining a diverse set in the decision space also offers advantages [10]: (1) it enables the DM to obtain different (even significantly different) alternatives before the final decision; and (2) it helps speed up the search, improving exploration and preventing premature convergence towards a non-global minimum. In addition, the best solutions are sometimes too sensitive to disturbances, or are not feasible in practice [11][12][13][14]. In this scenario, multimodal solutions and the nearly optimal solution set (also called approximate or ε-efficient solutions) play a key role in enhancing the diversity of solutions. Multimodal solutions are distinct solutions that, being optimal, obtain the same performance. Nearly optimal solutions are those with performance similar to that of the optimal solutions; generalizing, multimodal solutions can be considered to be included among the nearly optimal solutions. Nearly optimal solutions have been studied by many authors [15][16][17][18][19] and can sometimes be more adequate according to DM preferences (for instance, more robust [14]). Therefore, an additional challenge arises: to obtain a set of solutions that, in addition to good performance in the design objectives, offers the greatest possible diversity.
However, considering all the nearly optimal solutions requires obtaining and analyzing a great number of alternatives, and this causes two problems:
1. It slows down the optimization process. In evolutionary algorithms, an archive (a set that stores solutions during the execution) is required. The computational cost of the optimization process largely depends on the archive size, because checking the inclusion of a new candidate solution in the archive requires a dominance (or −ε-dominance) check against each solution in the current archive. Many new candidate solutions are analyzed during an optimization run, so a large archive results in a significantly higher computational cost.
2. The decision stage becomes more difficult. The designer must choose the final solution from a much larger number of alternatives.
Therefore, it is necessary to reduce the set of solutions presented to the designer. In the literature, there are different algorithms aimed at finding nearly optimal solutions in multi-objective optimization problems. The multimodal multi-objective evolutionary algorithms (MMEAs [20]) are intended for multimodal optimization problems. Some MMEAs take nearly optimal solutions into account during the optimization process, but most of them do not provide these solutions to the DM. Furthermore, evolutionary algorithms with an unbounded external archive [21] can also be useful here: these archives can be analyzed a posteriori to extract the relevant nearly optimal solutions.
One of the crucial points in an evolutionary multi-objective optimization algorithm is the archiving strategy (or archiver). An archiving strategy is the strategy that selects and updates a solution set, called the archive, during the evolutionary process. Some archivers have been studied previously [19,[22][23][24][25][26]. In this paper, we address the problem of discretization of the potentially useful alternatives. For this purpose, we compare different archiving strategies that aim to obtain the set of potentially useful nearly optimal solutions. An archiving strategy must take into account the decision space to ensure that the potentially useful nearly optimal solutions are not discarded.
To compare the results on the real problem, we chose to embed each archiver in a basic evolutionary algorithm: first, to observe the impact of each archiver when incorporated into an evolutionary mechanism; and second, because the computational cost of the objective functions of the real problem does not allow simulations on large numbers of points, so it is not feasible to test each archiver with a random or exhaustive search as was done with the benchmarks.
These archivers have not been compared in the literature, so this work is useful for future designs of evolutionary algorithms that consider nearly optimal solutions, or for modifying the archivers of existing evolutionary algorithms so that they consider such solutions. The purpose of the paper is therefore: (1) to understand the properties of these archivers; (2) to provide an analysis for choosing one of them; and (3) to give ideas for designing new archivers. The design of such algorithms is currently an open issue in this research area [18,20,27].
This work is structured as follows. In Section 2, a brief state of the art on potentially useful nearly optimal solutions is introduced. In Section 3, basic multi-objective background is presented. In Section 4, different archiving strategies to characterize the optimal and nearly optimal sets are described. In Section 5, the MOOPs and the archiver comparison procedure are presented. The results obtained with the archivers are shown in Section 6. Finally, the conclusions are given in Section 7.

State of the Art
As discussed in the previous section, obtaining all nearly optimal solutions leads to problems. Not all nearly optimal solutions are equally useful to the DM; therefore, if we manage to discard those that are less useful, we largely reduce both problems mentioned above. Let us see a graphic example to illustrate what we consider potentially useful solutions. Suppose we have a MOOP with two design objectives and two decision variables (see Figure 1). Three solutions x1, x2 and x3 are selected. x1 is an optimal solution (a member of the Pareto set), and it slightly dominates the nearly optimal solutions x2 and x3. x1 and x2 are very similar alternatives in their parameters (both belong to neighborhood 1, the same area of the parameter space), while x3 is significantly different (it belongs to neighborhood 2). In this scenario, x2 does not provide new relevant information to the DM. This solution is similar to x1 but with worse performance in the design objectives. Predictably, both will have similar characteristics, so the DM will choose x1 since it obtains better performance in the design objectives. However, x3 does provide useful new information to the DM because it has performance similar to the optimal solutions and lies in a different neighborhood. The solutions in neighborhood 1 could be, for example, not very robust or not feasible in practice. In this context, x3 (and the solutions in neighborhood 2) could be potentially useful due to their significantly different characteristics. It is possible, and often common, for the DM to analyze at the decision stage additional indicators/objectives not included in the optimization phase. Thus, the DM can accept a small loss of performance in the design objectives in exchange for an improvement in a feature not contemplated in the optimization process.
This analysis can decide the final choice in one way or another. In short, including solutions of neighborhood 2 increases the diversity with useful solutions and enables the DM to make a better-informed final decision.
Figure 1 shows the objective space on the left and the decision space on the right. SET 1 is the Pareto optimal set and SET 2 is a potentially useful nearly optimal set. Therefore, the potentially useful nearly optimal solutions are those nearly optimal alternatives that differ significantly in the parameter space [28][29][30]. Thus, the new set must: (1) not neglect the diversity existing in the set of nearly optimal alternatives; and (2) contain the smallest possible number of solutions. To achieve both aims, it is necessary to employ an evolutionary algorithm that characterizes the set of solutions by means of a discretization that takes into account the decision space and the objective space simultaneously. A discretization that takes into account only the objective space can lead to the loss of nearly optimal alternatives that are significantly different in the decision space. This loss is a drawback because, as discussed, these alternatives are potentially useful. On the other hand, a discretization that takes into account only the decision space can lead to archives with a huge number of solutions [19] and cause the two problems previously mentioned.

Background
A multi-objective optimization problem can be defined as follows:

min f(x), x = [x_1, ..., x_k]

where x is a decision vector in the domain Q ⊂ R^k and f : Q → R^m is the vector of objective functions. A maximization problem can be converted into a minimization problem: for each objective to be maximized, max f_i(x) = −min(−f_i(x)) is performed. The domain Q is defined by the set of constraints on x. For instance (but not limited to; any other constraints could be introduced in a general MOOP):

x_i^min ≤ x_i ≤ x_i^max, i = 1, ..., k

where x_i^min and x_i^max are the lower and upper bounds of the components of x. Consequently, the MOOP obtains a Pareto set P_Q (see Definition 2). This set contains the solutions not dominated by any other solution (see Definition 1) in Q.

Definition 1 (Dominance [31]). A decision vector x1 is dominated by another decision vector x2 if f_i(x2) ≤ f_i(x1) for all i ∈ [1, ..., m] and f_j(x2) < f_j(x1) for at least one j ∈ [1, ..., m]. This is denoted as x2 ≺ x1.

Definition 2 (Pareto set P_Q): the set of solutions in Q that are non-dominated by any other solution in Q:

P_Q := {x ∈ Q | ∄ x' ∈ Q : x' ≺ x}

Definition 3 (Pareto front f(P_Q)): given the set of Pareto optimal solutions P_Q, the Pareto front is its image in the objective space:

f(P_Q) := {f(x) | x ∈ P_Q}

In any MOOP, there is a set of solutions with objective values close to the Pareto front. These solutions receive several names in the bibliography: nearly optimal, approximate, or ε-efficient solutions. To formalize the treatment of the nearly optimal solutions, the following definitions are used:

Definition 4 (−ε-dominance [32]): define ε = [ε_1, ..., ε_m] as the maximum acceptable degradation. A decision vector x1 is −ε-dominated by another decision vector x2 if f_i(x2) + ε_i ≤ f_i(x1) for all i ∈ [1, ..., m] and f_j(x2) + ε_j < f_j(x1) for at least one j ∈ [1, ..., m]. This is denoted by x2 ≺_−ε x1.

Definition 5 (Set of nearly optimal solutions P_Q,ε [30]): the set of solutions in Q which are not −ε-dominated by another solution in Q:

P_Q,ε := {x ∈ Q | ∄ x' ∈ Q : x' ≺_−ε x}

The sets P_Q and P_Q,ε usually contain a great, or even infinite, number of solutions. Optimization algorithms try to characterize these sets using discrete approximations P*_Q and P*_Q,ε. In general, if such an approximation has a limited number of solutions, the computational cost falls. However, the number of solutions must be sufficient to obtain a good characterization of these sets.
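As an illustration, the dominance relation of Definition 1 and the −ε-dominance relation of Definition 4 can be sketched in a few lines (a minimal sketch assuming minimization of every objective; the function names are illustrative):

```python
# Dominance (Definition 1) and -eps-dominance (Definition 4) over objective
# vectors, assuming all objectives are minimized.

def dominates(f2, f1):
    """True if objective vector f2 dominates f1 (Definition 1)."""
    return (all(a <= b for a, b in zip(f2, f1))
            and any(a < b for a, b in zip(f2, f1)))

def eps_dominates(f2, f1, eps):
    """True if f2 -eps-dominates f1 (Definition 4): even after degrading
    f2 by eps, it remains at least as good as f1 in every objective."""
    return (all(a + e <= b for a, e, b in zip(f2, eps, f1))
            and any(a + e < b for a, e, b in zip(f2, eps, f1)))
```

These predicates are the basic tests that every archiver compared below applies when deciding whether to accept or discard a candidate solution.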
To compare the archiving strategies, it is useful to use a metric. Different metrics are used in the literature to measure the convergence of the outcome set. An example is the Hausdorff distance (d_H [33][34][35], see Equation (3)). This metric measures the distance between two sets. Therefore, d_H can be used to measure the convergence of the outcome set (or final archive) f(A) towards the target set f(H) of a given MOOP (or archiving strategy). However, d_H only penalizes the largest outlier of the candidate set. Thus, a high value of d_H(A, H) can indicate either that A is a bad approximation of H or that A is a good approximation that contains at least one outlier. The distance d_H is used by the archiver ArchiveUpdatePQ,εDxy.
To avoid this problem, a new indicator appeared in the literature: the averaged Hausdorff distance ∆_p (the standard Hausdorff distance is recovered from ∆_p in the limit lim_{p→∞} ∆_p = d_H). This metric (with 1 ≤ p < ∞) assigns a lower value to sets uniformly distributed throughout their domain. ∆_p is based on the well-known generational distance metric (GD [36], based on the distances from the outcome set f(A) to the target set f(H)) and the inverted generational distance (IGD [37], based on the distances from the target set to the outcome set). These metrics are slightly modified in ∆_p (GD_p and IGD_p, see Equation (5)). This modification means that larger archive sizes and finer discretizations of the target set do not automatically lead to better approximations under ∆_p [34]. ∆_p measures diversity and convergence in the decision and objective spaces. In this work we use ∆_p with p = 2 (as in [18,38]) so that the influence of outliers is low. To use ∆_p it is necessary to define the target set H with which to compare the final archive A obtained by the archivers. The target set is defined in the decision space (H) and has its representation in the objective space (f(H)). The definition of H is possible on the benchmarks used in this work (where the global and local optima are known), but this definition is not trivial. The Archive_nevMOGA and ArchiveUpdatePQ,εDxy archivers discard solutions that are similar in both spaces at the same time. However, Archive_nevMOGA, unlike ArchiveUpdatePQ,εDxy, discards solutions dominated within their neighborhood. targetSelect looks for diversity in both spaces simultaneously. On the one hand, defining H as the optimal and nearly optimal solutions that are not similar in both spaces at the same time would give the archiver ArchiveUpdatePQ,εDxy an advantage. On the other hand, if H is defined as the set of optimal and nearly optimal solutions non-dominated in their neighborhood, Archive_nevMOGA would be favored.
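Under the definitions above, a minimal sketch of ∆_p is straightforward: GD_p averages (to the power p) the distances from each point of one set to its nearest point in the other set, and IGD_p is the same computation with the roles of the sets swapped. The helper names below are illustrative and the sets are generic point lists:

```python
# Averaged Hausdorff distance Delta_p = max(GD_p, IGD_p), computed here over
# generic point sets in a single space; the paper applies it in the decision
# and objective spaces.
import math

def _gd_p(A, H, p):
    # GD_p: p-averaged distance from each point of A to its nearest point in H
    return (sum(min(math.dist(a, h) for h in H) ** p for a in A) / len(A)) ** (1 / p)

def delta_p(A, H, p=2):
    """Delta_p(A, H): IGD_p is GD_p with the roles of A and H swapped."""
    return max(_gd_p(A, H, p), _gd_p(H, A, p))
```

With p = 2, as used in this work, a single outlier raises ∆_p only mildly, unlike d_H, which is determined entirely by the worst point.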
However, the archivers have a common goal: to obtain the solutions close to the optima in the objective space but significantly different in the decision space (the potentially useful solutions). Consequently, to be as "fair" as possible, we must define H as the set that describes this common objective. Thus, the potentially useful solutions can be represented by the local Pareto set (see Definition 6).

Definition 6 (Local Pareto set [5,39]): the set of solutions in Q that are non-dominated by any neighboring solution in Q (where ε_n is a small positive number):

H := {x ∈ Q | ∄ y ∈ Q : ||x − y||_∞ ≤ ε_n and y ≺ x}

Figure 2 shows an example of a MOOP with two design objectives and two decision variables. Sets SET 1 and SET 2 form the global Pareto set. Both sets, together with SET 3 and SET 4, form the local Pareto set (since the global Pareto set is also a local Pareto set). No solution of a local Pareto set is dominated by a neighboring solution. Furthermore, all the solutions neighboring the local Pareto set are dominated by a neighboring solution, and therefore they are not part of this set. This can be verified by the colored areas around the sets SET 1, SET 2, SET 3 and SET 4. For example, solutions in the gray area, which are neighbors of SET 3, obtain a worse objective value than SET 3. For this reason, the solutions of the gray area are dominated by neighboring solutions and are not part of the local Pareto set. Sets SET 3 and SET 4 provide the DM with potentially useful alternatives (significantly different from SET 1 and SET 2), enabling a more informed final decision. For this work, the −ε-dominated solutions (solutions that are not in the set P_Q,ε) will not be considered local Pareto solutions (members of H) because their degradation in performance is significant for the DM.
Visualization of a MOOP in the objective space (on the left) and the decision space (on the right). The sets SET 1 and SET 2 are the optimal solutions. Both sets, together with SET 3 and SET 4, form the local Pareto set.
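On a finite sample of Q (such as the densely discretized sets used to build the target sets later in this work), membership in the local Pareto set of Definition 6 can be checked directly. A minimal sketch, with illustrative names:

```python
# Local Pareto set of Definition 6 on a finite sample of Q: a point belongs
# to the (sampled) set H if no neighbour within radius eps_n in the infinity
# norm dominates it.

def dominates(f2, f1):
    return (all(a <= b for a, b in zip(f2, f1))
            and any(a < b for a, b in zip(f2, f1)))

def local_pareto_set(X, f, eps_n):
    """X: list of decision vectors sampled from Q; f: objective function."""
    F = [f(x) for x in X]
    H = []
    for i, x in enumerate(X):
        neighbours = [j for j, y in enumerate(X)
                      if j != i and max(abs(a - b) for a, b in zip(x, y)) <= eps_n]
        if not any(dominates(F[j], F[i]) for j in neighbours):
            H.append(x)
    return H
```

Note that a point dominated only by far-away solutions still qualifies, which is exactly how locally optimal neighborhoods such as SET 3 and SET 4 survive the filter.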

Description of the Compared Archivers
As already discussed above, nearly optimal solutions can be very useful. However, it is necessary to discretize this set in order to find a reduced set of solutions to avoid the problems associated with an excessive number of solutions. Furthermore, it is necessary not to neglect the potentially useful nearly optimal solutions, i.e., nearly optimal alternatives significantly different (in the decision space) to the optimal solutions. To achieve both purposes, it is essential to discretize the set of solutions taking into account the decision and objective spaces at the same time. In this section, three archivers that discretize the set P Q, in both spaces are described.
There are MMEAs and algorithms that consider nearly optimal solutions and offer these solutions to the DM: P_Q,ε-NSGA-II [28], P_Q,ε-MOEA [30], nevMOGA [29], NεSGA [18], DIOP [10], 4D-Miner [40,41] and MNCA [42]. P_Q,ε-NSGA-II [28] was one of the first algorithms aimed at finding approximate (nearly optimal) solutions. It uses the same classification strategy as the algorithm on which it is based, NSGA-II [43], and therefore the strongest selection pressure pushes the population toward the Pareto set. Thus, neighborhoods containing only nearly optimal solutions may not be adequately explored [30]. To avoid this problem, the algorithm P_Q,ε-MOEA [30] was created and designed to avoid the Pareto set bias. Nevertheless, P_Q,ε-MOEA does not take into account the location of solutions in the decision space, and therefore does not guarantee that potentially useful alternatives will not be discarded. To overcome this problem, the nevMOGA [29] algorithm was designed, which seeks to ensure that potentially useful alternatives are not discarded. DIOP [10] is a set-based algorithm that can maintain dominated solutions. It simultaneously evolves two populations A and T: population A approaches the Pareto front and is not provided to the DM, while T is the target population, which seeks to maximize diversity in the decision and objective spaces. MNCA [42] is an evolutionary algorithm that simultaneously evolves multiple subpopulations; in MNCA, each subpopulation converges to a different set of non-dominated solutions. Finally, 4D-Miner [40,41] is an algorithm especially designed for functional brain imaging problems.
One of the crucial points in an evolutionary multi-objective optimization algorithm is the archiving strategy. The P_Q,ε-NSGA-II and P_Q,ε-MOEA algorithms share the ArchiveUpdatePQ,ε archiver. This archiver seeks to characterize all nearly optimal solutions without taking into account the decision space. In [19], different archiving strategies are compared: ArchiveUpdatePQ,ε, ArchiveUpdatePQ,εDx, ArchiveUpdatePQ,εDy and ArchiveUpdatePQ,εDxy. On the one hand, ArchiveUpdatePQ,ε obtains an excessive number of solutions. On the other hand, ArchiveUpdatePQ,εDx and ArchiveUpdatePQ,εDy do not discretize the decision and objective spaces simultaneously. Therefore, these archivers do not achieve the two purposes discussed above. The mentioned work concludes that the archiver ArchiveUpdatePQ,εDxy is the most practical for use within stochastic search algorithms. Furthermore, it is the only one of the archivers compared in [19] that discretizes the decision and objective spaces simultaneously [27], a property that we consider necessary to obtain potentially useful solutions. The archiver ArchiveUpdatePQ,εDxy has been employed in the recent NεSGA algorithm [18] to maintain a well-distributed representation in the decision and objective spaces. For these reasons, the present work compares the archiver ArchiveUpdatePQ,εDxy and not ArchiveUpdatePQ,ε, ArchiveUpdatePQ,εDx or ArchiveUpdatePQ,εDy.
The second archiver included in this comparison is the archiver of the nevMOGA algorithm (Archive_nevMOGA). This archiver characterizes the set of potentially useful solutions by discretizing both spaces simultaneously. Finally, the archiver of the DIOP algorithm (targetSelect) is also compared in this work. targetSelect seeks the population that maximizes an indicator measuring diversity in the decision and objective spaces simultaneously. Therefore, the metric-based archiver targetSelect is compared to the distance-based archivers ArchiveUpdatePQ,εDxy and Archive_nevMOGA. The three archivers compared in this work seek to characterize the potentially useful solutions. The archiver of the MNCA algorithm has not been included in the comparison because it looks only for non-dominated solutions. The archiver of the 4D-Miner algorithm has also not been included because 4D-Miner is a very specific algorithm for functional brain imaging problems.

ArchiveUpdatePQ,εDxy
ArchiveUpdatePQ,εDxy is the archiving strategy proposed in [19] (see Algorithm 1). As already mentioned, potentially useful solutions are those that obtain similar performance (in the design objectives) but differ significantly in their parameters (in the decision space). This archiver aims to maintain these solutions. In addition to the parameter ε (maximum degradation acceptable to the DM, see Definition 4), this archiver uses the parameters ∆_x and ∆_y. Two solutions are considered similar in the decision space if their distance there is less than ∆_x; thus, ∆_x is the maximum distance between two similar solutions in the decision space. Two solutions obtain similar performance if their distance in the objective space is less than ∆_y; thus, ∆_y is the maximum distance between two solutions considered similar in the objective space. Both distances are measured using the Hausdorff distance [34] (d_H, see Equation (3)). The archive A stores the set of obtained alternatives. A new solution p from P (the set of new candidate solutions) is incorporated into A only if: (1) p is a nearly optimal solution; and (2) A does not contain any solution similar to p in the decision and objective spaces at the same time. If the new solution p is stored in archive A, the new set of optimal and nearly optimal solutions (Â) belonging to archive A is calculated. Then, a solution a ∈ A is removed if: (1) it is not a nearly optimal solution (p ≺_−(ε+∆y) a); and (2) its distance to the set Â fulfills the condition dist(a, Â) ≥ 2∆_x.
The archiver ArchiveUpdatePQ,εDxy goes through the whole set of candidate solutions p ∈ P and, in the worst case, compares each of them with all the solutions a ∈ A. Thus, the complexity of the archiver is O(|P||A|) [44]. In addition, ArchiveUpdatePQ,εDxy has a maximum archive size |A(Dxy)| [19], given by the product |A(Dxy)| = |A(Dxy)^x| · |A(Dxy)^y|, where |A(Dxy)^x| is the maximum number of neighborhoods that the decision space can contain (based on ∆_x) and |A(Dxy)^y| is the maximum number of solutions that can exist in each neighborhood (based on ∆_y and ε). In their definitions, x_j^min and x_j^max are the bounds in the decision space, and M_j and m_j are the bounds in the objective space of the set to discretize, f(P_Q,ε+2∆y). It is also assumed that every ε_i is greater than ∆_y.
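The update rule described above can be sketched for a single candidate p as follows. This is a simplified sketch: the auxiliary set Â is taken here to be the non-dominated solutions of the archive (an assumption about its exact construction in [19]), plain Euclidean distances replace d_H, and all names are illustrative:

```python
# Simplified sketch of one ArchiveUpdatePQ,eDxy update step for candidate p.
import math

def dominates(f2, f1):
    return (all(a <= b for a, b in zip(f2, f1))
            and any(a < b for a, b in zip(f2, f1)))

def eps_dominates(f2, f1, eps):
    return (all(a + e <= b for a, e, b in zip(f2, eps, f1))
            and any(a + e < b for a, e, b in zip(f2, eps, f1)))

def update_pq_eps_dxy(archive, p, f, eps, dx, dy):
    """archive: list of decision vectors; p: candidate; f: objective function."""
    fp = f(p)
    # (1) reject p if some archive member -eps-dominates it (not nearly optimal)
    if any(eps_dominates(f(a), fp, eps) for a in archive):
        return archive
    # (2) reject p if a member is similar in BOTH spaces simultaneously
    for a in archive:
        if math.dist(a, p) < dx and math.dist(f(a), fp) < dy:
            return archive
    archive = archive + [p]
    # prune: drop members -(eps+dy)-dominated by p and far from A_hat
    eps_dy = [e + dy for e in eps]
    a_hat = [a for a in archive
             if not any(dominates(f(b), f(a)) for b in archive)]
    def far(a):
        return min(math.dist(a, h) for h in a_hat) >= 2 * dx
    return [a for a in archive
            if not (eps_dominates(fp, f(a), eps_dy) and far(a))]
```

Calling this routine once per candidate in P reproduces the O(|P||A|) behavior discussed above.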

Archive_nevMOGA
Archive_nevMOGA is the archiving strategy used by the nevMOGA evolutionary algorithm [29]. Just like ArchiveUpdatePQ,εDxy, this archiver aims to keep solutions that obtain performance similar to the optima but are significantly different in the decision space. The archiver Archive_nevMOGA uses the same three parameters as ArchiveUpdatePQ,εDxy (ε, ∆_x, ∆_y). However, some parameters are defined differently with respect to ArchiveUpdatePQ,εDxy: (1) ∆_x is a vector that contains the maximum distances (in the decision space) between similar solutions for each dimension; thus, two individuals a and b are similar in the decision space if they are within ∆_x in every dimension. (2) ∆_y is also a vector, containing the maximum distances (in the objective space) between solutions with similar performance for each dimension; thus, two individuals a and b have similar performance if they are within ∆_y in every objective. The archiver Archive_nevMOGA adds a new candidate solution p to the archive A if the following conditions are met simultaneously: (1) p is a nearly optimal solution; (2) no solution in A similar to p in the decision space dominates it; and (3) no solution in A is similar to p in both spaces at the same time (if such a solution exists and p dominates it, it is replaced). If a solution p is incorporated into the archive A, the archiver removes from A: (1) the individuals similar to p in the parameter space that are dominated by p; and (2) the individuals −ε-dominated by p.
The complexity of the archiver Archive_nevMOGA is equivalent to the complexity of ArchiveUpdatePQ,εDxy defined previously (O(|P||A|)). Moreover, Archive_nevMOGA has a maximum archive size |A(nMOGA)|, given by the product |A(nMOGA)| = |A(nMOGA)^x| · |A(nMOGA)^y|, where |A(nMOGA)^x| is the maximum number of neighborhoods that the decision space can contain (based on ∆_x) and |A(nMOGA)^y| is the maximum number of solutions that can exist in each neighborhood (based on ∆_y and ε). In their definitions, n_box_i = (M_i − m_i)/∆y_i, n_box_max = max_i n_box_i, and M_j and m_j are the bounds in the objective space of the set to discretize, f(P_Q,ε). The archive size with respect to the decision space is equivalent for the compared archivers (|A(Dxy)^x| and |A(nMOGA)^x|). However, there is a difference between |A(Dxy)^y| and |A(nMOGA)^y| (objective space). The archiver Archive_nevMOGA, unlike ArchiveUpdatePQ,εDxy, discards nearly optimal solutions dominated by a similar solution in the decision space. Thus, in the worst case, Archive_nevMOGA obtains the best (non-dominated) solutions in each neighborhood, whose maximum number depends on ∆_y. ArchiveUpdatePQ,εDxy, in the worst case, obtains in each neighborhood, in addition to the best solutions, further solutions that are dominated by neighboring solutions but considered to have different performance (based on ∆_y). As a result, the archive of Archive_nevMOGA will contain fewer solutions than the archive of ArchiveUpdatePQ,εDxy. Therefore, we can deduce that the archiver Archive_nevMOGA has a lower computational cost than ArchiveUpdatePQ,εDxy, because its archive contains fewer solutions (the candidate solutions are compared with a smaller number of archived solutions).
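The acceptance and pruning rules of Archive_nevMOGA described above can be sketched as follows (a simplified sketch: the per-dimension similarity tests follow the vector definitions of ∆_x and ∆_y, and all names are illustrative):

```python
# Simplified sketch of one Archive_nevMOGA update step for candidate p.
# dx and dy are per-dimension similarity vectors, as described in the text.

def dominates(f2, f1):
    return (all(a <= b for a, b in zip(f2, f1))
            and any(a < b for a, b in zip(f2, f1)))

def eps_dominates(f2, f1, eps):
    return (all(a + e <= b for a, e, b in zip(f2, eps, f1))
            and any(a + e < b for a, e, b in zip(f2, eps, f1)))

def similar(u, v, delta):
    # similar if close in every dimension (per-dimension thresholds)
    return all(abs(a - b) <= d for a, b, d in zip(u, v, delta))

def update_nevmoga(archive, p, f, eps, dx, dy):
    fp = f(p)
    if any(eps_dominates(f(a), fp, eps) for a in archive):
        return archive                     # (1) p is not nearly optimal
    if any(similar(a, p, dx) and dominates(f(a), fp) for a in archive):
        return archive                     # (2) a similar neighbour dominates p
    # (3) members similar to p in BOTH spaces block p, unless p dominates them
    twins = [a for a in archive if similar(a, p, dx) and similar(f(a), fp, dy)]
    if twins and not all(dominates(fp, f(a)) for a in twins):
        return archive
    keep = [a for a in archive
            if a not in twins
            and not (similar(a, p, dx) and dominates(fp, f(a)))
            and not eps_dominates(fp, f(a), eps)]
    return keep + [p]
```

Compared with the previous sketch, the extra test in step (2) is what removes dominated neighbors and keeps the archive smaller.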

targetSelect
targetSelect is the archiving strategy used by the DIOP evolutionary algorithm [10]. This archiver seeks to obtain a diverse set of solutions (keeping solutions close to the Pareto set) in the decision and objective spaces. A = targetSelect(F, T, µ_t, ε) has as inputs: an approximation to the Pareto front F, the set of solutions to be analyzed T, the target set size µ_t, and ε. This archiver selects µ_t solutions from the set T. The goal is to find the population A, of size µ_t, that maximizes G(T) (see Equation (13)). G(T) is defined as the weighted sum of a set of metrics.
where D_o and D_d are metrics that measure diversity in the objective and decision spaces, respectively, and q_F is a distance metric (see Equation (14)). D_o is an indicator that measures diversity and convergence to the Pareto front, and D_d is an indicator that measures diversity in the decision space. In this work, as in [10], D_o and D_d were specified by the hypervolume indicator [45] and the Solow-Polasky diversity measure [46], respectively. An advantage of the archiver targetSelect is that the archive size can be specified directly and arbitrarily.
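The selection step can be sketched with a greedy heuristic. This is a deliberately simplified sketch: both D_o and D_d are replaced here by a cheap nearest-neighbour spread measure, whereas DIOP uses the hypervolume and the Solow-Polasky measure, and the q_F term is omitted; all names are illustrative:

```python
# Greedy sketch of a targetSelect-style selection: pick mu_t solutions from T
# maximizing a weighted sum of diversity in the objective and decision spaces.
import math

def spread(points):
    # sum of nearest-neighbour distances: larger = more spread out
    if len(points) < 2:
        return 0.0
    return sum(min(math.dist(p, q) for q in points if q is not p)
               for p in points)

def target_select(T, f, mu_t, w_o):
    """T: candidate decision vectors; f: objective function; w_o: objective-space weight."""
    selected = [T[0]]
    rest = list(T[1:])
    while len(selected) < mu_t and rest:
        def gain(c):
            cand = selected + [c]
            return (w_o * spread([f(x) for x in cand])      # stand-in for D_o
                    + (1 - w_o) * spread(cand))             # stand-in for D_d
        best = max(rest, key=gain)
        rest.remove(best)
        selected.append(best)
    return selected
```

The weight w_o plays the role of the ω_o parameter analyzed in the experiments below: w_o close to 1 emphasizes objective-space diversity, while smaller values favor spread in the decision space.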

Materials and Methods
In this section, the MOOPs, on which the three archivers will be compared, will be defined. In addition, the methodology for carrying out the comparison is introduced.

Definition of MOOPs
The archivers will be compared on two benchmarks and a real engineering example. Benchmarks have a very low computational cost for the objective function. For this reason, it is inexpensive to obtain a target set H (with a very fine discretization in the decision space). This discrete set is necessary for the use of the metric ∆_p (see Section 3). Furthermore, the definition of this set must be as "fair" as possible. Therefore, for the benchmarks, the target set H is defined as the local and global Pareto sets (see the justification in Section 3). This set is obtained by discretizing the decision space with 10,000,000 solutions within the parameter ranges. The target set H also has its representation in the objective space. In the real engineering example, obtaining a set H would be computationally very expensive (or even unaffordable), so this set is not defined for that MOOP. These problems make it possible to analyze the behavior of the archivers under different MOOP characteristics: multimodal solutions, local Pareto sets, or discontinuous Pareto fronts. However, there are other features of MOOPs that are not analyzed in this article (such as problems with many objectives and/or decision variables).

Benchmark 1
Benchmark 1 (see Equation (15)) is a test problem called SYM-PART, defined in [47] and widely used in the literature [18,20,[48][49][50] for the evaluation of algorithms that characterize nearly optimal or multimodal solutions. Benchmark 1 has the Pareto set located in a single neighborhood, and it also has eight local Pareto sets that overlap in the objective space (see Equation (20) and Figure 3). Thus, this benchmark is very useful to observe whether the compared archivers can adequately characterize the nine existing neighborhoods and provide all the existing diversity to the DM.
where

t_1 = sgn(x_1) · min(⌈(|x_1| − a − c/2)/(2a + c)⌉, 1)

and

δ_t = 0 for t_1 = 0 and t_2 = 0; δ_t = 0.1 otherwise,

subject to the bounds of the decision space, using a = 0.5, b = 5 and c = 5. This MOOP contains one global Pareto set, as well as eight local Pareto sets (see Equation (20)). Therefore, the target set H is defined as the union of the global Pareto set and the eight local Pareto sets. To evaluate the archivers on benchmark 1, the parameters are defined in Table 1. The parameters ε, ∆_x and ∆_y are defined based on prior knowledge of the problem. The targetSelect archiver is indicator-based, and therefore has different parameters from the rest of the compared archivers. For the choice of the parameter µ_t, based on the size of the archives obtained by the rest of the archivers, the following values were analyzed: µ_t = {100, 75, 50}. For the choice of the ω_o parameter, the values suggested in [10] were analyzed: ω_o = {0, 0.7692, 0.9091, 0.9677, 1}. Among all these values, µ_t = 100 and ω_o = 0.9677 were selected as the parameters that obtain the best performance, with respect to ∆_p, for the uniform dispersion. For random dispersion, ω_o = 0.7692 obtains the best performance.
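For reference, the standard SYM-PART function from [47] can be sketched as follows. This sketch omits the δ_t modification of benchmark 1, the t_2 expression is assumed to mirror t_1 with tile size b, and the parameter values follow the text:

```python
# Sketch of the standard SYM-PART test function: nine "tiles" in the decision
# space whose Pareto subsets all map onto the same front in the objective space.
import math

def sym_part(x1, x2, a=0.5, b=5.0, c=5.0):
    # tile indices: which of the 3x3 neighbourhoods the point falls in
    t1 = math.copysign(1, x1) * min(math.ceil((abs(x1) - a - c / 2) / (2 * a + c)), 1)
    t2 = math.copysign(1, x2) * min(math.ceil((abs(x2) - b / 2) / b), 1)
    # translate the point back to the central tile
    u = x1 - t1 * (c + 2 * a)
    v = x2 - t2 * b
    return (u + a) ** 2 + v ** 2, (u - a) ** 2 + v ** 2
```

Points occupying the same position in different tiles obtain identical objective values, which is precisely the multimodality the archivers must preserve.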

Benchmark 2
Benchmark 2 (see Equation (21)) is an adaptation of the modified Rastrigin benchmark [51–54]. Figure 4 shows the global and local Pareto set H of benchmark 2. This benchmark has a discontinuous Pareto front made up of solutions in different neighborhoods. In addition, it also provides nearly optimal solutions significantly different from the optimal solutions (located in different neighborhoods).
where k1 = 2 and k2 = 3. To analyze the archivers on benchmark 2, the parameters of Table 2 are defined. The parameters ϵ, ∆x and ∆y are defined based on prior knowledge of the problem. Following the same procedure as for the previous benchmark, µt = 150 (in this case µt = {200, 150, 100} has been analyzed) and ωo = 0.9677 are defined as the parameters that obtain the best performance, with respect to ∆p, for the uniform dispersion. For the random dispersion, ωo = 0.9091 obtains the best performance.

Identification of a Thermal System
Finally, a MOOP is defined to solve a real engineering problem: the identification of a thermal system. In this process, the energy contribution is due to the power dissipated by a resistance inside it (see Figure 5). Air circulation inside the process is produced by a fan, which constantly introduces air from outside. The actuator is a voltage-controlled voltage source with an input range of [0 100]% ([0 7.5] V). Two thermocouples measure the resistance temperature and the air temperature in the range [−50 250] °C. Figure 6 shows the signals used in the identification process. The ambient temperature Ta is considered constant and equal to 17 °C for the entire identification test. Taking into account the physical laws of thermodynamics [55], the initial structure of the model can be defined using the differential equations of Equation (24), where heat losses due to convection and conduction are modeled, as well as losses due to radiation. In Equation (24), T(t) is the process output temperature and state variable in °C, v(t) is the input voltage to the process in volts, Ta(t) is the ambient temperature in °C, and x = [x1 x2 x3] are the parameters of the model to estimate. The MOOP is then defined, subject to the following: τ = 2500 is the duration of the identification test; variables with a circumflex accent are process outputs (experimental data); variables without a circumflex accent are the model outputs; x is the parameter vector; and its lower and upper limits (see Table 3) define the decision space Q. In this MOOP, the design objectives measure the mean and maximum error between the process and model outlet temperatures. The parameters to be estimated and the design objectives have a physical meaning. This makes it easier to choose ϵ, ∆x and ∆y (see Table 4). The parameter ϵ (maximum acceptable degradation) is the same for all archivers.
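Equation (24) is not reproduced in this extraction, so the following is only a hedged sketch of what such an identification model could look like: an assumed first-order thermal balance with convective/conductive and radiative loss terms, integrated with forward Euler, plus the two design objectives (mean and maximum error) stated in the text. The model form, function names, and time step are all assumptions:

```python
def simulate(x, v_seq, T0, Ta=17.0, dt=1.0):
    """Forward-Euler simulation of an ASSUMED first-order thermal model:
        dT/dt = -x1*(T - Ta) - x2*((T+273)^4 - (Ta+273)^4) + x3*v^2
    (illustrative only; the paper's Equation (24) may differ in form)."""
    x1, x2, x3 = x
    T, out = T0, []
    for v in v_seq:
        dT = (-x1 * (T - Ta)
              - x2 * ((T + 273.0) ** 4 - (Ta + 273.0) ** 4)
              + x3 * v ** 2)
        T += dt * dT
        out.append(T)
    return out

def objectives(x, v_seq, T_meas, T0):
    # J1: mean absolute error, J2: maximum absolute error between the
    # measured and simulated temperatures (as stated in the text)
    T_mod = simulate(x, v_seq, T0)
    errs = [abs(tm - th) for tm, th in zip(T_mod, T_meas)]
    return sum(errs) / len(errs), max(errs)
```

A candidate parameter vector x would be evaluated by simulating the model over the τ = 2500 s identification test and computing the two error objectives against the experimental data.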
∆y is similar for ArchiveUpdatePQ,ϵDxy and Archive_nevMOGA, taking into account the difference in vector size. However, ∆x differs between them. For ArchiveUpdatePQ,ϵDxy, ∆x = 0.1. A lower value produces a significantly higher number of solutions, while a higher value gives a poor approximation to the set of optimal and nearly optimal solutions. For Archive_nevMOGA, ∆x = [0.0015 0.2 0.1]. In this case, ∆x is independent for each dimension of the decision space, differing for each parameter due to its different limits (see Table 3). For targetSelect, µt = 78 solutions, to obtain the same number of solutions as Archive_nevMOGA. In this way, both archivers compete under equal conditions. Additionally, ωo = 0.7692 is defined to give greater weight to diversity in the decision space.

Archivers Comparison Procedure
The procedure used to carry out the archiver comparison differs between the benchmarks and the real example. The benchmarks have a very low computational cost per objective function evaluation. Therefore, it is possible to evaluate large populations that discretize the entire search space. These populations are fed to the archivers as input populations. To analyze the behavior of the archivers on different types of populations, both uniform and random distributions are used to obtain the initial populations.
Because the computational cost of the objective functions of the engineering problem is significantly higher, it is not feasible to test each archiver with random or exhaustive searches as was done with the benchmarks. Thus, each archiver has been embedded in a basic evolutionary algorithm. In addition to reducing the computational cost, this enables observing the impact of each archiver when incorporated into an evolutionary mechanism.

Benchmarks
For the comparison of the archivers on the two benchmarks presented, the archivers are fed in two ways: (1) by a uniform distribution and (2) by a random distribution of solutions in the search space. The comparison of the results is made using the averaged Hausdorff distance ∆p [34] (see Equation (4)). This metric measures the averaged Hausdorff distance between the outcome set (or final archive) f(A) and the target set f(H) (the local and global Pareto sets in this paper) of a given MOOP. Since ∆p averages the distances between the entries of f(A) and f(H), the indicator is largely insensitive to outliers. It measures both diversity and convergence towards the target set, and it can be applied in both the decision and objective spaces.
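A minimal sketch of the averaged Hausdorff distance ∆p, following its usual definition as the maximum of the averaged generational distance GDp and its inverted counterpart IGDp (assumed to match Equation (4)):

```python
def dist(u, v):
    # Euclidean distance between two points of equal dimension
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def gd_p(A, H, p=2):
    # Averaged generational distance: p-mean of the distances from each
    # member of A to its nearest member of H
    return (sum(min(dist(a, h) for h in H) ** p for a in A) / len(A)) ** (1 / p)

def delta_p(A, H, p=2):
    # Averaged Hausdorff distance (Schutze et al.): max of GD_p and IGD_p,
    # where IGD_p is GD_p with the roles of A and H swapped
    return max(gd_p(A, H, p), gd_p(H, A, p))
```

Because both directions are averaged over set sizes, a single outlier in A or H shifts ∆p only slightly, unlike the classical (worst-case) Hausdorff distance.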
To carry out the comparison of the archiving strategies with data from a uniform dispersion, each archiver is fed with an input population of 100,000 solutions uniformly distributed throughout the domain Q (generating a hypergrid). These solutions are introduced in random order. This process is repeated with 25 different populations, where each one slightly displaces the generated hypergrid vertically and/or horizontally in the search space.
To carry out the analysis with data from a random dispersion, each archiver is fed with an input population of 100,000 solutions randomly distributed throughout the domain Q. This process is repeated with 25 different populations to mitigate the effect of the random component.
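The two ways of generating the input populations can be sketched as follows (hypothetical helpers; the grid-displacement scheme is an assumption consistent with the description above):

```python
import itertools
import random

def uniform_grid(bounds, n_per_dim, rng):
    # Hypergrid over the box Q, displaced by a random fraction of one cell
    # in each dimension (so repeated runs shift the grid, as in the text)
    steps = [(hi - lo) / n_per_dim for lo, hi in bounds]
    offsets = [rng.random() * s for s in steps]
    axes = [[lo + off + i * s for i in range(n_per_dim)]
            for (lo, hi), s, off in zip(bounds, steps, offsets)]
    pop = [list(p) for p in itertools.product(*axes)]
    rng.shuffle(pop)  # solutions are fed to the archiver in random order
    return pop

def random_population(bounds, n, rng):
    # Uniformly random sampling of the box Q
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
```

Feeding each archiver 25 such populations (shifted grids or fresh random draws) gives the 25 repetitions over which the boxplots and median results in the next section are computed.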

Identification of Thermal System
In this example, we analyze the different archivers within a generic multi-objective optimization algorithm [23,34] (see Algorithm 2). This algorithm generates the initial population randomly with NindP0 individuals, obtaining the initial archive A0 (through the archiver). Subsequently, in each iteration, new solutions are generated and the archive At is updated (using the selected archiver). The Generate() function generates new individuals in each iteration. To do this, two solutions are randomly selected from the current archive At−1, and a random number u ∈ [0 1] is generated. If u > Pcm (probability of crossover/mutation), a crossover is performed, generating two new solutions using the simulated binary crossover (SBX [56]) technique. If u ≤ Pcm, a mutation is performed, generating two solutions through polynomial mutation [43]. In this way, the three archivers are compared using the same evolutionary strategy. For this example, an initial population of 500 individuals (NindP0 = 500), a crossover/mutation probability of 0.2 (Pcm = 0.2), and 5000 generations are used. Therefore, 10,500 solutions are evaluated (500 + 2 × 5000) for each archiver.
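Algorithm 2's Generate() step, as described above, can be sketched as follows. This is a simplified reconstruction: SBX and polynomial mutation in their textbook forms, with assumed distribution indices, not the authors' exact implementation:

```python
import random

def sbx(p1, p2, bounds, eta=15, rng=random):
    # Simulated binary crossover (Deb & Agrawal): two children per parent pair
    c1, c2 = [], []
    for a, b, (lo, hi) in zip(p1, p2, bounds):
        u = rng.random()
        beta = ((2 * u) ** (1 / (eta + 1)) if u <= 0.5
                else (1 / (2 * (1 - u))) ** (1 / (eta + 1)))
        x1 = 0.5 * ((1 + beta) * a + (1 - beta) * b)
        x2 = 0.5 * ((1 - beta) * a + (1 + beta) * b)
        c1.append(min(max(x1, lo), hi))  # clamp to the decision space Q
        c2.append(min(max(x2, lo), hi))
    return c1, c2

def poly_mutation(p, bounds, eta=20, rng=random):
    # Polynomial mutation, applied to every gene for simplicity
    child = []
    for v, (lo, hi) in zip(p, bounds):
        u = rng.random()
        d = ((2 * u) ** (1 / (eta + 1)) - 1 if u < 0.5
             else 1 - (2 * (1 - u)) ** (1 / (eta + 1)))
        child.append(min(max(v + d * (hi - lo), lo), hi))
    return child

def generate(archive, bounds, pcm=0.2, rng=random):
    # Generate() as described in the text: pick two parents at random from
    # the current archive; SBX crossover if u > Pcm, otherwise mutate both
    p1, p2 = rng.sample(archive, 2)
    if rng.random() > pcm:
        return sbx(p1, p2, bounds, rng=rng)
    return poly_mutation(p1, bounds, rng=rng), poly_mutation(p2, bounds, rng=rng)
```

Each generation thus produces exactly two new solutions, which is why 5000 generations plus the initial 500 individuals give 10,500 evaluations in total.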

Results and Discussion
This section shows the results obtained on the two benchmarks and the real example previously introduced.

Benchmark 1 with Uniform Dispersion
The archivers are tested on 25 different input populations obtained by uniform dispersion. Figure 7 shows the median result (with respect to ∆p(A, H) in the decision space) of the archive A on benchmark 1 for both the decision and objective spaces. As can be seen, the archivers make a good approximation to the target set H, characterizing the nine neighborhoods that compose it. However, there are differences between the sets found by the archivers. First, the Archive_nevMOGA archiver obtains fewer solutions. The number of solutions µt = 100 for targetSelect is user-defined, but a smaller size makes ∆p worse in both spaces. The archive A obtained by ArchiveUpdatePQ,ϵDxy contains a larger number of solutions.
The targetSelect archiver obtains a better approximation to the Pareto front than the other archivers. This is because the weight ωo has a high value, giving greater weight to the Do indicator, which measures convergence and diversity with respect to the Pareto front. The ArchiveUpdatePQ,ϵDxy and Archive_nevMOGA archivers do not select a candidate solution p (even if p belongs to the Pareto set) if an alternative already exists in the current archive that is similar in both spaces (in Archive_nevMOGA, p is selected if it dominates the similar solution). These archivers could obtain a better approximation to the Pareto front by reducing the parameter ∆y (the parameter that determines the degree of similarity in the objective space), but this would probably also yield a greater number of solutions.
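The acceptance rule just described can be sketched as follows. This is an illustrative reconstruction, not the exact Algorithm 1 or Algorithm 3; the componentwise similarity thresholds and the nevMOGA-style replacement are modeled from the text:

```python
def similar(a, b, thresholds):
    # Componentwise similarity: |a_i - b_i| <= threshold_i in every dimension
    return all(abs(x - y) <= t for x, y, t in zip(a, b, thresholds))

def dominates(fa, fb):
    # Pareto dominance for minimization
    return all(x <= y for x, y in zip(fa, fb)) and any(x < y for x, y in zip(fa, fb))

def accept(p, fp, archive, dx, dy):
    """Sketch of a distance-based acceptance test: reject candidate p when
    some archived solution is similar in decision space (dx) AND objective
    space (dy), unless p dominates that similar solution, in which case it
    replaces it (nevMOGA-style behavior)."""
    for (a, fa) in archive:
        if similar(p, a, dx) and similar(fp, fa, dy):
            if dominates(fp, fa):
                archive.remove((a, fa))   # replace the similar, dominated entry
                archive.append((p, fp))
                return True
            return False
    archive.append((p, fp))
    return True
```

This illustrates the trade-off discussed above: shrinking dy (or ∆y) makes fewer candidates count as "similar," improving the Pareto-front approximation at the cost of a larger archive.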
Regarding the local Pareto sets, Archive_nevMOGA obtains the best approximation in the comparison. ArchiveUpdatePQ,ϵDxy and targetSelect obtain solutions in all neighborhoods where nearly optimal solutions exist. However, these solutions are rarely located on the lines that define the local Pareto sets.
Notice that the ArchiveUpdatePQ,ϵDxy and Archive_nevMOGA archivers can retain ϵ-dominated solutions. In ArchiveUpdatePQ,ϵDxy, a solution in A may no longer be nearly optimal after the appearance of a new candidate solution p, yet it may not be removed because it does not satisfy the condition in line 7 of Algorithm 1. In Archive_nevMOGA, a new candidate solution p can be added to the archive A through the condition in line 8 of Algorithm 3. In some cases, solutions that are no longer nearly optimal due to the appearance of p (by line 8 of Algorithm 3) are not eliminated. Therefore, the archive A obtained by both archivers may contain solutions that do not belong to the nearly optimal set. Archive_nevMOGA achieves a better approximation in the decision and objective spaces and obtains fewer solutions. targetSelect also obtains a better approximation in both spaces, with fewer solutions than ArchiveUpdatePQ,ϵDxy. ArchiveUpdatePQ,ϵDxy shows greater variability among the 25 archives obtained. Therefore, Archive_nevMOGA best achieves the two main objectives: not neglecting the diversity of solutions (it locates all nine neighborhoods) and obtaining a reduced number of solutions (simplifying the optimization and decision stages).

Benchmark 1 with Random Dispersion
The archivers are tested on 25 different input populations obtained by random dispersion. Figure 9 shows the archive A with the median result for ∆p(A, H). The archivers characterize the nine neighborhoods that form the target set H. The archive A obtained by Archive_nevMOGA has a smaller number of solutions. Decreasing the number of solutions µt ≤ 100 for targetSelect makes ∆p worse in both spaces. Comparing Figures 7 and 9, targetSelect with random dispersion produces a worse approximation of the Pareto front than with uniform dispersion. This is for two reasons: (1) the lower value of the weight ωo = 0.7692 (lower weight on the metric Do, which measures convergence to the Pareto front) and (2) the initial population has been obtained randomly (meaning certain areas have not been adequately explored). Figure 10 shows the boxplot of the 25 archives obtained in the tests. Archive_nevMOGA obtains better results in both spaces, while also obtaining a smaller number of solutions. On benchmark 1, the approximations obtained by the archivers with random dispersion are slightly worse than with uniform dispersion.

Benchmark 2 with Uniform Dispersion
The archive A with the median result (in the decision space) for each archiver is shown in Figure 11. The archivers locate the neighborhoods where nearly optimal solutions are found. The archive of Archive_nevMOGA again contains fewer solutions. Keep in mind that decreasing µt = 150 for targetSelect causes a considerable increase in the variability of the results ∆p(A, H) over the 25 tests. targetSelect obtains a better approximation to the Pareto front due to the high value of the weight ωo. However, Archive_nevMOGA obtains solutions closer to the local Pareto sets. This is because targetSelect seeks to achieve the greatest diversity in the decision space (through Dd) without taking into account whether these solutions are worse than a nearby solution (if they are not optimal). Figure 12 shows the boxplot of the 25 archives obtained for the archivers, for the indicator ∆p in the decision and objective spaces, and the archive size. Regarding the decision space, Archive_nevMOGA obtains a better approximation to the target set than its competitors. Regarding the objective space, Archive_nevMOGA and targetSelect obtain a similar minimum value; however, targetSelect shows worse variability. Therefore, as with benchmark 1, Archive_nevMOGA achieves a better approximation in both spaces and obtains a smaller number of solutions.

Benchmark 2 with Random Dispersion
Figure 13 shows the archive A that obtains the median result for ∆p(A, H) for the archivers. Again, Archive_nevMOGA obtains significantly fewer solutions while also obtaining a better characterization of the target set H. Figure 14 shows the boxplot for the archivers. Archive_nevMOGA obtains a better value of ∆p in both spaces. Regarding the objective space, targetSelect obtains results similar to Archive_nevMOGA but with worse variability. Archive_nevMOGA obtains fewer solutions, which simplifies the optimization and decision stages.
With the random dispersion, ArchiveUpdatePQ,ϵDxy and Archive_nevMOGA perform slightly worse than with the uniform dispersion, while targetSelect obtains slightly better results. Figure 15 shows the final archive A obtained when using each of the three archivers inside the basic optimization algorithm used for comparison. In this example, the obtained archives are compared in pairs to better observe the differences between them.

Identification of a Thermal System
The three archivers obtain diversity in the decision space and convergence to the Pareto front. The first thing that stands out is the large number of solutions (610) obtained by ArchiveUpdatePQ,ϵDxy. This high number of solutions complicates the optimization and decision phases. In each iteration, the newly generated solutions must be compared with the solutions in the current archive A(t); therefore, many solutions in A(t) imply a higher computational cost. In addition, a high number of solutions makes the DM's final decision more difficult. This number of solutions can be reduced by increasing the parameters ∆x and ∆y; however, this increase also implies a worse discretization of both spaces. ArchiveUpdatePQ,ϵDxy also obtains a worse approximation to the Pareto front.
The results of the other two archivers show more similarities. Archive_nevMOGA obtains 78 solutions. To compare under similar conditions, the size of the archive obtained by targetSelect is set to 78 solutions (µt = 78). In this way, both archivers obtain the same number of solutions, yet the sets of solutions they find are different. With respect to the Pareto front, both archivers achieve a good approximation in the range f1(x) = [0.3 0.8]. However, Archive_nevMOGA does not obtain solutions in the range f1(x) = [0.8 0.9] of the Pareto front. Therefore, in this example, targetSelect obtains slightly more diversity on the Pareto front.
targetSelect focuses on obtaining, in addition to good convergence to the Pareto front, the greatest diversity in the decision space (using Dd; see Section 4.3). However, the solutions that provide greater diversity may perform worse than neighboring solutions. For example, Figure 15 shows the solution x1 obtained by Archive_nevMOGA. This solution has parameters similar to those of the neighborhood 1 solutions (see the decision space), and f(x1) performs better than all the neighborhood 1 solutions obtained by targetSelect. Therefore, Archive_nevMOGA would eliminate all these solutions (dominated in their neighborhood by x1), whereas targetSelect keeps them because they increase the diversity in the decision space. This happens repeatedly in this MOOP. For this reason, targetSelect obtains nearly optimal solutions farther from the Pareto front than those obtained by Archive_nevMOGA. These solutions lie on the contour/ends of the plane formed by the optimal and nearly optimal solutions in the decision space, and therefore they obtain better diversity under the Dd indicator. Archive_nevMOGA could find solutions closer to the contour/ends of this plane (as targetSelect does) by reducing the parameter ∆x, although this would imply obtaining a larger number of solutions. Therefore, depending on the needs or preferences of the DM, the use of one archiver or another may be more appropriate. These archivers can be embedded in most of the available multi-objective algorithms; the ArchiveUpdatePQ,ϵDxy, Archive_nevMOGA, and targetSelect archivers are currently built into the algorithms NSGA, nevMOGA, and DIOP, respectively.
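The behavior attributed to Dd can be illustrated with a generic decision-space diversity selection. Greedy farthest-point selection is used here as an assumed proxy, not DIOP's exact indicator: extreme/contour points maximize pairwise distances and are therefore retained, even when a nearby solution performs better:

```python
def dist(u, v):
    # Euclidean distance in decision space
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def greedy_diverse_subset(candidates, k):
    """Greedy farthest-point selection of k solutions: start from the first
    candidate, then repeatedly add the candidate farthest from the current
    selection. A common proxy for decision-space diversity indicators."""
    S = [candidates[0]]
    while len(S) < k:
        best = max((c for c in candidates if c not in S),
                   key=lambda c: min(dist(c, s) for s in S))
        S.append(best)
    return S
```

Note that the selection criterion never consults objective values, which is exactly why a diversity-driven archiver can keep locally dominated solutions that a neighborhood-dominance archiver like Archive_nevMOGA would discard.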

Conclusions
In this paper, the characterization of the potentially useful nearly optimal solutions in a MOOP has been addressed. In this type of problem, in practice, the DM may wish to obtain nearly optimal solutions, since they can play a relevant role in the decision-making stage. However, an adequate approximation to this set is necessary to avoid an excessive number of alternatives that could hinder the optimization and decision-making stages. Not all nearly optimal solutions provide the same useful information to the DM. To reduce the number of solutions to be analyzed, we consider as potentially useful those solutions (in addition to the optimal ones) that are close to the optimal solutions in the objective space but differ significantly in the decision space. To adequately characterize this set, it is necessary to discretize the nearly optimal solutions by analyzing the decision and objective spaces simultaneously.
This article compares three archiving strategies that perform this task: ArchiveUpdatePQ,ϵDxy, Archive_nevMOGA, and targetSelect. The main objective of these archivers is to obtain the potentially useful solutions. This analysis is of great help to designers of evolutionary algorithms who wish to obtain such solutions, giving them more information to choose an archiver based on their preferences. ArchiveUpdatePQ,ϵDxy and Archive_nevMOGA are distance-based archivers: both simultaneously discard solutions that are similar in the decision and objective spaces. However, Archive_nevMOGA, in contrast to ArchiveUpdatePQ,ϵDxy, also discards solutions dominated by a neighboring solution in the decision space. targetSelect is an indicator-based archiver that measures diversity in both spaces simultaneously. Unlike the other archivers, targetSelect can directly and arbitrarily specify the archive size, which can be an advantage. The archivers are evaluated using two benchmarks, where they obtain a good approximation to the set of potentially useful solutions, characterizing the diversity existing in the set of nearly optimal solutions. As discussed in [19], the ArchiveUpdatePQ,ϵDxy archiver is more practical than other archivers in the literature. However, as demonstrated in this paper, it obtains significantly more solutions than its competitors, which can make the optimization and decision phases more difficult. Furthermore, Archive_nevMOGA obtains a better approximation to the target set H under the averaged Hausdorff distance ∆p, as well as a smaller number of solutions, which speeds up the optimization process and facilitates the decision-making stage. However, fewer solutions can also decrease diversity (which can degrade global search capabilities).
Finally, the compared archivers are analyzed on a real engineering example. This real example is the identification of a thermal system. To carry out this analysis, a generic multi-objective optimization algorithm is used, in which it is possible to select different archivers. This enables a more realistic comparison of the impact of the archivers on the entire optimization process.
The three archivers obtain the diversity existing in the set of optimal and nearly optimal solutions. In this last example, the archive obtained by ArchiveUpdatePQ,ϵDxy contains a very high number of solutions, complicating the optimization and decision stages. Archive_nevMOGA and targetSelect obtain the same number of solutions, and both obtain an adequate Pareto front; however, targetSelect obtains more diversity on the Pareto front. The main difference between the two archivers is in the set of nearly optimal solutions. On the one hand, Archive_nevMOGA obtains solutions closer to the Pareto front but significantly different in the decision space. On the other hand, targetSelect obtains the solutions that provide the greatest diversity in the decision space, even though these solutions are farther from the Pareto front and, therefore, offer significantly worse performance.
Finally, this analysis suggests two possible future lines of research: (1) the design of new evolutionary algorithms that characterize the nearly optimal solutions using some of the archivers compared in this work, and (2) the design of new archivers that improve upon the current ones. For example, clustering techniques could improve the archivers compared in this work: although not analyzed here, they allow the location of new neighborhoods so that these can be explored and evaluated, potentially improving the optimization process.