In this section, we propose and discuss the novel archiver ArchiveUpdateHD. The distances as well as the Hausdorff approximations can be handled more accurately for two objectives, where we can assume the Pareto front to locally form a curve, so that the elements of the approximations can be arranged by sorting in objective space. We therefore first address the bi-objective case and afterwards consider the archiver for problems with more than two objectives.
3.1. The Bi-Objective Case
The pseudocode of ArchiveUpdateHD for bi-objective problems is shown in Algorithm 2. This archiver aims for approximations of the Pareto front of a given BOP in the Hausdorff sense (i.e., for solutions that are evenly spread along the Pareto front). The archiver is based on (i) the distances among the candidate solutions (lines 18–36 of Algorithm 2), (ii) “classical” dominance or elite preservation (lines 5 and 9), and (iii) the concept of ϵ-dominance (line 5). The archiver can roughly be divided into two parts: an acceptance strategy to decide if an incoming candidate solution p should be considered (line 5), and a pruning technique (mainly lines 18–36, but also lines 11–14), which is applied if the size of the archive has exceeded a predefined budget N of archive entries.
In the following, we will describe ArchiveUpdateHD as in Algorithm 2 in more detail. This algorithm contains several elements that have to be incorporated in order to guarantee convergence. After the convergence analysis (Theorem 1), we will discuss more practical realizations of the algorithm.
In line 5, it is decided if a candidate solution
p should be (at least temporarily) added to the existing archive
A. This is the case if (a) none of the entries
-dominates
p (
being a safety factor needed to guarantee convergence, see below for practical realizations), or if (b) none of the entries
dominates
p and for none of the entries
the distance
is less than or equal to
. Throughout this work,
denotes the Euclidean norm. We stress that this acceptance strategy is identical to the one of the archiver ArchiveUpdateTight2 [
14], which we will need for the upcoming convergence analysis.
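The acceptance test described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes minimization, an additive form of ϵ-dominance, and that the safety factor (called `theta` here, a hypothetical name) scales ϵ in condition (a). The function names are illustrative only.

```python
import numpy as np

def dominates(fa, fp):
    """Pareto dominance (minimization): fa is nowhere worse and somewhere better."""
    fa, fp = np.asarray(fa, float), np.asarray(fp, float)
    return bool(np.all(fa <= fp) and np.any(fa < fp))

def eps_dominates(fa, fp, eps):
    """Additive eps-dominance (assumed form): fa shifted by -eps dominates fp."""
    return dominates(np.asarray(fa, float) - eps, fp)

def accept(archive_f, fp, eps, delta, theta=1.0):
    """Return True if the image fp of a candidate should enter the archive.

    archive_f: iterable of objective vectors of the current archive entries.
    """
    fp = np.asarray(fp, float)
    # (a) no archive entry (theta * eps)-dominates the candidate
    if not any(eps_dominates(fa, fp, theta * eps) for fa in archive_f):
        return True
    # (b) no entry dominates the candidate, and every entry is farther
    #     away than delta in the Euclidean norm
    not_dominated = not any(dominates(fa, fp) for fa in archive_f)
    far_from_all = all(
        np.linalg.norm(np.asarray(fa, float) - fp) > delta for fa in archive_f
    )
    return not_dominated and far_from_all
```

With an empty archive, condition (a) holds trivially, so the first candidate is always accepted.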
If the candidate solution
p is accepted, it will be added to
A. Next, all other entries
dominated by
p will be discarded (lines 8–10). Hence, all archives generated by ArchiveUpdateHD only contain mutually non-dominated elements (elite preservation). If the distance
is larger than
for any of these dominated archive entries
a, a “reset” is executed for
and
:
is set to
(where
is another safety factor). Next,
and
are updated using this new minimal value. The idea behind this reset is as follows: if
and the distance of
and
is larger than
, then
p and
a could be located in different connected components of the set of (local) solutions of the bi-objective problem. Since the values of both
and
are determined by the length of the (known) Pareto front, their values have to be set back, since a “jump” to a new connected component may lead to a new length. See
Figure 2 for a hypothetical scenario. The value of
has to be (slightly) increased in each reset in order to avoid the possibility of cyclic behavior in the sequence of archives (which, in fact, has not been observed in our computations).
If
exceeds the predefined magnitude
N, it is decided in lines 18–36 which of the elements of
A has to be discarded (pruning). For
objectives, we can order all the entries of the archive (e.g., as done here, in ascending order with respect to objective
). Then, the vector
of distances can be simply computed via:
For an index
m chosen from
, either
or
is then removed from
A, which is done in lines 23–33. The aim of ArchiveUpdateHD is to maintain good approximations of the end points of the Pareto front. Accordingly,
, respectively,
, are always discarded instead of
and
, respectively (lines 23–26). The rationale behind the selection in lines 28–33 is to keep the archive of size
N with the most evenly distributed elements.
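The pruning step for two objectives can be illustrated as follows. This is a sketch under assumptions: distances are Euclidean distances between consecutive images after sorting by the first objective, the index m is taken as the position of the shortest gap, and the interior selection rule (dropping the end of the shortest gap whose other neighbor is closer) is one plausible reading of the "most evenly distributed" rationale, not necessarily the exact rule of Algorithm 2. The function name is hypothetical.

```python
import numpy as np

def prune_one_biobjective(F):
    """Remove one row from F (n x 2 objective vectors, n >= 3).

    Returns the remaining rows, sorted by the first objective.
    """
    F = np.asarray(F, float)
    F = F[np.argsort(F[:, 0])]                      # sort by first objective
    d = np.linalg.norm(np.diff(F, axis=0), axis=1)  # d[i] = ||F[i+1] - F[i]||
    m = int(np.argmin(d))                           # shortest gap: rows m, m+1
    n = len(F)
    if m == 0:
        drop = 1            # keep the first end point, drop the 2nd entry
    elif m == n - 2:
        drop = n - 2        # keep the last end point, drop the 2nd-to-last entry
    else:
        # assumed rule: drop the end of the gap whose other neighbor is closer
        drop = m if d[m - 1] <= d[m + 1] else m + 1
    return np.delete(F, drop, axis=0)
```

The two end points of the sorted archive are never removed in this sketch, which mirrors the aim of maintaining good approximations of the end points of the Pareto front.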
Algorithm 2 ArchiveUpdateHD |
Require: Problem (MOP), where , P: current population, : current archive, : current value of , : minimal value of , , : safety factors, N: upper bound for archive size |
Ensure: updated archive A, updated values for , , and |
1: |
2: |
3: |
4: for all do |
5: if then |
6: |
7: end if |
8: for all do |
9: if then |
10: |
11: if then ▹ reset and |
12: |
13: |
14: |
15: end if |
16: end if |
17: end for |
18: if then ▹ apply pruning |
19: |
20: |
21: sort A (e.g., according to ) |
22: compute as in (8) |
23: choose |
24: if then |
25: ▹ remove 2nd entry |
26: else if then |
27: ▹ remove 2nd but last entry |
28: else |
29: |
30: |
31: if then |
32: |
33: else |
34: |
35: end if |
36: end if |
37: end if |
38: end for |
39: return |
In the following, we investigate the limit behavior of ArchiveUpdateHD.
Theorem 1. Let (MOP) be given and be compact, and let there be no weak Pareto points in . Furthermore, let F be continuous and injective, and
Then, an application of Algorithm 1, where ArchiveUpdateHD (Algorithm 2) is used to update the archive, leads to a sequence of archives , where the following holds:
- (a) There exists a and such that
- (b) There exists with probability one a such that is a -tight ϵ-approximate Pareto front with respect to (MOP) for all , where .
- (c)
- (d) There exists a such that
Proof. We first show that during the run of the algorithm, only finitely many changes of the value of (and hence also of ) can occur. Since F is continuous and the domain is compact, also the image is compact, and hence, in particular bounded. ArchiveUpdateHD changes the value of in two cases: if (i) a reset of and is executed (line 12) or if (ii) the pruning technique is applied (line 19). In case of (i), the value of is increased by a constant factor . The value of after the i-th reset is hence equal to or larger than , where denotes the value of at the start of the algorithm. A reset is applied if the distance of the image of the candidate solution p to the image of an archive element a is larger than the current value of (line 11). Since is bounded, only a finite number of such resets can be applied during the run of the algorithm.
Case (ii) happens if the magnitude of the current archive is
. New candidate solutions
p are added to the archive in lines 5 and 6 and lines 9 and 10. Lines 9 and 10 describe a dominance replacement which does not increase the magnitude of the archive. Hence, such replacements do not lead to an application of the pruning. A candidate
p can be further added to the current archive
A if one of the following statements is true (line 5):
Since
is bounded, there exists for every
a (large enough)
so that
, where
. Similarly,
if
is large enough. Since in each pruning step, the value of
is increased by the factor of
and since only finitely many resets are executed, also only finitely many prunings can be applied during the run of the algorithm.
Note that ArchiveUpdateHD differs from ArchiveUpdateTight2 in two parts: the reset strategy (lines 11–15) and the pruning technique (lines 18–37), and that both these parts come with a change of the values of
and
. In other words, ArchiveUpdateHD is identical to ArchiveUpdateTight2 as long as no change in
and
occurs. For this case, we can hence apply the theoretical results on ArchiveUpdateTight2 for ArchiveUpdateHD. Now, consider a fixed value of
(and hence also
). During the run, it can either be the case that (i) all magnitudes of
are less than or equal to
N (i.e., no pruning is applied), or that (ii) this magnitude is
at one point, leading to an application of the pruning technique. In case (i), we can use Theorem 7.4 of [
54] on ArchiveUpdateTight2: there exists with probability of one a
such that the sets
form a
-tight
-approximate Pareto front for all
. Note that once
forms such an object, no more resets can occur: assume there exists a candidate solution
p that dominates an element
, and where
. The latter means that
which in turn means that
a does not
-approximate
p, which is a contradiction to the assumption on
. In case of (ii), the value of
is simply not large enough for the
N-element archive to form a
-tight
-approximate Pareto front. Again, by Theorem 7.4 of [
54], there exists in this case with a probability of one a finite iteration number where the magnitude will exceed
N. As discussed above, the pruning can only be applied finitely many times during the run of the algorithm. Hence, the value of
will, with a probability of one, stay fixed from one iteration onwards, which proves part (a).
Parts (b) and (c) follow from Theorem 7.4 of [
54] and part (a), and finally, part (d) follows from parts (b) and (c) and the definition of the Hausdorff distance. □
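For finite sets, the Hausdorff distance used in part (d) is the larger of the two one-sided distances between the sets. The following sketch computes it with NumPy. The discretization of the Pareto front by a finite point set is an assumption made for illustration, and the function names are hypothetical.

```python
import numpy as np

def one_sided(A, B):
    """dist(A, B): largest distance from a point of A to its nearest point in B."""
    A = np.atleast_2d(np.asarray(A, float))
    B = np.atleast_2d(np.asarray(B, float))
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    return float(D.min(axis=1).max())

def hausdorff(A, B):
    """Hausdorff distance d_H(A, B) = max(dist(A, B), dist(B, A))."""
    return max(one_sided(A, B), one_sided(B, A))

# Example: a linear front from (0, 1) to (1, 0), discretized by 101 points,
# approximated by its two end points and its midpoint.
t = np.linspace(0.0, 1.0, 101)
front = np.column_stack([t, 1.0 - t])
archive = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])
h = hausdorff(front, archive)
```

In this example, the distance is attained at the quarter points of the front, which lie midway between neighboring archive entries.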
Remark 1.
- (a) Equation (9) is an assumption that has to be made on the generation process. It means that every neighborhood of every feasible point will be “visited” with probability one by after finitely many steps. For MOEAs, this is ensured, e.g., if Polynomial Mutation [70,71] is used, or another mutation operator for which the support of the probability density function is equal to Q (at least for box-constrained problems). We hence think that this assumption is rather mild.
- (b) The complexity of the consideration of one candidate solution p is , which is determined by the sorting of the current archive A in line 20.
- (c) and are safety factors needed to guarantee the convergence properties. In our computations, however, we have not observed any impact of these values if both are chosen close to one. We hence suggest using (i.e., practically not using these safety factors).
- (d) The above consideration is done for , i.e., using the same value for all entries of ϵ. If the values for the objectives along the Pareto front differ significantly, one can of course instead use , using different values . In that case, the following modifications have to be made: (i) the last condition in line 5 has to be replaced by . Furthermore, (ii) the condition for the reset in line 11 has to be replaced by .
- (e) The value of Δ computed throughout the algorithm yields an estimate of the approximation quality of the archive in the Hausdorff sense. The theoretical upper bound of the final value is twice the value of the actual Hausdorff approximation, as the following discussion shows (refer to Figure 3): assume we are given a linear front with slope , and we are given a budget of elements (the discussion is analogous for general N). The ideal archive as computed by ArchiveUpdateHD is in this case , where the ’s are the end points of the Pareto set. Assume we have and ; then, the Hausdorff distance of the Pareto front and A is determined by the point . Given this archive, for any value and assuming that is large enough, there exists a candidate p such that p is not dominated by or and that , . Hence, p will be added to the archive and later on discarded (lines 23–26). The latter leads to an increase of Δ. On the one hand, one obvious strategy would be to take as a Hausdorff approximation of the Pareto front, in particular since most Pareto fronts have at least one element where the slope of the tangent space is . On the other hand, the use of ϵ-dominance prevents the images , , from being perfectly evenly distributed along the Pareto front, so that is not that accurate for some problems. In fact, this factor of two can only be observed for linear fronts, while already yields a good approximation in general (see, e.g., the subsequent results for MOPs with more than two objectives). However, we have observed that the following estimation gives even better approximations of the Hausdorff distances: given , which is sorted (e.g., according to objective ), the current Hausdorff approximation h is computed as follows:
Note that the distance is set to 0 if the distance between two neighboring candidate solutions is greater than or equal to , which has been done to take into account approximations of Pareto fronts that fall into several connected components.
- (f) Several norms are used within the algorithm. While one is in principle free in the choice of the norms (except in line 11, see the above proof), we suggest taking the infinity norm in line 5 in order to reduce the issue mentioned in the previous part, and the 2-norm in lines 28 and 29 in order to obtain a (slightly) better distribution of the entries along the Pareto front.
Algorithm 3 shows the modifications of ArchiveUpdateHD discussed above, which have been used for the calculations presented in this work. Hereby, denotes the vector of minimal elements for each entry .
Remark 2. For the performance assessment of MOEAs, it is typically advisable to use the averaged Hausdorff distance instead of the Hausdorff distance . The main reason for this is that MOEAs may compute a few outliers, in particular if the MOP contains weakly dominated solutions that are not optimal (also called dominance resistant solutions [72]). On the other hand, we stress that , in contrast to , is not a metric in the mathematical sense, since the triangle inequality does not hold. We refer, e.g., to [13,38,73,74] for more discussion on this matter. In the following, we discuss one possibility to obtain an approximation of the value of from a given archive A. To this end, we first investigate the value of if the elements of A are perfectly located around a linear connected Pareto front (if N is large enough, we can expect that this approximation works fine for any connected Pareto front). That is, all values are optimal. Furthermore, if A is sorted, and are the end points of the Pareto front, and the distance of two consecutive elements and is given by (leading to ). Since all the values are optimal, the value is hence given by the value of , which can be computed as follows:
Hereby, we have used the formulation of for continuous Pareto fronts as discussed in [73]. It remains to compute h. Since the assumption that all the images of the values are evenly spread is ideal, we cannot simply take for an arbitrary index . Instead, it makes sense to use the average of these distances:
where is as in (12) and m denotes the number of elements of that are not equal to zero. This leads to the approximation of the averaged Hausdorff distance of the Pareto front by a given archive A:
In order to obtain a first impression of the effect of the archiver, we apply it to several test problems. More precisely, we use ArchiveUpdateHD together with the generator, which simply chooses candidate solutions uniformly at random from the domain of the problem.
As test problems, we use CONV (convex front), DENT ([
75], convex-concave front), RUD1 and RUD2 (disconnected fronts), LINEAR (linear front) and RUD3 (convex front). The first five test problems are uni-modal, while RUD3 has eight local fronts in addition to the Pareto front. RUD3 is taken from [
76], and RUD1 and RUD2 are straightforward modifications of RUD3 to obtain the given Pareto front shapes.
Figure 4 shows the final approximations of the fronts using
for the archive size and initial values of
small enough so that this threshold is reached for all problems. As can be seen, evenly distributed solutions along the Pareto fronts have been obtained in all cases.
Figure 5 shows the actual Hausdorff and averaged Hausdorff values of the computed archives in each step for one run of the algorithm (
and
, i.e.,
has been used for the averaged Hausdorff distance), together with their approximations
h and
. For all problems, the archiver is capable of quickly determining a good approximation of both
and
during the run of the algorithm.
Table A1 and
Table A2 show the approximation qualities averaged over 30 independent runs, which support the observations from
Figure 5.
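For reference, the exact averaged Hausdorff distance between two finite sets can be computed as the maximum of the two averaged one-sided distances (the generational distance and the inverted generational distance, each taken as a power mean). The sketch below assumes that the Pareto front is represented by a finite discretization. The function names and the default value p = 2 are illustrative choices, not taken from the text.

```python
import numpy as np

def _nearest(A, B):
    """For each row of A, the Euclidean distance to its nearest row of B."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return D.min(axis=1)

def gd_p(A, B, p):
    """Averaged one-sided distance: power mean of the nearest-neighbor distances."""
    return float(np.mean(_nearest(A, B) ** p) ** (1.0 / p))

def averaged_hausdorff(A, B, p=2):
    """Delta_p(A, B) = max(GD_p(A, B), GD_p(B, A)) for finite sets A and B."""
    A = np.atleast_2d(np.asarray(A, float))
    B = np.atleast_2d(np.asarray(B, float))
    return max(gd_p(A, B, p), gd_p(B, A, p))
```

Because the one-sided distances are averaged rather than maximized, a single outlier in the archive influences this value far less than it influences the Hausdorff distance.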
Figure 6 shows the evolution of the value of
during one run of the algorithm for DENT and RUD3. For the uni-modal problem DENT, the value of
is essentially increasing monotonically (i.e., not counting the first few iteration steps), while for the multi-modal problem RUD3, more than 10 resets occur. Nevertheless, in both cases, a final value
is reached, which is in accordance with Theorem 1.
Figure A1 shows the box collections
of the final archives
and the final value
for the test problems, where
denotes the
-ball around
x using the maximum norm. The figure indicates that the Hausdorff distance of
and the respective Pareto fronts is indeed less than or equal to
for all problems.
Algorithm 3: |
Require: Problem (MOP), where , P: current population, : current archive, : current values of , N: upper bound for archive size |
Ensure: updated archive A, updated values for , Hausdorff approximation h |
1: |
2: |
3: |
4: for all do |
5: if then |
6: |
7: end if |
8: for all do |
9: if then |
10: |
11: if then ▹ reset and |
12: |
13: |
14: end if |
15: end if |
16: end for |
17: if then ▹ apply pruning |
18: |
19: |
20: sort A (e.g., according to ) |
21: compute as in (8) |
22: choose |
23: if then |
24: ▹ remove 2nd entry |
25: else if then |
26: ▹ remove 2nd but last entry |
27: else |
28: |
29: |
30: if then |
31: |
32: else |
33: |
34: end if |
35: end if |
36: end if |
37: end for |
38: sort A (e.g., according to ) ▹ compute Hausdorff approximation |
39: compute , as in (12) |
40: |
41: return |
3.2. The General Case
Next, we consider the archiver for MOPs with more than two objectives. Algorithm 4 shows the pseudocode of ArchiveUpdateHD for such problems. The archiver is essentially identical to the one for BOPs; however, it comes with two modifications that are needed since one cannot expect the Pareto front to form a one-dimensional object any more, and a further one that prevents too many unnecessary resets during the run of the algorithm.
The distances cannot be sorted any more as in (
8). Instead, one has to consider the distances
for a given archive
A. Furthermore, more sophisticated considerations of the distances as, e.g., in lines 27 and 28 of Algorithm 3 are no longer possible. Instead, we have chosen to first compute
and then to remove
from the archive, where
l is chosen randomly from
. As in the bi-objective case, an exception can of course be made for the best found solutions for each objective value.
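The pruning step for more than two objectives can be sketched as follows: find the closest pair of images in the archive and remove one of its two members at random. This is a minimal sketch under assumptions: Euclidean distances are used, and the optional exception for the best found solution per objective is implemented by excluding the per-objective minimizers from removal whenever the other member of the pair is available. The function name and the `rng` parameter are hypothetical.

```python
import numpy as np

def prune_one_general(F, rng=None):
    """Remove one row from F (n x k objective vectors, n >= 2).

    Returns (remaining rows, index of the removed row).
    """
    F = np.asarray(F, float)
    rng = np.random.default_rng() if rng is None else rng
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(D, np.inf)                # ignore zero self-distances
    i, j = np.unravel_index(np.argmin(D), D.shape)  # closest pair of entries
    # assumed exception: protect the best found entry for each objective
    protected = set(np.argmin(F, axis=0).tolist())
    candidates = [k for k in (int(i), int(j)) if k not in protected]
    if not candidates:                         # both protected: fall back to the pair
        candidates = [int(i), int(j)]
    drop = int(rng.choice(candidates))         # l chosen randomly from the pair
    return np.delete(F, drop, axis=0), drop
```

Computing all pairwise distances costs quadratic time in the archive size, which is the price for giving up the sorting that is available in the bi-objective case.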
The approximation of the Hausdorff distance cannot be done as in (
12) any more. Instead, we choose the value of
as an approximation for
, which is motivated by Theorem 1.
A reset is executed if there exists an entry
a of the current archive
A and a candidate solution
p that dominates
a and
That is, the improvement is larger than for all objectives. It has been observed that if one only asks for an improvement in one objective (as done for the bi-objective case), too many resets are performed in particular for MOPs that contain a “flat” region of the Pareto front.
Note that none of these changes affects the statements made in Theorem 1. Hence, the statements of Theorem 1 also hold if Algorithm 4 is used for MOPs with objectives. We stress that this algorithm can of course also be used for the treatment of BOPs; however, in that case, Algorithm 3 seems to be better suited, since both distance considerations and Hausdorff approximation are more sophisticated.
Figure 7 shows an application of Algorithm 4 on the test function DTLZ2 with three objectives (concave and connected Pareto front) for
and
. The evolution of the approximated value
of the Hausdorff distance
together with the real value can be found in
Figure 8. Hereby, we have used ArchiveUpdateHD as the external archiver of NSGA-II. The same result could have been obtained using randomly chosen test points within the domain
Q, albeit with a much larger number of test points.
Figure 9 and
Figure 10 show the respective results for DTLZ7, whose Pareto front is disconnected and convex-concave. In all cases, the archiver is capable of finding evenly spread solutions along the Pareto front, and the value of
is already after some iterations quite close to the actual Hausdorff distance. In order to suitably handle weakly optimal solutions, we have used the approach we describe in the following remark.
Remark 3. It is known that distance-based archiving/selection for MOPs that contain weakly optimal solutions that are not optimal (dominance-resistant solutions) may lead to unsatisfactory results, since candidates that are far away from the Pareto front may be included in the archive. In [77], it has been suggested to consider the modified objectives
where is “small”, instead of the original objectives , . We have adopted this approach for the treatment of the ZDT and DTLZ functions in this work, using .
Algorithm 4 |
Require: Problem (MOP), P: current population, : current archive, : current value of , N: upper bound for archive size |
Ensure: updated archive A, updated value of |
1: |
2: |
3: |
4: for all do |
5: if then |
6: |
7: end if |
8: for all do |
9: if then |
10: |
11: if , then ▹ reset and |
12: |
13: |
14: end if |
15: end if |
16: end for |
17: if then ▹ apply pruning |
18: |
19: |
20: compute as in (17) |
21: choose |
22: choose l randomly from |
23: |
24: end if |
25: end for |
26: return |
Remark 4. We finally stress that the archive A only reaches the magnitude N if Δ (and hence ϵ) is chosen “small enough”, which does not represent a drawback in our opinion. In real-world applications, the values of Δ have a physical meaning. As a hypothetical example, consider that one objective in the design of a car is its maximal speed (e.g., ), and the decision maker considers two cars to have different maximal speeds if differs by at least 10 km/h. In this case, is a suitable choice for ArchiveUpdateHD. Hence, depending on these values and the size of the Pareto front, it may happen that fewer than N elements are needed to suitably represent the solution set. In turn, if is (significantly) larger than the target values, this gives a hint to the decision maker that N has to be increased and that the computation has to be repeated in order to obtain a “complete” approximation. Figure 11 shows two results of ArchiveUpdateHD on CONV for two different starting values of Δ.