Open Access Review

The Averaged Hausdorff Distances in Multi-Objective Optimization: A Review

1
Departamento de Matemáticas, Pontificia Universidad Javeriana, Cra. 7 N. 40-62, Bogotá D.C. 111321, Colombia
2
Computer Science Department, CINVESTAV-IPN, Av. IPN 2508, Col. San Pedro Zacatenco, Mexico City 07360, Mexico
3
Dr. Rodolfo Quintero Ramirez Chair, UAM Cuajimalpa, Mexico City 05348, Mexico
*
Author to whom correspondence should be addressed.
Mathematics 2019, 7(10), 894; https://doi.org/10.3390/math7100894
Received: 27 August 2019 / Revised: 10 September 2019 / Accepted: 17 September 2019 / Published: 24 September 2019
(This article belongs to the Special Issue Recent Trends in Multiobjective Optimization and Optimal Control)

Abstract

A brief but comprehensive review of the averaged Hausdorff distances that have recently been introduced as quality indicators in multi-objective optimization problems (MOPs) is presented. First, we introduce all the necessary preliminaries, definitions, and known properties of these distances in order to provide a state-of-the-art overview of their behavior from a theoretical point of view. The presentation treats separately the definitions of the $(p,q)$-distances $\mathrm{GD}_{p,q}$, $\mathrm{IGD}_{p,q}$, and $\Delta_{p,q}$ for finite sets and their generalization to arbitrary measurable sets, which covers as an important example the case of continuous sets. Among the presented results, we highlight the rigorous consideration of the metric properties of these definitions, including a proof of the triangle inequality for distances between disjoint subsets when $p, q \geq 1$, and the study of the behavior of the associated indicators with respect to the notion of compliance to Pareto optimality. Illustrations of these results in particular situations are also provided. Finally, we discuss a collection of examples and numerical results obtained for the discrete and continuous incarnations of these distances that allow for an evaluation of their usefulness in concrete situations and for some interesting conclusions at the end, justifying their use and further study.
Keywords: averaged Hausdorff distance; evolutionary multi-objective optimization; Pareto compliance; performance indicator; power means

1. Introduction

In many real-world applications, the problem of concurrent or simultaneous optimization of several objectives is an essential task known as a multi-objective optimization problem (MOP). One important problem in multi-objective optimization is to compute a suitable finite size approximation of the solution set of a given MOP, the so-called Pareto set and its image, the Pareto front.
The Hausdorff distance $d_H$ (e.g., Reference [1]) measures how far two subsets of a metric space are from each other. Due to its properties, it is frequently used in many research areas such as computer vision [2,3,4], fractal geometry [5], the numerical computation of attractors in dynamical systems [6,7,8], or convergence of multi-objective algorithms to the Pareto set/front of a given multi-objective optimization problem [9,10,11,12,13,14,15]. One possible drawback of the classical Hausdorff distance, however, is that it punishes single outliers, which leads to inequitable performance evaluations in some cases. As one example, we mention here multi-objective evolutionary algorithms. On the one hand, such algorithms are known to be very effective in the (global) approximation of the Pareto set/front. On the other hand, it is also known that the final approximations (populations) may contain some outliers (e.g., Reference [16]). For such cases, the Hausdorff distance may indicate a “bad” match of population and Pareto set/front, while the approximation quality may be indeed “good”. To avoid exactly this problem, Schütze et al. introduced the averaged Hausdorff distance $\Delta_p$ in Reference [16], but the initial definition only works for finite approximations of the solution set and does not behave as a proper metric in the formal mathematical sense. In Reference [17], the indicator $\Delta_{p,q}$ has been proposed by the first two authors of this paper. $\Delta_{p,q}$ is an averaged Hausdorff distance that fixes the metric behavior of $\Delta_p$. Later, in Reference [18], a broader definition was given on metric measure spaces, suitable for the consideration of continuous approximations of the solution set. Moreover, this generalized indicator $\Delta_{p,q}$ preserves the nice metric properties of the initial finite case and reduces to it when using the standard discrete measure.
While the averaged Hausdorff distance has so far mostly been used for performance assessment of multi-objective evolutionary algorithms (using benchmark functions), it has also been used on MOPs coming from real-world problems including the multi-objective software next release problem [19], arc routing problems [20], power flow problems [21], engineering design problems [22], foreground detection [23], and contract design [24]. Several other indicators have also been proposed in the literature, like the hypervolume indicator or R indicators, each one with its own advantages and drawbacks, but their consideration is beyond the scope of this work. Information concerning other indicators can be found, e.g., in References [25,26,27].
The material reviewed in this work is based on recently published works [17,18,28]. The remainder of the document is organized as follows: in Section 2, we briefly state the required background for MOPs and power means. In Section 3, we review the $p$-averaged Hausdorff distance $\Delta_p$. In Section 4, we discuss its generalization, the $(p,q)$-averaged Hausdorff distance $\Delta_{p,q}$, explaining individually the finite and continuous cases. In Section 5, we consider some aspects of the metric properties of $\Delta_p$ and $\Delta_{p,q}$. In Section 6, we study the Pareto compliance of the performance indicators related to $\Delta_p$ and $\Delta_{p,q}$. In Section 7, we present some examples and numerical experiments. Finally, in Section 8, we draw our conclusions and discuss possible paths for future research in this direction.

2. Preliminaries

In this review, we introduce tools from a metric perspective that deal with two related contexts: distances between finite subsets of a metric space and distances between general measurable subsets of a metric measure space. The second context actually contains the first, but we deal separately with both of them, starting with the simpler setting of finite subsets before passing to the more general situation of arbitrary measurable sets that also contains the important special case of continuous sets. To emphasize each context, we use the convention that general sets will be denoted by X, Y, and Z, but when they are finite, the labels A, B, and C will be used.

2.1. Multi-Objective Optimization

First, we briefly present some basic aspects of multi-objective optimization problems (MOPs) required for the understanding of this paper. For a more thorough discussion, we refer the interested reader, e.g., to References [12,29,30].
A continuous MOP is rigorously formalized as the minimization of an appropriate function:
$$\min_{x \in Q} F(x),$$
where $F$ denotes a vector-valued function with components $f_i : Q \to \mathbb{R}$, for $i = 1, \dots, k$, called objective functions. Explicitly,
$$F : Q \to \mathbb{R}^k, \qquad x \mapsto F(x) := (f_1(x), \dots, f_k(x)).$$
The optimality of a candidate solution to a MOP depends on a dominance relation [31] given in terms of the partial order introduced below.
Definition 1.
For $x, y \in Q$, the partial Pareto ordering $\preceq$ associated with the MOP determined by $F$ is defined as
$$x \preceq y \quad \text{if and only if} \quad f_i(x) \leq f_i(y) \ \text{ for all } i = 1, \dots, k.$$
For $x, y, z \in Q$ and $X, Y \subset Q$, the following notions of dominance ($\prec$) and non-dominance ($\nprec$) are standard in this context:
  • $x$ is dominated by $y$, written $y \prec x$, if $y \preceq x$ and $F(x) \neq F(y)$.
  • $z$ is dominated by $X$, written $X \prec z$, if $x \prec z$ for some $x \in X$; otherwise $X \nprec z$.
  • $X$ is dominated by $Y$, written $Y \prec X$, if $\forall x \in X$ $\exists y \in Y$ such that $y \prec x$; otherwise $Y \nprec X$.
In addition, $x \in Q$ is called a Pareto-optimal point if it is nondominated, i.e., $\nexists\, y \in Q$ with $y \prec x$. Finally, the Pareto set $P \subset Q$ consists of all Pareto-optimal points and the Pareto front is defined as its image $F(P) \subset \mathbb{R}^k$.
MOPs commonly possess the important characteristic that, when mild smoothness conditions are fulfilled, the solution (or Pareto) set $P$ and its image, the Pareto front $F(P) \subset \mathbb{R}^k$, consist of $d$-dimensional subsets for $d = k - 1$ (or even less) when the problem involves $k$ objective functions ([32]).
As an example, let us describe a simple unconstrained MOP [33,34] given by
$$f_1, \dots, f_k : \mathbb{R}^n \to \mathbb{R}, \qquad f_i(x) = \sum_{j=1}^{n} (x_j - a_j^i)^2,$$
where $a^i = (a_1^i, \dots, a_n^i) \in \mathbb{R}^n$ and $i = 1, \dots, k$. The $a^i$'s correspond to the minimizers of each quadratic objective $f_i$, and the Pareto set of this problem consists of a $(k-1)$-simplex containing all the $a^i$'s as vertices, i.e.,
$$\mathrm{simp}_{k-1} := \mathrm{simp}(a^1, \dots, a^k) = \left\{ \sum_{i=1}^{k} \mu_i a^i \;:\; \mu_1, \dots, \mu_k \geq 0 \ \text{ and } \ \sum_{i=1}^{k} \mu_i = 1 \right\}.$$
In the particular case when $n = 1$, $k = 2$, $a^1 = 0$, and $a^2 = 2$, the problem becomes
$$F : \mathbb{R} \to \mathbb{R}^2, \qquad F(x) = (x^2, (x - 2)^2).$$
This is the so-called Schaffer's problem [35]. Figure 1 illustrates the objectives $f_1$ and $f_2$, and the Pareto front $F(P)$ for this MOP. In this case, the Pareto set corresponds to $P = [0, 2]$ and the Pareto front is a continuous convex curve in $\mathbb{R}^2$ joining $(0, 4)$ with $(4, 0)$.
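The dominance relation and the Pareto set of this example can be explored numerically. The following is a minimal sketch, not taken from the reviewed works: the helper names (`schaffer`, `dominates`, `nondominated`) and the sampling grid on $[-1, 3]$ are assumptions made for illustration only.

```python
def schaffer(x):
    """Objective vector F(x) = (x^2, (x - 2)^2) of the example problem."""
    return (x * x, (x - 2.0) ** 2)


def dominates(fy, fx):
    """True if fy dominates fx (minimization): fy <= fx componentwise and fy != fx."""
    return all(u <= v for u, v in zip(fy, fx)) and tuple(fy) != tuple(fx)


def nondominated(points, objective):
    """Keep the points whose images are not dominated by the image of any other point."""
    images = [objective(x) for x in points]
    return [x for x, fx in zip(points, images)
            if not any(dominates(fy, fx) for fy in images)]


# Hypothetical sampling grid: 41 equally spaced points on [-1, 3].
grid = [round(-1.0 + 0.1 * i, 10) for i in range(41)]
pareto_samples = nondominated(grid, schaffer)
```

On this grid, the surviving samples are exactly the grid points lying in $[0, 2]$, in agreement with the Pareto set stated above.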
In many real-world applications, MOPs arise naturally. As one example, in almost all scheduling problems (e.g., References [36,37,38,39,40,41]), the total execution time (make-span) is of primary interest. However, the consideration of this objective is in many cases not enough since other quantities such as the tardiness or the energy consumption also play an important role and can consequently, according to the given problem, also add objectives to the resulting multi-objective problem.
For the numerical treatment of MOPs, there exist already many established approaches. For instance, there are mathematical programming techniques [29,42], point-wise iterative methods that are capable of detecting single local solutions of a given MOP. Via use of a clever sequence of these resulting scalar objective optimization problems, a suitable finite size approximation of the entire Pareto front can be computed in certain cases [43,44,45,46]. Multi-objective continuation methods take advantage of the fact that the Pareto set at least locally forms a manifold [47,48,49,50,51,52]. Starting with an initial (local) solution, further candidates are computed along the Pareto set of the given MOP. All of these methods typically yield high convergence rates but are, in turn, of local nature. A possible alternative is given by set oriented methods such as subdivision and cell mapping techniques [53,54] and evolutionary algorithms [55,56,57,58,59] that are of global nature and are capable of computing a finite size approximation of the Pareto front in one single run.

2.2. Finite Power Means

A comprehensive reference on the theory and properties of means is given in Reference [60], where proofs of the statements presented here for finite power means and for integral power means in the following subsection can be found (see also Reference [18] for integral means).
For a finite set $A \subset [0, \infty)$ and a nonzero real $p$, the $p$-average or $p$-th power mean of $A$ is given by
$$\underset{a \in A}{M_p}(a) := \left( \frac{1}{|A|} \sum_{a \in A} a^p \right)^{1/p},$$
where $|A|$ denotes the cardinality of $A$. The simpler notation $M_p(A) := \underset{a \in A}{M_p}(a)$ will also be employed. Moreover, in order to simplify the forthcoming expressions, we introduce the abbreviation
$$\overline{\sum}_{a \in A}\, a := \frac{1}{|A|} \sum_{a \in A} a$$
to denote the arithmetic mean of the elements of a finite set $A \subset \mathbb{R}_{\geq 0}$.
It is well known that limit cases of power means recover familiar quantities, for example,
$$\lim_{p \to 0} M_p(A) = \left( \prod_{a \in A} a \right)^{1/|A|}$$
is the standard geometric mean of the elements of $A$. The special case $p = -1$ corresponds to the harmonic mean,
$$\mathrm{harm}(A) := M_{-1}(A).$$
Moreover, the $p$-average of any finite set can also be defined for any $p$ in the extended real line $\overline{\mathbb{R}} := [-\infty, \infty]$ by taking appropriate limits:
$$\lim_{p \to \infty} M_p(A) = \max(A), \quad \text{and} \quad \lim_{p \to -\infty} M_p(A) = \min(A).$$
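As an illustration of the finite power mean and its limit cases, the following sketch evaluates $M_p$ on a small set. The function name `power_mean` and the restriction to strictly positive values (so that $p \leq 0$ is well defined) are assumptions of this example.

```python
import math


def power_mean(values, p):
    """p-th power mean M_p of a finite collection of positive numbers.

    p = 0 is treated as the geometric mean, and p = +inf / -inf as max / min,
    following the limit cases described in the text.
    """
    values = list(values)
    if p == math.inf:
        return max(values)
    if p == -math.inf:
        return min(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


A = [1.0, 2.0, 4.0]
means = [power_mean(A, p) for p in (-math.inf, -1, 0, 1, 2, math.inf)]
```

For this set, the list `means` is nondecreasing in $p$: it runs from $\min(A) = 1$ through the harmonic, geometric, and arithmetic means up to $\max(A) = 4$.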
Proposition 1.
Let $A$ and $B$ be finite subsets of $[0, \infty)$ and $p, q \in \overline{\mathbb{R}}$ be arbitrary constants. Then, the following properties hold for finite power means:
1.
M p ( A ) M p ( B ) .
2.
For $p \leq q$: $M_p(A) \leq M_q(A)$.
3.
For a matrix of nonnegative elements $D = (d_{a,b})$ with $a \in A$ and $b \in B$:
$$M_p(D) := \underset{a \in A}{M_p}\, \underset{b \in B}{M_p}(d_{a,b}) = \underset{b \in B}{M_p}\, \underset{a \in A}{M_p}(d_{a,b}).$$
4.
For $p \geq 1$: $M_p(\{a + b \mid a \in A,\ b \in B\}) \leq M_p(A) + M_p(B)$.
5.
For the harmonic mean: $\mathrm{harm}(A) \leq |A| \min(A)$.

2.3. Integral Power Means in Measure Spaces

In order to present this part with sufficient generality, let us denote by $(S, \mu)$ a measure space. Let $\mathcal{M}(S)$ be the $\sigma$-algebra of measurable subsets of $S$ and $\mathcal{M}_{<\infty}(S)$ be the collection of those subsets with finite measure.
Now, we recall some fundamental properties of integral power means in this setting needed for the forthcoming sections. For $p \in \mathbb{R} \setminus \{0\}$ and a measurable function $f : X \subset S \to [0, \infty)$ defined on a subset $X \in \mathcal{M}_{<\infty}(S)$, the $p$-th power mean or $p$-average of $f$ over $X$ is given by
$$\underset{x \in X}{M_p}(f(x)) := \left( \frac{1}{\mu(X)} \int_X f(x)^p \, d\mu \right)^{1/p}. \tag{2}$$
For convenience, the normalized integral on the right-hand side of Equation (2) will be denoted simply as
$$\fint_X f^p \, d\mu := \frac{1}{|X|} \int_X f(x)^p \, d\mu,$$
where $|X| := \mu(X)$ refers in this context to the measure of $X$ and not to its cardinality as in the finite case. For brevity, when the measure $\mu$ employed is clear, $d\mu$ will be abbreviated by $dx$ to highlight the variable being integrated. The shorthand $M_p(f(X)) := \underset{x \in X}{M_p}(f(x))$ will also be employed.
For $p \geq 1$, the integral $p$-mean corresponds to $M_p(f(X)) = |X|^{-1/p} \|f\|_p$, where $\|\cdot\|_p$ is the standard $p$-norm of the Lebesgue space $L^p(X, \mu)$. The cases $p = \pm\infty$ can also be included by taking the limits $p \to \pm\infty$. In fact, since the essential supremum of the function $f$ on X is $\|f\|_\infty = \operatorname{ess\,sup}_{x \in X} f(x)$, and when $f$ is not identically zero its essential infimum is precisely $1 / \|f^{-1}\|_\infty = \operatorname{ess\,inf}_{x \in X} f(x)$; by calculating the limits, we obtain that
$$\underset{x \in X}{M_{\infty}}(f(x)) := \lim_{p \to \infty} \left( \fint_X f^p \, d\mu \right)^{1/p} = \|f\|_\infty,$$
and similarly,
$$\underset{x \in X}{M_{-\infty}}(f(x)) := \lim_{p \to -\infty} \left( \fint_X f^p \, d\mu \right)^{1/p} = \frac{1}{\|f^{-1}\|_\infty}.$$
Note that $\|\cdot\|_\infty$ corresponds to the norm of the space $L^\infty(X, \mu)$. For $p = 0$, it is possible to define $M_p$ as the integral generalization of the notion of geometric mean, and it is given explicitly by
$$\underset{x \in X}{M_0}(f(x)) := \exp\left( \fint_X \log f \, d\mu \right).$$
Proposition 2.
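For subsets $X, Y \in \mathcal{M}_{<\infty}(\mathbb{R}^k)$, nonnegative measurable functions $f, g : X \to [0, \infty)$, and any product-measurable function $d : X \times Y \to [0, \infty)$, the integral power mean $M_p$ satisfies the following properties:
1.
For $p \in \overline{\mathbb{R}}$, $k \in [0, \infty)$: $\underset{x \in X}{M_p}(k) = k$ and $\underset{x \in X}{M_p}(k f(x)) = k\, \underset{x \in X}{M_p}(f(x))$.
2.
For $p \in \overline{\mathbb{R}}$: $\underset{x \in X}{M_p}\, \underset{y \in Y}{M_p}(d(x, y)) = \underset{y \in Y}{M_p}\, \underset{x \in X}{M_p}(d(x, y))$.
3.
For $p \in [1, \infty]$: $\underset{x \in X}{M_p}(f(x) + g(x)) \leq \underset{x \in X}{M_p}(f(x)) + \underset{x \in X}{M_p}(g(x))$.
4.
For $p \in \overline{\mathbb{R}}$ and $f \leq g$: $\underset{x \in X}{M_p}(f(x)) \leq \underset{x \in X}{M_p}(g(x))$.
5.
For $p, q \in [0, \infty]$ with $p \leq q$: $\underset{x \in X}{M_p}(f(x)) \leq \underset{x \in X}{M_q}(f(x))$.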
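The integral power mean can be approximated numerically. The sketch below is an illustration only: it assumes $X = [a, b] \subset \mathbb{R}$ with the Lebesgue measure and uses a midpoint rule, and the name `integral_power_mean` is hypothetical. For $f(x) = x$ on $[0, 1]$ the exact value is $M_p = (p + 1)^{-1/p}$, which gives a simple sanity check.

```python
import math


def integral_power_mean(f, a, b, p, n=10_000):
    """Midpoint-rule approximation of M_p of f over [a, b] for a finite p != 0."""
    h = (b - a) / n
    total = sum(f(a + (i + 0.5) * h) ** p for i in range(n))
    # (1 / |X|) * integral of f^p is approximated by (h / (b - a)) * total = total / n.
    return (total / n) ** (1.0 / p)


m1 = integral_power_mean(lambda x: x, 0.0, 1.0, 1)   # exact value: 1/2
m2 = integral_power_mean(lambda x: x, 0.0, 1.0, 2)   # exact value: 1/sqrt(3)
m3 = integral_power_mean(lambda x: x, 0.0, 1.0, 3)   # exact value: (1/4)^(1/3)
```

The three values increase with $p$, as expected from property 5 of Proposition 2.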

3. The p-Averaged Hausdorff Distance

When trying to measure the distance between subsets of Euclidean space or even an arbitrary metric space, a natural choice is the well-known Hausdorff distance $d_H$ that is extensively employed in many different contexts. However, it is of limited practical value for measuring the distance to the Pareto set/front when the approximations are produced by stochastic search methods such as evolutionary algorithms. This is due to the fact that these algorithms may produce a set of outliers that can be heavily punished by $d_H$. As a partial remedy, the use of an averaged Hausdorff distance $\Delta_p$ was first proposed in Reference [16] to replace $d_H$.
Let $d : S \times S \to [0, \infty)$ denote a distance function on a metric space $S$ for which the standard properties of the identity of indiscernibles, nonnegativity, symmetry, and subadditivity (more commonly known as the triangle inequality) are satisfied.
Definition 2.
Given a point $x_0 \in S$ and subsets $X, Y \subset S$, we have
1.
A pointwise distance to sets: $d(x_0, X) := \inf \{ d(x_0, x) \mid x \in X \}$.
2.
A pre-distance between sets: $d(Y, X) := \sup \{ d(y, X) \mid y \in Y \}$.
3.
The Hausdorff distance between sets: $d_H(X, Y) := \max \{ d(X, Y), d(Y, X) \}$.
For simplicity, throughout the text, the metric $d$ can be assumed to be the standard Euclidean distance $d(x, y) := \|x - y\|$ induced on some $S \subset \mathbb{R}^k$ by the Euclidean 2-norm of $\mathbb{R}^k$, but the theory carries over to any general metric space $(S, d)$.
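For finite sets with the Euclidean metric, the three notions of Definition 2 translate directly into code, since the infimum and supremum become a minimum and a maximum. The following is a minimal sketch; the function names are assumptions of this example.

```python
import math


def point_to_set(x0, X):
    """d(x0, X): smallest distance from the point x0 to the finite set X."""
    return min(math.dist(x0, x) for x in X)


def pre_distance(Y, X):
    """d(Y, X): largest point-to-set distance d(y, X) over y in Y."""
    return max(point_to_set(y, X) for y in Y)


def hausdorff(X, Y):
    """d_H(X, Y) = max{ d(X, Y), d(Y, X) }."""
    return max(pre_distance(X, Y), pre_distance(Y, X))


X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(0.0, 1.0), (1.0, 1.0), (5.0, 1.0)]
d_xy = pre_distance(X, Y)   # every point of X has a point of Y at distance 1
d_yx = pre_distance(Y, X)   # dominated by the far point (5, 1)
```

Here the pre-distance is not symmetric: `d_xy` equals 1 while `d_yx` equals $\sqrt{17}$, so the single far point of $Y$ determines $d_H(X, Y)$.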
Definition 3.
Let $p \in \mathbb{N}$. For finite subsets $A, B \subset S$, their (modified) $p$-generational distance is
$$\mathrm{GD}_p(A, B) := \left( \frac{1}{|A|} \sum_{a \in A} d(a, B)^p \right)^{1/p},$$
and their (modified) $p$-inverted generational distance is
$$\mathrm{IGD}_p(A, B) := \left( \frac{1}{|B|} \sum_{b \in B} d(b, A)^p \right)^{1/p}.$$
From them, the $p$-averaged Hausdorff distance is obtained by taking the maximum
$$\Delta_p(A, B) := \max \{ \mathrm{GD}_p(A, B), \mathrm{IGD}_p(A, B) \}.$$
The indicators $\mathrm{GD}_p$ and $\mathrm{IGD}_p$ in Definition 3 correspond to simple adjustments to the definitions of the generational distance [61] and the inverted generational distance [62].
The standard Hausdorff distance is recoverable from $\Delta_p$ by taking the limit $\lim_{p \to \infty} \Delta_p = d_H$, but for any finite value of $p$, the distance $\Delta_p$ is obtained from standard $p$-th power means of all the distances employed to calculate the supremum in part 2 of Definition 2, which is needed to define $d_H$.
The advantage of using $\Delta_p$ as an indicator is that it does not immediately disqualify a few outliers in a candidate set, contrary to what $d_H$ does, and that, among the possible configurations of (finite) candidate solutions to a MOP, it assigns smaller distances to the Pareto front to those solutions that are evenly spread along its whole domain (see, e.g., Reference [63]). The behavior of $\Delta_p$ as a quality indicator is studied, e.g., in References [16,28], and it corresponds to the particular case $q \to -\infty$ of the results for general $(p, q)$-indicators presented in Section 6.
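The effect of a single outlier on $d_H$ and on $\Delta_p$ can be seen in a small experiment. The sketch below follows Definition 3 for finite sets with the Euclidean metric; the function names and the two point sets (a reference set `B` on a segment, and a candidate set `A` at height 0.1 plus one outlier) are hypothetical choices made for illustration.

```python
import math


def dist_to_set(x, S):
    """Distance from the point x to the finite set S."""
    return min(math.dist(x, s) for s in S)


def gd_p(A, B, p):
    """GD_p(A, B): p-th power mean of the distances d(a, B) over a in A."""
    return (sum(dist_to_set(a, B) ** p for a in A) / len(A)) ** (1.0 / p)


def igd_p(A, B, p):
    """IGD_p(A, B): p-th power mean of the distances d(b, A) over b in B."""
    return gd_p(B, A, p)


def delta_p(A, B, p):
    """Averaged Hausdorff distance: the maximum of GD_p and IGD_p."""
    return max(gd_p(A, B, p), igd_p(A, B, p))


def hausdorff(A, B):
    """Classical Hausdorff distance between finite sets."""
    return max(max(dist_to_set(a, B) for a in A),
               max(dist_to_set(b, A) for b in B))


B = [(i / 10, 0.0) for i in range(11)]                  # reference set
A = [(i / 10, 0.1) for i in range(11)] + [(0.5, 5.0)]   # close candidates plus one outlier
```

With these sets, $d_H(A, B) = 5$ is determined entirely by the outlier, whereas $\Delta_1(A, B) = 6.1/12 \approx 0.51$ still reflects that the other eleven candidates lie at distance 0.1 from the reference set. Increasing $p$ moves $\Delta_p$ toward $d_H$.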
Concerning its metric properties, $\Delta_p$ has the drawback of not being a proper metric in the usual sense because for any non-unit set $A \subset S$ the distance $\Delta_p(A, A) > 0$. This problem will be fixed in the following section with a simple modification. Nevertheless, independently from that, for a positive number $p$, the distance $\Delta_p$ does not satisfy the triangle inequality but only a weaker version of it. Indeed, as a consequence of Corollary 3, we have that
$$\Delta_p(A, C) \leq N^{\alpha} \left( \Delta_p(A, B) + \Delta_p(B, C) \right),$$
where $N = \max \{ |A|, |B|, |C| \} \geq 1$ and $\alpha = 1/p$.
For further details concerning $\Delta_p$, its properties, and its relation to other indicators, the reader can consult, e.g., References [16,63].

4. The (p,q)-Averaged Hausdorff Distance

To better evaluate the optimality of a certain candidate set to approximate the Pareto set/front of a MOP, several generalizations of the averaged Hausdorff distance $\Delta_p$ have been recently introduced.

4.1. (p,q)-Distances between Finite Sets

Definition 4.
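For $p, q \in \mathbb{R} \setminus \{0\}$, the generational $(p, q)$-distance $\mathrm{GD}_{p,q}(A, B)$ between two finite subsets $A, B \subset S$ is given by
$$\mathrm{GD}_{p,q}(A, B) := \left( \overline{\sum}_{a \in A} \left( \overline{\sum}_{b \in B} d(a, b)^q \right)^{p/q} \right)^{1/p} = \left( \frac{1}{|A|} \sum_{a \in A} \left( \frac{1}{|B|} \sum_{b \in B} d(a, b)^q \right)^{p/q} \right)^{1/p}.$$
The distance $\mathrm{GD}_{p,q}(A, B)$ can be extended to the values $p = 0$ or $q = 0$ by taking the limits $p \to 0$ or $q \to 0$, respectively. In such cases, properties of finite power means suggest the following definitions:
$$\mathrm{GD}_{p,0}(A, B) := \left( \overline{\sum}_{a \in A} \prod_{b \in B} d(a, b)^{p/|B|} \right)^{1/p}, \quad \text{when } p \neq 0,$$
$$\mathrm{GD}_{0,q}(A, B) := \prod_{a \in A} \left( \overline{\sum}_{b \in B} d(a, b)^q \right)^{\frac{1}{q} \frac{1}{|A|}}, \quad \text{when } q \neq 0, \text{ and}$$
$$\mathrm{GD}_{0,0}(A, B) := \prod_{a \in A} \prod_{b \in B} d(a, b)^{\frac{1}{|B|} \frac{1}{|A|}}, \quad \text{if } p = q = 0.$$
We can also calculate $\mathrm{GD}_{p,q}$ when $p \to \pm\infty$ or $q \to \pm\infty$ by replacing the corresponding sum with a minimum or a maximum according to the case. In particular, we have the nice relation
$$\lim_{q \to -\infty} \mathrm{GD}_{p,q}(A, B) = \mathrm{GD}_p(A, B). \tag{3}$$
Note that the definition of $\mathrm{GD}_{p,q}$ has two drawbacks, namely $\mathrm{GD}_{p,q}(A, B)$ does not necessarily vanish if $A = B$ and in general $\mathrm{GD}_{p,q}(A, B) \neq \mathrm{GD}_{p,q}(B, A)$; hence, it does not define a proper metric. In order to get one, a slight modification is needed.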
Definition 5.
Let $p, q \in \mathbb{R} \setminus \{0\}$. For finite subsets $A, B \subset S$, their $(p, q)$-averaged Hausdorff distance is
$$\Delta_{p,q}(A, B) := \max \{ \mathrm{GD}_{p,q}(A, B \setminus A), \mathrm{GD}_{p,q}(B, A \setminus B) \}.$$
Notice that $\mathrm{GD}_p(A, B) = \mathrm{GD}_p(A, B \setminus A)$ when $A \cap B = \emptyset$; thus, using Equation (3) and Definition 5, we easily obtain
$$\lim_{q \to -\infty} \Delta_{p,q}(A, B) = \Delta_p(A, B).$$
In this way, for finite and disjoint sets, the indicator $\Delta_{p,q}$ is a generalization of $\Delta_p$. Similarly to the relation
$$\mathrm{GD}_p(A, B) = |A|^{-1/p} \, \| D_{AB} \|_p,$$
between $\mathrm{GD}_p(A, B)$ and the matrix $p$-norm $\| D_{AB} \|_p$ of the distance matrix $D_{AB} := [d(a, b)]_{a,b}$ for $a \in A$ and $b \in B$, we also have the following relation between the $(p, q)$-generational distance $\mathrm{GD}_{p,q}(A, B)$ and the matrix $(p, q)$-norm $\| D_{AB} \|_{p,q}$, where the definition of the latter is precisely that of $\mathrm{GD}_{p,q}$ but replacing all the normalized sums by standard ones $\sum$ (see, e.g., Reference [64]):
$$\mathrm{GD}_{p,q}(A, B) = \underset{a \in A}{M_p}\, \underset{b \in B}{M_q}(d(a, b)) = |A|^{-1/p} \, |B|^{-1/q} \, \| D_{AB} \|_{p,q}.$$
A useful property of the distance $\Delta_{p,q}$ is that the parameters can be adjusted independently to achieve some desired spread of the archives by choosing an appropriate $q$ and that they can be located with custom closeness to the Pareto front of a MOP by an adequate choice of $p$.
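The finite $(p, q)$-distances are nested power means of the pairwise distances, which makes them short to implement. The following sketch assumes finite, nonzero $p$ and $q$ and the Euclidean metric; the function names, and the convention that a term with an empty set difference contributes 0, are assumptions of this example.

```python
import math


def power_mean(values, p):
    """p-th power mean of positive numbers for a finite, nonzero p."""
    values = list(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def gd_pq(A, B, p, q):
    """GD_{p,q}(A, B): p-mean over a in A of the q-mean over b in B of d(a, b)."""
    return power_mean([power_mean([math.dist(a, b) for b in B], q) for a in A], p)


def delta_pq(A, B, p, q):
    """Delta_{p,q}(A, B) as in Definition 5, built from the set differences."""
    b_minus_a = [b for b in B if b not in A]
    a_minus_b = [a for a in A if a not in B]
    # Assumed convention: a term whose second argument is empty contributes 0.
    left = gd_pq(A, b_minus_a, p, q) if b_minus_a else 0.0
    right = gd_pq(B, a_minus_b, p, q) if a_minus_b else 0.0
    return max(left, right)


def matrix_pq_norm(A, B, p, q):
    """Matrix (p, q)-norm of the distance matrix: plain sums instead of averages."""
    return sum(sum(math.dist(a, b) ** q for b in B) ** (p / q) for a in A) ** (1.0 / p)


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
```

Because the second argument of each term excludes the points shared with the first, all distances entering `delta_pq` are strictly positive, so negative values of $p$ or $q$ cause no division by zero. For the sets above, `gd_pq(A, B, 1, -200)` is already within 0.01 of $\mathrm{GD}_1(A, B) = 1$, illustrating the limit $q \to -\infty$.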

4.2. (p,q)-Distances between Measurable Sets

With the aid of Proposition 2, the results of the previous section can be generalized to subsets of a metric space $(S, d)$ endowed with an appropriate measure $\mu$. For concreteness, $S$ can be taken to be a subset of $\mathbb{R}^k$ carrying the metric induced from the Euclidean metric of $\mathbb{R}^k$ and endowed with an appropriate non-null measure $\mu$. Notice that, in our intended applications, $\mu$ will not be the restriction of the standard Lebesgue measure of $\mathbb{R}^k$ to $S$ for the simple reason that it can easily vanish, as happens on any hypersurface or lower dimensional subset of $\mathbb{R}^k$. In this case, a lower dimensional measure is needed and alternatives like the Hausdorff measure on $S$ can be used, since it gives rise to the standard notion of $d$-dimensional volume for $d$-submanifolds of $\mathbb{R}^k$. When these submanifolds are parametrized by functions from subsets of $\mathbb{R}^d$, the same volume will be obtained by a change of variable formula from the standard Lebesgue measure on those subsets of $\mathbb{R}^d$.
A very important observation in this context is that any set-theoretic relation obtained from measure-related calculations needs to be understood to hold almost everywhere (a.e.). Therefore, for $X, Y \in \mathcal{M}_{<\infty}(S)$, the statements $X = Y$ or $X \subset Y$ mean that the relations hold a.e., i.e., $\mu\{X \,\triangle\, Y\} = 0$ or $\mu\{X \setminus Y\} = 0$, respectively. In other words, in this setting, we will always identify $X \in \mathcal{M}_{<\infty}(S)$ with its equivalence class $[X] := \{ Y \mid X = Y, \text{ a.e.} \}$. This means that those classes will be regarded as the elements of $\mathcal{M}_{<\infty}(S)$, removing the need to carry the abbreviation a.e. all the time. Henceforth, to simplify complicated formulae, $d(x, y)$ will be shortened to $d_{x,y}$.
Definition 6.
Let $p, q \in \mathbb{R} \setminus \{0\}$. For finite-measure subsets $X, Y \in \mathcal{M}_{<\infty}(S)$, their generational $(p, q)$-distance is given by
$$\mathrm{GD}_{p,q}(X, Y) := \underset{x \in X}{M_p}\, \underset{y \in Y}{M_q}(d_{x,y}) = \left( \fint_X \left( \fint_Y d_{x,y}^{\,q} \, dy \right)^{p/q} dx \right)^{1/p}.$$
The cases $p < 0$ or $q < 0$ are well defined only if $X$ and $Y$ are disjoint subsets.
Similarly to the finite case, $\mathrm{GD}_{p,q}$ can be extended to values of $p, q \in \overline{\mathbb{R}}$, but there are two drawbacks: $\mathrm{GD}_{p,q}(X, X) = 0$ only if $X$ is a unit-set or singleton, and $\mathrm{GD}_{p,q}(X, Y)$ can differ from $\mathrm{GD}_{p,q}(Y, X)$. To fix this undesirable behavior, we repeat the strategy used in the finite case as follows.
Definition 7.
Let $p, q \in \mathbb{R} \setminus \{0\}$. For finite-measure subsets $X, Y \in \mathcal{M}_{<\infty}(S)$, their $(p, q)$-averaged Hausdorff distance is given by
$$\Delta_{p,q}(X, Y) := \max \{ \mathrm{GD}_{p,q}(X, Y \setminus X), \mathrm{GD}_{p,q}(Y, X \setminus Y) \}.$$
Remark 1.
In general, the $(p, q)$-distances are maps $\mathcal{M}_{<\infty}(S) \times \mathcal{M}_{<\infty}(S) \to [0, \infty)$. On the collection of finite subsets of $S$, the standard counting measure can be taken as the underlying one needed for these measure-theoretic notions of $\mathrm{GD}_{p,q}$ and $\Delta_{p,q}$, and in this case, these distances become precisely the finite-case distances given in Definitions 4 and 5.
Remark 2.
For disjoint subsets $X$ and $Y$, Definition 5 in the finite case and Definition 7 above in the measurable case reduce to the simpler form
$$\Delta_{p,q}(X, Y) := \max \{ \mathrm{GD}_{p,q}(X, Y), \mathrm{GD}_{p,q}(Y, X) \},$$
which is the one we will actually use in most situations. The more general definition for non-disjoint subsets is given with the purpose that the distance so-defined changes continuously as one set approaches the other until their distance vanishes. In other words, the general definition allows the distance to become a continuous function with respect to the metric topology that it determines. Nevertheless, for practical purposes dealing with applications and for most of the results presented below, the simpler definition between disjoint subsets suffices.
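For simple continuous sets, the integral definition can be approximated by quadrature. The sketch below is an illustration under stated assumptions: $X = [0,1] \times \{0\}$ and $Y = [0,1] \times \{h\}$ are two parallel unit segments in $\mathbb{R}^2$ measured by length, both normalized integrals are replaced by midpoint rules with equal weights, and the function names are hypothetical. For $p = q = 2$ and $h = 1$ the exact value is $\sqrt{1 + 1/6} = \sqrt{7/6}$, since the mean of $(x - y)^2$ over the unit square is $1/6$.

```python
import math


def power_mean(values, p):
    """p-th power mean of positive numbers for a finite, nonzero p."""
    values = list(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def gd_pq_segments(p, q, height=1.0, n=100):
    """Midpoint-rule approximation of GD_{p,q}(X, Y) for two parallel unit segments."""
    nodes = [(i + 0.5) / n for i in range(n)]  # midpoints of n equal cells of [0, 1]
    # Inner q-average over y in Y for every quadrature node x in X.
    inner = [power_mean([math.hypot(x - y, height) for y in nodes], q) for x in nodes]
    # Outer p-average over x in X.
    return power_mean(inner, p)


coarse = gd_pq_segments(2, 2, n=50)
fine = gd_pq_segments(2, 2, n=100)
```

By the symmetry of this configuration, $\mathrm{GD}_{p,q}(X, Y) = \mathrm{GD}_{p,q}(Y, X)$, so the same number also approximates $\Delta_{p,q}(X, Y)$ for these disjoint sets. With equal weights, the quadrature coincides with the finite-set formula applied to the sample points, in line with Remark 1.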

5. Metric Properties

To explain some of the terminology used in this section, we recall that the standard triangle inequality for a distance function $d : S \times S \to [0, \infty)$ is usually weakened in two different but related ways by postulating the existence of a constant $C > 0$ such that, for any points $x, y, z \in S$, one of the following conditions holds:
  • The $C$-relaxed triangle inequality: $d(x, z) \leq C \left( d(x, y) + d(y, z) \right)$.
  • The $C$-inframetric inequality: $d(x, z) \leq C \max \{ d(x, y), d(y, z) \}$.
Since the second condition implies the first one by using the very same constant $C > 0$ and, reciprocally, the $C$-relaxed triangle inequality implies the $2C$-inframetric one, both conditions are equivalent for an appropriate choice of constants. A semimetric satisfying any one of these conditions will be simply called an inframetric.
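The equivalence of the two relaxed conditions rests on the elementary bounds $\max\{s, t\} \leq s + t \leq 2 \max\{s, t\}$ for $s, t \geq 0$. Written out with $s = d(x, y)$ and $t = d(y, z)$, the two implications read as follows (a short worked step, added for convenience):

```latex
\begin{align*}
d(x,z) \le C \max\{d(x,y),\, d(y,z)\}
  &\;\Longrightarrow\; d(x,z) \le C \bigl( d(x,y) + d(y,z) \bigr), \\
d(x,z) \le C \bigl( d(x,y) + d(y,z) \bigr)
  &\;\Longrightarrow\; d(x,z) \le 2C \max\{d(x,y),\, d(y,z)\}.
\end{align*}
```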
For arbitrary measurable sets in $S$, the following results summarize the metric properties of $\mathrm{GD}_{p,q}$ and $\Delta_{p,q}$. Using the counting measure, these properties also apply to finite sets. For more details, see Reference [17] in the finite case and Reference [18] in the generalized measure-theoretic context.
Theorem 1.
For $p, q \in [1, \infty]$, the generational $(p, q)$-distance $\mathrm{GD}_{p,q}$ is subadditive in $\mathcal{M}_{<\infty}(S)$, i.e., for any $X, Y, Z \in \mathcal{M}_{<\infty}(S)$, the triangle inequality holds true:
$$\mathrm{GD}_{p,q}(X, Z) \leq \mathrm{GD}_{p,q}(X, Y) + \mathrm{GD}_{p,q}(Y, Z).$$
Proof. 
The proof follows by simple steps using the properties in Proposition 2. We start from the standard triangle inequality for $d(\cdot, \cdot)$:
$$d_{x,z} \leq d_{x,y} + d_{y,z} \qquad (\forall\, x \in X,\ y \in Y,\ z \in Z),$$
taking at both sides the $q$-average over $Z$ and using 1–3 of Proposition 2 to arrive at
$$\underset{z \in Z}{M_q}(d_{x,z}) \leq \underset{z \in Z}{M_q}(d_{x,y} + d_{y,z}) \leq \underset{z \in Z}{M_q}(d_{x,y}) + \underset{z \in Z}{M_q}(d_{y,z}) = d_{x,y} + \underset{z \in Z}{M_q}(d_{y,z}). \tag{4}$$
Now, there are two independent cases for the parameters $p, q \in [1, \infty)$. We explain here only the case $p \leq q$, but the case $q < p$ follows by similar arguments; see Thm. 2 in Reference [18]. Calculating the $p$-average over $X$ at both sides of Equation (4) and using 1, 3, and 5 of Proposition 2, we get
$$\underset{x \in X}{M_p}\, \underset{z \in Z}{M_q}(d_{x,z}) \leq \underset{x \in X}{M_p}\left( d_{x,y} + \underset{z \in Z}{M_q}(d_{y,z}) \right) \leq \underset{x \in X}{M_p}(d_{x,y}) + \underset{z \in Z}{M_q}(d_{y,z}). \tag{5}$$
Since the left-hand side of Equation (5) is $\mathrm{GD}_{p,q}(X, Z)$, after a further $p$-average over $Y$ at both sides of Equation (5) and parts 1, 3, and 5 of Proposition 2, we obtain
$$\mathrm{GD}_{p,q}(X, Z) \leq \underset{y \in Y}{M_p}\left( \underset{x \in X}{M_p}(d_{x,y}) + \underset{z \in Z}{M_q}(d_{y,z}) \right) \leq \underset{y \in Y}{M_p}\, \underset{x \in X}{M_p}(d_{x,y}) + \mathrm{GD}_{p,q}(Y, Z).$$
But from 2, 4, and 5 of Proposition 2, the first term at the right-hand side above satisfies
$$\underset{y \in Y}{M_p}\, \underset{x \in X}{M_p}(d_{x,y}) = \underset{x \in X}{M_p}\, \underset{y \in Y}{M_p}(d_{x,y}) \leq \underset{x \in X}{M_p}\, \underset{y \in Y}{M_q}(d_{x,y}) = \mathrm{GD}_{p,q}(X, Y).$$
Corollary 1.
If $p, q \in \mathbb{R} \setminus \{0\}$, the $(p, q)$-averaged Hausdorff distance $\Delta_{p,q}$ is a semimetric on the space $\mathcal{M}_{<\infty}(S)$ of finite-measure subsets of $S$. Furthermore, if $p, q \in [1, \infty)$, the distance $\Delta_{p,q}$ behaves as a proper metric when it is restricted to disjoint subsets of $\mathcal{M}_{<\infty}(S)$.
Proof. 
From Definition 7, we obtain the relations $\Delta_{p,q}(\cdot, \cdot) \geq 0$ and $\Delta_{p,q}(X, Y) = \Delta_{p,q}(Y, X)$ for any $X, Y \in \mathcal{M}_{<\infty}(S)$ and all $p, q \in \mathbb{R} \setminus \{0\}$. Moreover, from Definition 6, it follows that $\mathrm{GD}_{p,q}(X, Y \setminus X) = 0$ if and only if $X = \emptyset$ or $Y \subset X$ (hence, $Y \setminus X = \emptyset$). Therefore, for $X, Y \neq \emptyset$,
$$\Delta_{p,q}(X, Y) = 0 \iff X = Y,$$
i.e., $\Delta_{p,q}$ is a semimetric on the collection of finite-measure subsets $\mathcal{M}_{<\infty}(S)$. Finally, for disjoint $X$ and $Y$, it is clear that $\mathrm{GD}_{p,q}(X, Y \setminus X) = \mathrm{GD}_{p,q}(X, Y)$; thus, by Theorem 1, the triangle inequality holds for both arguments inside the maximum that defines $\Delta_{p,q}$ when $p, q \in [1, \infty)$. This implies that the triangle inequality is also valid for $\Delta_{p,q}$. □
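The subadditivity stated in Theorem 1 can be probed numerically on finite sets (counting measure). The sketch below is only a sanity check on random data, not a proof; the function names, the random point sets in the unit square, and the seed are assumptions of this example.

```python
import math
import random


def power_mean(values, p):
    """p-th power mean of nonnegative numbers for a finite p >= 1."""
    values = list(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def gd_pq(A, B, p, q):
    """GD_{p,q}(A, B) for finite sets with the Euclidean metric."""
    return power_mean([power_mean([math.dist(a, b) for b in B], q) for a in A], p)


def random_points(n, rng):
    """n random points in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def worst_triangle_gap(p, q, trials=200, seed=0):
    """Largest observed GD(X,Z) - GD(X,Y) - GD(Y,Z); nonpositive if the inequality holds."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        X, Y, Z = (random_points(rng.randint(1, 5), rng) for _ in range(3))
        gap = gd_pq(X, Z, p, q) - gd_pq(X, Y, p, q) - gd_pq(Y, Z, p, q)
        worst = max(worst, gap)
    return worst
```

Calling `worst_triangle_gap(p, q)` for several pairs with $p, q \geq 1$ should never return a value above floating-point noise.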
Theorem 2.
Let $X, Y, Z \in \mathcal{M}_{<\infty}(S)$ be subsets admitting positive constants $r < R$ such that $r \leq d_{u,v} \leq R$ for any $u \in X \cup Y$ and $v \in Y \cup Z$. Then, for all $p, q \in \mathbb{R} \setminus \{0\}$ with $|p|, |q| \geq 1$ and at least one of them negative, a relaxed triangle inequality holds for $\mathrm{GD}_{p,q}$, namely
$$\mathrm{GD}_{p,q}(X, Z) \leq \frac{R^2}{r^2} \left( \mathrm{GD}_{p,q}(X, Y) + \mathrm{GD}_{p,q}(Y, Z) \right).$$
Proof. 
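Step 1: Let $p \in \mathbb{R} \setminus \{0\}$, and suppose that $q < 0$. We will prove that
$$\mathrm{GD}_{p,|q|}(X, Y) \leq \frac{R}{r} \, \mathrm{GD}_{p,q}(X, Y). \tag{6}$$
For all $x \in X$ and $y, z \in Y$, we have $\frac{r}{R} \leq \frac{d_{x,y}}{d_{x,z}} \leq \frac{R}{r}$. Thus,
$$\frac{R}{r} \geq \left( \fint_Y \fint_Y \left( \frac{d_{x,y}}{d_{x,z}} \right)^{|q|} dy \, dz \right)^{1/|q|} = \left( \fint_Y d_{x,y}^{\,|q|} \, dy \right)^{1/|q|} \left( \fint_Y d_{x,z}^{\,-|q|} \, dz \right)^{1/|q|}.$$
Since $q = -|q|$, this means that $\underset{y \in Y}{M_{|q|}}(d_{x,y}) \leq \frac{R}{r} \underset{y \in Y}{M_q}(d_{x,y})$. Taking the $p$-average $\underset{x \in X}{M_p}$ at both sides and from 1 and 4 of Proposition 2, we find $\underset{x \in X}{M_p}\, \underset{y \in Y}{M_{|q|}}(d_{x,y}) \leq \frac{R}{r} \underset{x \in X}{M_p}\, \underset{y \in Y}{M_q}(d_{x,y})$, which is exactly Equation (6).
Step 2: Now, for $q \in \mathbb{R} \setminus \{0\}$ and $p < 0$, we will prove that
$$\mathrm{GD}_{|p|,q}(X, Y) \leq \frac{R}{r} \, \mathrm{GD}_{p,q}(X, Y). \tag{7}$$
By assumption, we have $\frac{r}{R} \leq \frac{d_{x,y}}{d_{u,y}} \leq \frac{R}{r}$ for any $y \in Y$ and all $x, u \in X$. Similarly as before and using 1 and 4 of Proposition 2, we conclude from the right-hand part that $\underset{y \in Y}{M_q}(d_{x,y}) \leq \frac{R}{r} \underset{y \in Y}{M_q}(d_{u,y})$. However, since $p = -|p|$, after taking a $p$-average of the quotient of means, it follows from
$$\left( \fint_X \underset{y \in Y}{M_q}(d_{x,y})^{|p|} \, dx \right)^{1/|p|} \left( \fint_X \underset{y \in Y}{M_q}(d_{u,y})^{p} \, du \right)^{1/|p|} = \left( \fint_X \fint_X \left( \frac{\underset{y \in Y}{M_q}(d_{x,y})}{\underset{y \in Y}{M_q}(d_{u,y})} \right)^{|p|} dx \, du \right)^{1/|p|} \leq \frac{R}{r},$$
that $\underset{x \in X}{M_{|p|}}\, \underset{y \in Y}{M_q}(d_{x,y}) \leq \frac{R}{r} \underset{x \in X}{M_p}\, \underset{y \in Y}{M_q}(d_{x,y})$, which is now Equation (7).
Step 3: The previous steps can be summarized in the expression
$$\mathrm{GD}_{|p|,|q|}(X, Y) \leq \frac{R}{r} \, \mathrm{GD}_{|p|,q}(X, Y) \leq \frac{R^2}{r^2} \, \mathrm{GD}_{p,q}(X, Y). \tag{8}$$
Using again 4 of Proposition 2 and Definition 6, we get $\mathrm{GD}_{p,q}(X, Z) \leq \mathrm{GD}_{|p|,|q|}(X, Z)$. From this, the subadditivity for $\mathrm{GD}_{|p|,|q|}$ (Theorem 1), and Equation (8), we conclude
$$\mathrm{GD}_{p,q}(X, Z) \leq \mathrm{GD}_{|p|,|q|}(X, Y) + \mathrm{GD}_{|p|,|q|}(Y, Z) \leq \frac{R^2}{r^2} \left( \mathrm{GD}_{p,q}(X, Y) + \mathrm{GD}_{p,q}(Y, Z) \right).$$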
 □
Remark 3.
For parameters $(p, q) \in \mathbb{R}^2$ that lie in the orange or blue sectors in Figure 2, the distance $\mathrm{GD}_{p,q}$ fulfills a $C$-relaxed triangle inequality for a constant $C = R^2 / r^2$ only if the condition $r \leq d_{u,v} \leq R$ holds for all $u \in X \cup Y$ and $v \in Y \cup Z$. On bounded and topologically separated sets (i.e., not having common limit points), this condition always holds, and on them, $\Delta_{p,q}$ becomes an inframetric as explained below.
Corollary 2.
Under the same hypotheses of Theorem 2, the $(p, q)$-averaged Hausdorff distance $\Delta_{p,q}$ satisfies
$$\Delta_{p,q}(X, Z) \leq \frac{R^2}{r^2} \left( \Delta_{p,q}(X, Y) + \Delta_{p,q}(Y, Z) \right).$$
Proof. 
It is immediate using Theorem 2 and Definition 7. □
When the involved sets are finite, a generally sharper inframetric relation holds. For emphasis, we employ in this context the notation $A, B, C$ for those subsets of $S$.
Theorem 3.
If $p, q \in \mathbb{R}$ and $|p|, |q| > 1$, the $(p,q)$-distance $\mathrm{GD}_{p,q}$ satisfies the relaxed triangle inequality
$$\mathrm{GD}_{p,q}(A,C) \le N^{\alpha} \left( \mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C) \right),$$
for all finite subsets $A, B, C \subset S$, where $N := \max\{|A|, |B|, |C|\} \ge 1$ and $\alpha := |p|^{-1} + |q|^{-1}$.
Proof. 
For arbitrary $p \neq 0$, let us assume that $q < 0$, so that $|q| = -q$. We can write
GD p , | q | ( A , B ) = a A b B d ( a , b ) q 1 p q 1 p = a A | B | 2 harm b B d ( a , b ) q p q 1 p ,
which, when combined with property 5 of Proposition 1, yields
GD p , | q | ( A , B ) a A | B | 1 min b B d ( a , b ) q p q 1 p | B | 1 | q | a A b B d ( a , b ) q p q 1 p = | B | 1 | q | GD p , q ( A , B ) .
A similar relation is true for any $q \neq 0$ if $p < 0$. In conclusion, $\mathrm{GD}_{|p|,|q|}(A,B) \le N^{\alpha}\, \mathrm{GD}_{p,q}(A,B)$, where
$$\alpha := \begin{cases} |\min\{p,q\}|^{-1} & \text{if } pq < 0, \\ |p|^{-1} + |q|^{-1} & \text{if } p < 0,\ q < 0. \end{cases}$$
If $N^{\alpha}$ does not need to be sharp, $\alpha$ can always be chosen to take the larger value $|p|^{-1} + |q|^{-1}$.
Now, for $|p|, |q| \ge 1$, the final result follows from the triangle inequality for $\mathrm{GD}_{|p|,|q|}$:
$$\mathrm{GD}_{p,q}(A,C) \le \mathrm{GD}_{|p|,|q|}(A,C) \le N^{\alpha} \left( \mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C) \right).$$
Corollary 3.
If $p, q \in \mathbb{R}$ and $|p|, |q| \ge 1$, the $(p,q)$-distance $\Delta_{p,q}$ satisfies the relaxed triangle inequality:
$$\Delta_{p,q}(A,C) \le N^{\alpha} \left( \Delta_{p,q}(A,B) + \Delta_{p,q}(B,C) \right),$$
for all finite subsets $A, B, C \subset S$, with $N := \max\{|A|, |B|, |C|\} \ge 1$, and $\alpha := |p|^{-1} + |q|^{-1}$.
Proof. 
The corollary follows immediately from Theorem 3 and Definition 5. □
To conclude this section, we return to the general setting of arbitrary measurable sets to explain the behavior of Δ p , q when changing the value of its parameters p and q.
Theorem 4.
Let $X, Y \in \mathcal{M}_{<\infty}(S)$ and suppose that $p, p', q, q' \in \overline{\mathbb{R}}$ satisfy $p \le p'$ and $q \le q'$. Then,
$$\Delta_{p,q}(X,Y) \le \Delta_{p',q}(X,Y) \quad \text{and} \quad \Delta_{p,q}(X,Y) \le \Delta_{p,q'}(X,Y).$$
Proof. 
It follows easily from part 5 of Proposition 2 and Definition 7. □
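The monotonicity stated in Theorem 4 can be checked numerically on finite sets, which are a special case of measurable sets when the counting measure is used. The following is a minimal sketch, assuming that $\mathrm{GD}_{p,q}$ is the $p$-power mean over $x \in X$ of the $q$-power means of $d(x,y)$ over $y \in Y$, and that $\Delta_{p,q}(X,Y) = \max\{\mathrm{GD}_{p,q}(X,Y), \mathrm{GD}_{p,q}(Y,X)\}$; the helper names `power_mean`, `gd`, and `delta` and the sample points are illustrative choices, not taken from the paper.

```python
import math

def power_mean(values, s):
    """Power mean M_s of positive numbers; s = -inf / +inf give min / max."""
    values = list(values)
    if s == -math.inf:
        return min(values)
    if s == math.inf:
        return max(values)
    return (sum(v ** s for v in values) / len(values)) ** (1.0 / s)

def gd(X, Y, p, q):
    """GD_{p,q}(X, Y): p-mean over x in X of the q-mean over y in Y of d(x, y)."""
    return power_mean((power_mean((math.dist(x, y) for y in Y), q) for x in X), p)

def delta(X, Y, p, q):
    """Delta_{p,q}(X, Y), taken here as max(GD_{p,q}(X, Y), GD_{p,q}(Y, X))."""
    return max(gd(X, Y, p, q), gd(Y, X, p, q))

# Two small disjoint point sets (illustrative data, not from the paper).
X = [(0.0, 1.0), (0.5, 0.6), (1.0, 0.1)]
Y = [(0.2, 2.0), (0.7, 1.5), (1.4, 1.1), (2.0, 0.9)]

grid = [-math.inf, -4.0, -1.0, 1.0, 4.0, math.inf]  # s = 0 is left out
table = {(p, q): delta(X, Y, p, q) for p in grid for q in grid}

# Check that Delta_{p,q} never decreases when p or q increases.
monotone = all(
    table[(p, q)] <= table[(p2, q2)] + 1e-12
    for p in grid for q in grid
    for p2 in grid for q2 in grid
    if p <= p2 and q <= q2
)
print(monotone)
```

The sets are kept disjoint so that all distances are positive and negative exponents are well defined; the exponent $0$ is omitted because its treatment depends on a convention not restated here.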

6. The (p,q)-Distances as Quality Indicators

Let $Q$ be a decision space and $F: Q \to \mathbb{R}^k$ be a multi-objective function on it, whose associated MOP consists of the simultaneous minimization of its $k$ component functions $f_1, \dots, f_k$. A candidate solution to this problem is Pareto-optimal if all elements of its image in $F(Q) \subset \mathbb{R}^k$ are nondominated in the sense of Pareto [31]; see Definition 1. For the forthcoming discussion, let us introduce the following abbreviated notation. For $X, Y \subset Q$ and any $z \in Q$, we define
$$X_{\preceq z} := \{x \in X \mid x \preceq z\}, \quad X_{\preceq Y} := \{x \in X \mid \exists y \in Y : x \preceq y\}, \quad X_{\npreceq z} := \{x \in X \mid x \npreceq z\}, \quad X_{\npreceq Y} := \{x \in X \mid \forall y \in Y : x \npreceq y\}.$$
From these definitions, it follows that, for arbitrary $z \in Q$ and $X, Y \subset Q$, there are partitions:
$$X = X_{\preceq z} \sqcup X_{\npreceq z}, \quad \text{and} \quad X = X_{\preceq Y} \sqcup X_{\npreceq Y},$$
where $\sqcup$ stands for the disjoint union of subsets. The subindices $\prec$, $\succ$, and $\succeq$ can be used in an analogous way. Let us recall that an archive $X \subset Q$ is, by definition, a subset of mutually non-dominated points; therefore, for any $x, x' \in X$, the condition $x \preceq x'$ implies $x = x'$. This basic property implies that $F: Q \to \mathbb{R}^k$ is a bijection onto its image when restricted to any archive $X \subset Q$ and, therefore, the points in $F(X) \subset F(Q)$ can be uniquely labeled by the elements of $X$. Moreover, for a finite archive $A \subset Q$, both sets have the same number of elements, $|A| = |F(A)|$.
Now, we introduce a couple of strengthened notions of dominance between sets (archives) that are required for the validity of most of the results in this section.
Definition 8.
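An archive $X$ is well-dominated by an archive $Y$ if
1.
$X$ is dominated by $Y$, written $Y \preceq X$, i.e., $\forall x \in X$, $\exists y \in Y$ s.t. $y \preceq x$, and
2.
$Y$ consists only of dominating points of $X$, i.e., $\forall y \in Y$, $\exists x \in X$ s.t. $y \preceq x$.
Moreover, $X$ is said to be strictly well dominated by $Y$ if
3.
$\forall y \in Y \setminus X$, $\exists x \in X \setminus Y$ such that $y \prec x$.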
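The three conditions of Definition 8 translate directly into set-level checks. The sketch below works on objective vectors (the images under $F$) and assumes minimization, weak dominance in conditions 1 and 2, and strict dominance in condition 3; the function names and the sample archives are hypothetical and only serve as an illustration.

```python
def weakly_dominates(u, v):
    """u is no worse than v in every objective (minimization)."""
    return all(ui <= vi for ui, vi in zip(u, v))

def dominates(u, v):
    """u weakly dominates v and differs from it."""
    return weakly_dominates(u, v) and tuple(u) != tuple(v)

def is_archive(points):
    """An archive consists of mutually non-dominated points."""
    return all(not dominates(u, v) for u in points for v in points)

def well_dominated(X, Y):
    """Conditions 1 and 2: X is well-dominated by Y."""
    cond1 = all(any(weakly_dominates(y, x) for y in Y) for x in X)
    cond2 = all(any(weakly_dominates(y, x) for x in X) for y in Y)
    return cond1 and cond2

def strictly_well_dominated(X, Y):
    """Conditions 1 to 3: X is strictly well dominated by Y."""
    only_x, only_y = set(X) - set(Y), set(Y) - set(X)
    cond3 = all(any(dominates(y, x) for x in only_x) for y in only_y)
    return well_dominated(X, Y) and cond3

# Illustrative images of two archives in a bi-objective space.
FX = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
FY = [(0.5, 2.5), (1.5, 1.5), (2.5, 0.5)]
FY_extra = FY + [(0.0, 4.0)]  # extra point that dominates no element of FX

print(strictly_well_dominated(FX, FY))
print(well_dominated(FX, FY_extra))
```

In the second call, condition 1 still holds, but the added point violates condition 2, so `FX` is dominated by `FY_extra` without being well-dominated by it.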
For an archive $X \subset Q$, the $\mathrm{GD}_{p,q}$, $\mathrm{IGD}_{p,q}$, and $\Delta_{p,q}$ quality (or performance) indicators assigned to it are defined as the distance of its image $F(X)$ to the Pareto front $F(P)$, i.e.,
$$I_{p,q}^{\mathrm{GD}}(X) := \mathrm{GD}_{p,q}(F(X), F(P)), \quad I_{p,q}^{\mathrm{IGD}}(X) := \mathrm{IGD}_{p,q}(F(X), F(P)), \quad \text{and} \quad I_{p,q}^{\Delta}(X) := \Delta_{p,q}(F(X), F(P)).$$
In this section, we study the behavior of $I_{p,q}^{\mathrm{GD}}$, $I_{p,q}^{\mathrm{IGD}}$, and $I_{p,q}^{\Delta}$ as performance indicators. An example of a weakly Pareto-compliant performance indicator is the Degree of Approximation (DOA; see Reference [10]).

6.1. Pareto Compliance of ( p , q ) -Indicators in the Finite Case

In order to obtain general conclusions on the features of the averaged Hausdorff distance $\Delta_{p,q}$ as a quality indicator, we consider first the behavior of $\mathrm{GD}_{p,q}$. For additional details on the material presented in this section and other related results in the context of the $p$-averaged Hausdorff distance $\Delta_p$, the reader is referred to Reference [28].
For the following statements, we will abbreviate $\delta_q(a,B) := \left( \frac{1}{|B|} \sum_{b \in B} d(a,b)^q \right)^{\frac{1}{q}}$. Clearly, with this notation, $I_{p,q}^{\mathrm{GD}}(A) = \left( \frac{1}{|A|} \sum_{a \in A} \delta_q(F(a), F(P))^p \right)^{\frac{1}{p}}$, where in the averaged sum we label the points in $F(A)$ by the elements of the archive $A$, taking advantage of the fact that $|A| = |F(A)|$, as will also be done with all the averages in this section.
Theorem 5.
Let $A, B \subset Q$ be finite archives with $A$ strictly well dominated by $B$. If, for all $a \in A$ and $b \in B$,
1.
$b \prec a$ implies that $\delta_q(F(b), F(P)) < \delta_q(F(a), F(P))$;
2.
$b \preceq a$ implies that $\frac{|B_{\preceq a}|}{|B|} \le \frac{|A_{\succeq b}|}{|A|}$ (or, equivalently, a strict equality);
then, $I_{p,q}^{\mathrm{GD}}(B) < I_{p,q}^{\mathrm{GD}}(A)$.
Proof. 
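By condition 1, for all $a \in A$ and $b \in B_{\preceq a}$, the inequality $\delta_q(F(b), F(P))^p \le \delta_q(F(a), F(P))^p$ holds true. After averaging over all $b \in B_{\preceq a}$ on both sides, we have
$$\frac{1}{|B_{\preceq a}|} \sum_{b \in B_{\preceq a}} \delta_q(F(b), F(P))^p \le \delta_q(F(a), F(P))^p,$$
and averaging once again over all $a \in A$ produces
$$\frac{1}{|A|} \sum_{a \in A} \frac{1}{|B_{\preceq a}|} \sum_{b \in B_{\preceq a}} \delta_q(F(b), F(P))^p \le \frac{1}{|A|} \sum_{a \in A} \delta_q(F(a), F(P))^p. \tag{9}$$
From property 2 and noticing that each $b \in B$ appears $|A_{\succeq b}|$ times in the initial sum, the left-hand side satisfies
$$\frac{1}{|A|} \sum_{a \in A} \sum_{b \in B_{\preceq a}} \frac{1}{|B_{\preceq a}|}\, \delta_q(F(b), F(P))^p \ge \frac{1}{|A|} \sum_{a \in A} \sum_{b \in B_{\preceq a}} \frac{|A|}{|B|} \frac{1}{|A_{\succeq b}|}\, \delta_q(F(b), F(P))^p = \frac{1}{|B|} \sum_{b \in B} \delta_q(F(b), F(P))^p.$$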
Returning to Equation (9), we conclude that $I_{p,q}^{\mathrm{GD}}(B) \le I_{p,q}^{\mathrm{GD}}(A)$. Lastly, part 3 of Definition 8 for strictly well-dominated sets guarantees that this is a strict inequality, proving the assertion.  □
Remark 4.
Since we are dealing with finite archives, condition 2 of Theorem 5 regarding the relative size of some of their parts is equivalent to the condition $|F(B_{\preceq a})| / |F(B)| \le |F(A_{\succeq b})| / |F(A)|$ regarding the relative size of their images. This is not necessarily the case in the context of measurable subsets, but see Remark 5.
Figure 3 shows examples where Theorem 5 holds with $q \to -\infty$. In this case, the $q$-averaged distance $\delta_q(a,B)$ becomes the standard distance $d(a,B)$ between a point and a set.
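The convergence of the $q$-averaged distance to the standard point-to-set distance for very negative $q$ can be observed with a few lines of code. This is a small sketch, assuming $\delta_q(a,B)$ is the $q$-power mean of the Euclidean distances from $a$ to the points of a finite set $B$; the point `a` and the set `B` are made-up data.

```python
import math

def delta_q(a, B, q):
    """q-averaged distance from the point a to the finite set B."""
    return (sum(math.dist(a, b) ** q for b in B) / len(B)) ** (1.0 / q)

a = (0.0, 0.0)
B = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]  # distances to a: 1, 2, 5

d_min = min(math.dist(a, b) for b in B)   # standard distance d(a, B)
values = {q: delta_q(a, B, q) for q in (-1, -10, -100)}

for q, value in values.items():
    print(q, value, value - d_min)
```

The gap to $d(a,B)$ shrinks as $q$ decreases but does not vanish for finite $q$, since the averaging factor $|B|^{-1/q}$ only tends to $1$ in the limit.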
For the inverted generational distance $\mathrm{IGD}_{p,q}$ in the finite case, we provide here two useful results without explicit proofs. The necessary steps are similar to the arguments used to prove the analogous statements for $\mathrm{IGD}_p$ in Prop. 3.8 and Thm. 3.9 of Reference [28]. Those statements correspond here to the limit $q \to -\infty$, and the main difference in the proofs is that the Euclidean distance $d(a,B)$ needs to be replaced everywhere by the $q$-average $\delta_q(a,B)$, as was done above for the proof of Theorem 5, which generalizes the proof of Thm. 3.4 in Reference [28]. The reader can also find there additional remarks on hypotheses similar to the ones needed for Theorem 6 below.
Proposition 3.
Let A , B Q be finite and strictly well-dominated archives with B A such that for all a A , b B , and x P :
b a implies δ q ( F ( b ) , F ( x ) ) < δ q ( F ( a ) , F ( x ) ) ;
then, I p , q IGD ( B ) < I p , q IGD ( A ) .
Figure 4 illustrates two situations where the hypotheses of Proposition 3 are satisfied. Now, to state a more general result concerning the Pareto compliance of the IGD p , q indicator, we will further abbreviate the minimal p-average of distances δ q ( F ( a ) , F ( P B ) ) by
δ A : = min a A x P B δ q ( F ( x ) , F ( a ) ) p 1 p = min a A IGD p , q ( F ( a ) , F ( P B ) ) .
Theorem 6.
Let A , B Q be finite and strictly well-dominated archives such that B A . If at least one of the following conditions is satisfied,
1.
a A , b B : b a implies IGD p , q ( F ( b ) , F ( P B ) ) < IGD p , q ( F ( a ) , F ( P B ) ) ;
2.
a 0 A such that x P B : a 0 arg min a A δ q ( F ( x ) , F ( a ) ) ;
3.
x P B : δ q ( F ( A ) , F ( x ) ) = δ A ;
then I p , q IGD ( B ) < I p , q IGD ( A ) .
Finally, a general statement on the Pareto compliance of the finite case of the ( p , q ) -averaged Hausdorff distance Δ p , q follows as a consequence of Theorems 5 and 6.
Theorem 7.
Let A , B Q be finite and well-dominated archives such that B A . If for all a A , b B :
b a implies | B a | | B | = | A b | | A | ,
and at least one of the following conditions is satisfied:
1.
a A , b B , x P : b a implies δ q ( F ( b ) , F ( x ) ) < δ q ( F ( a ) , F ( x ) ) ;
2.
a 0 A such that x P B : a 0 arg min a A δ q ( F ( x ) , F ( a ) ) ;
3.
x P B : δ q ( F ( A ) , F ( x ) ) = δ A ;
then, I p , q Δ ( B ) < I p , q Δ ( A ) .
Figure 5 illustrates four situations where Theorems 6 and 7 apply with very large q. In the first row, the left diagram is a modification of the second case in Figure 4 where condition 1 holds. In the right diagram, the diamond lying at the lower left corner of F ( A ) represents the image F ( a 0 ) of a point a 0 satisfying condition 2. Finally, both diagrams in the second row exhibit cases where the points of F ( P ) are equidistant to corresponding points in F ( A ) , making condition 3 valid, with δ A being this distance.

6.2. Pareto Compliance of ( p , q ) -Indicators in the General Case

We consider now the behavior of the generalized $\mathrm{GD}_{p,q}$ distance with respect to Pareto compliance, concentrating on the most important aspects that describe its characteristics and using hypotheses similar to the ones needed in the previous section for $\Delta_{p,q}$ in the finite case.
Here, we will continue to assume that the decision space $Q$ with objective function $F: Q \to \mathbb{R}^k$ defining the MOP under consideration has a Pareto set $P \subset Q$ with corresponding Pareto front $F(P) \subset F(Q)$. Also, we assume that the objective space $F(Q) \subset \mathbb{R}^k$ carries a metric $d$ that, for simplicity, can be taken to be the one inherited from the Euclidean distance $d(\cdot,\cdot)$ in $\mathbb{R}^k$. In addition, to define the $(p,q)$-indicators on MOPs that require general non-finite sets, we need a measure space $(S, \mu)$, which here will be taken to be $S = F(Q)$ endowed with a non-null measure $\mu$ according to the comments at the beginning of Section 4.2. In this context, $X, Y \subset Q$ will denote arbitrary subsets such that $F(X), F(Y) \subset F(Q)$ are measurable with non-null and finite measures.
Remark 5.
Recall that, here, $|F(X)| = \mu(F(X))$ denotes the measure of $F(X) \subset S$. In this context, $Q$ will not be asked to carry a measure and the notation $|X|$ will have no a priori meaning for $X \subset Q$. Nevertheless, it is possible to induce a measure on those subsets of $Q$ where $F$ is bijective by taking the pullback $\mu^*$ of $\mu$ to them, making the identity $|X| = \mu^*(X) := \mu(F(X)) = |F(X)|$ trivially true. This can be done for all archives but not for subsets where $F$ is not bijective. When it is $Q$ that carries a measure, a push-forward measure can always be defined on its image $F(Q)$, making this identity true for all sets. This was implicitly assumed in the presentation provided in Reference [18] (Section 3.4). For clarity, we avoid this identification here and state everything from the assumption that the measure $\mu$ is defined only on $S := F(Q)$.
Before stating the complete result, let us recall that a partition of a set $X$ is a collection of disjoint and non-empty subsets of $X$ whose union is the whole of $X$, and that a partition of an archive $X \subset Q$ induces a partition of $F(X) \subset F(Q)$ by the bijectivity of $F$ restricted to $X$. For convenience, we abbreviate the measure-theoretic $q$-averaged distance from a point $F(x) \in F(Q)$ to a set $F(Z) \subset F(Q)$ by $\delta_q(F(x), F(Z)) := \left( \fint_{v \in F(Z)} d(F(x), v)^q \right)^{\frac{1}{q}}$.
Theorem 8.
For $p, q \in \overline{\mathbb{R}}$, let $X, Y \subset Q$ denote archives whose images $F(X)$ and $F(Y)$ are of non-null finite measure in $F(Q)$. Moreover, assume that
1.
there exist finite partitions $X = \bigsqcup_{i=1}^{m} X_i$ and $Y = \bigsqcup_{i=1}^{m} Y_i$ such that $\forall i \in \{1, \dots, m\}$:
(a)
$F(X_i) \subset F(X)$ and $F(Y_i) \subset F(Y)$ are subsets of non-null finite measure in $F(Q)$;
(b)
$\forall x \in X_i$, $\forall y \in Y_i$: $x \preceq y$;
2.
$\forall x \in X$, $\forall y \in Y$: $x \preceq y$ implies $\delta_q(F(x), F(P)) \le \delta_q(F(y), F(P))$;
then, $I_{p,q}^{\mathrm{GD}}(X) \le I_{p,q}^{\mathrm{GD}}(Y)$.
Proof. 
By 1(a) of Theorem 8, the sets $X$ and $Y$ can be subdivided into the same number $m$ of subsets, and by 1(b) together with condition 2, if $x \in X_i$ and $y \in Y_i$ for any $i \in \{1, \dots, m\}$, then $\delta_q(F(x), F(P)) \le \delta_q(F(y), F(P))$. Therefore, we can take successive integral $p$-averages over $F(X_i)$ and, afterwards, over $F(Y_i)$ on both sides of this inequality to find that, for each $i$, we have
$$a_i^p := \frac{1}{|F(X_i)|} \int_{F(X_i)} \delta_q(v, F(P))^p\, dv \le \frac{1}{|F(Y_i)|} \int_{F(Y_i)} \delta_q(v, F(P))^p\, dv =: b_i^p. \tag{10}$$
For those $i \in \{1, \dots, m\}$ violating the inequality $\frac{|X_i|}{|X|} \le \frac{|Y_i|}{|Y|}$, we subdivide $X_i$ into a sufficiently fine partition of $m_i$ subsets $X_{i,1}, X_{i,2}, \dots, X_{i,m_i}$, with images under $F$ of non-null finite measure, so as to guarantee that, for all $j \in \{1, \dots, m_i\}$, we get
$$w_{i,j} := \frac{|F(X_{i,j})|}{|F(X)|} \le \frac{|F(Y_i)|}{|F(Y)|} =: \tilde{w}_i. \tag{11}$$
Notice that this is indeed possible because each $F(X_i)$ is of non-null finite measure. Since, $\forall x \in X_{i,j}$ and $\forall y \in Y_i$, we have $x \preceq y$, an inequality similar to Equation (10) also holds for them, i.e.,
$$a_{i,j}^p := \frac{1}{|F(X_{i,j})|} \int_{F(X_{i,j})} \delta_q(v, F(P))^p\, dv \le \frac{1}{|F(Y_i)|} \int_{F(Y_i)} \delta_q(v, F(P))^p\, dv =: b_i^p,$$
for all $i \in \{1, \dots, m\}$, $j \in \{1, \dots, m_i\}$. However, $|F(X)| = \sum_{i=1}^{m} |F(X_i)|$, where $|F(X_i)| = \sum_{j=1}^{m_i} |F(X_{i,j})|$ and $|F(Y)| = \sum_{i=1}^{m} |F(Y_i)|$. Therefore, with the notation of Equation (11), a simple calculation shows that $\sum_{i=1}^{m} \sum_{j=1}^{m_i} w_{i,j} = \sum_{i=1}^{m} \tilde{w}_i = 1$, implying that $w_{i,j}$ and $\tilde{w}_i$ are normalized weights useful for weighted averages. Since $0 \le a_{i,j} \le b_i$ and $0 \le w_{i,j} \le \tilde{w}_i \le 1$, simple properties of discrete weighted power means imply the inequality $\sum_{i=1}^{m} \sum_{j=1}^{m_i} w_{i,j}\, a_{i,j}^p \le \sum_{i=1}^{m} \tilde{w}_i\, b_i^p$. Thus, we can finally write
$$I_{p,q}^{\mathrm{GD}}(X)^p = \frac{1}{|F(X)|} \sum_{i=1}^{m} \sum_{j=1}^{m_i} \int_{F(X_{i,j})} \delta_q(v, F(P))^p\, dv = \sum_{i=1}^{m} \sum_{j=1}^{m_i} \frac{|F(X_{i,j})|}{|F(X)|}\, a_{i,j}^p = \sum_{i=1}^{m} \sum_{j=1}^{m_i} w_{i,j}\, a_{i,j}^p \le \sum_{i=1}^{m} \tilde{w}_i\, b_i^p = \sum_{i=1}^{m} \frac{|F(Y_i)|}{|F(Y)|}\, b_i^p = \frac{1}{|F(Y)|} \sum_{i=1}^{m} \int_{F(Y_i)} \delta_q(v, F(P))^p\, dv = I_{p,q}^{\mathrm{GD}}(Y)^p.$$
 □
Remark 6.
From condition 1 of Theorem 8, the following simpler (and somewhat weaker) dominance conditions follow:
(a)
$X \preceq Y$ (i.e., $\forall y \in Y$, $\exists x \in X$ such that $x \preceq y$), and
(b)
$\forall x \in X$, $\exists y \in Y$ such that $x \preceq y$.
For simple situations where (a) and (b) are valid, the partitions needed for part 1 of Theorem 8 are not difficult to find; however, this is not always possible, as the right side of Figure 6 indicates. Indeed, Figure 6 presents some examples where (a) and (b) hold true, but $I_{p,q}^{\mathrm{GD}}(X) \le I_{p,q}^{\mathrm{GD}}(Y)$ can be both true (left side) and false (right side). Furthermore, it is possible to show that $X$ and $Y$ comply (left side) and do not comply (right side) with condition 1 of Theorem 8, respectively.
Remark 7.
An important advantage of using $\mathrm{GD}_{p,q}$ over $\mathrm{GD}_p$ is that condition 2 of Theorem 8 provides the possibility of choosing an appropriate $q \in \overline{\mathbb{R}}$ for which the condition $\delta_q(F(x), F(P)) \le \delta_q(F(y), F(P))$ holds when $x \preceq y$, ensuring in this way the compliance to Pareto optimality for $\mathrm{GD}_{p,q}$. This freedom is lacking for $\mathrm{GD}_p$ because, in the limit $q \to -\infty$, the distance $\delta_q(F(x), F(P))$ becomes the standard distance $d(F(x), F(P))$, which does not allow for any choice.

7. Examples and Numerical Experiments

In this section, we present some numerical experiments involving finite sets first, and afterwards, we study the case of continuous sets.

7.1. Working with Δ p , q over Finite Sets

Let us take a hypothetical Pareto front given by the line segment from $(0,1)$ to $(1,0)$ in $\mathbb{R}^2$, i.e., the set of all points
$$(t, 1-t) \in \mathbb{R}^2, \quad \text{for } 0 \le t \le 1.$$
This is the same example considered in Reference [16] p. 506 and enables us to make a comparison with values of $\Delta_p$. In order to use the finite version of $\Delta_{p,q}$, we discretize this front by taking 11 uniformly distributed points and, from here on, let $P$ denote this discrete set. We assume two archives: $X_1$ is obtained from $P$ by replacing $(0,1)$ with $(0,10)$, thereby including an outlier, and by adding $1/10$ to the remaining ordinates. $X_2$ is obtained from $P$ by adding $5$ to each ordinate. See Figure 7.
As explained in Section 3, we know that
$$\Delta_{\infty}(A,B) := \lim_{p \to \infty} \Delta_p(A,B)$$
coincides with the standard Hausdorff distance $d_H$. In this case, we obtained
$$\Delta_1(P, X_1) = 0.9091, \quad \Delta_1(P, X_2) = 4.5412, \quad d_H(P, X_1) = 9, \quad d_H(P, X_2) = 5;$$
and according to Theorem 4 and Reference [16] p. 512, these values must increase as $p$ increases.
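The four values quoted above can be reproduced with a short script. The sketch below assumes the usual finite definitions, namely $\mathrm{GD}_p(A,B)$ as the $p$-power mean of the point-to-set distances $d(a,B)$ and $\Delta_p = \max\{\mathrm{GD}_p(A,B), \mathrm{GD}_p(B,A)\}$; the function names are illustrative.

```python
import math

def dist_to_set(a, B):
    """Standard distance from the point a to the finite set B."""
    return min(math.dist(a, b) for b in B)

def gd_p(A, B, p):
    """p-averaged distance from the points of A to the set B."""
    return (sum(dist_to_set(a, B) ** p for a in A) / len(A)) ** (1.0 / p)

def delta_p(A, B, p):
    """p-averaged Hausdorff distance between finite sets."""
    return max(gd_p(A, B, p), gd_p(B, A, p))

def hausdorff(A, B):
    """Standard Hausdorff distance between finite sets."""
    return max(max(dist_to_set(a, B) for a in A),
               max(dist_to_set(b, A) for b in B))

# Discretized front: 11 uniformly distributed points on the segment.
P = [(i / 10, 1 - i / 10) for i in range(11)]
# X1: (0, 1) replaced by the outlier (0, 10), other ordinates shifted by 1/10.
X1 = [(0.0, 10.0)] + [(x, y + 0.1) for (x, y) in P[1:]]
# X2: every ordinate shifted by 5.
X2 = [(x, y + 5.0) for (x, y) in P]

print(round(delta_p(P, X1, 1), 4), round(delta_p(P, X2, 1), 4))
print(hausdorff(P, X1), hausdorff(P, X2))
```

For $p = 1$ the single outlier contributes $9/11$ to the average, whereas it alone determines $d_H(P, X_1) = 9$; this is the effect discussed in the text.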
Table 1 and Table 2 show that we can find values of p and q such that the ( p , q ) -averaged distance does not punish heavily the outliers, for example, p = q = 1 or p = 1 and q = 1 . We remark that the values of Δ p , q ( P , X 1 ) do not present a significative change under variations of q 1 for a fixed p. Thus, it is possible to work with q = 1 , in which case Δ p , q is a metric according to Corollary 1, and to still obtain values close to the ones given by the inframetric Δ p , with the same p 1 .
For large values of $p$, the behavior of $\Delta_{p,q}$ presents the same disadvantages as $\Delta_p$ or the standard Hausdorff distance. For example, in Table 1 and Table 2, it can be observed that all distances for $p \ge 5$ are useless because they imply that the distance from the discrete Pareto front $P$ to the archive $X_1$ is larger than its distance to the archive $X_2$. Figure 7 suggests that this is an undesirable outcome.
Table 3 shows that Δ p , q is close to a metric when q 1 and p 1 . The percentage of triangle inequality violations decreases as p increases or q decreases.

7.2. Optimal Archives for Discretized Spherical Pareto Fronts

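We now consider two standard Pareto fronts: the spherical convex and spherical concave quarter-circles; see Figure 8 and Figure 9.
$$P_1 = \left\{ (\cos(\theta) + 1, \sin(\theta) + 1) : -\pi \le \theta \le -\tfrac{\pi}{2} \right\}, \quad P_2 = \left\{ (\cos(\theta), \sin(\theta)) : 0 \le \theta \le \tfrac{\pi}{2} \right\}. \tag{12}$$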
To numerically find the optimal Δ p , q archive of size M, we discretized the Pareto front with 1000 equidistant points (which is an acceptable discretization according to Reference [63] p. 603) and randomly chose an initial M sized archive. Then, we used a random-walk (or step climber) evolutionary algorithm, moving one point at a time. Finally, we refined the optimal archive with the “evenly spaced” construction suggested by Reference [63] p. 607.
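The search procedure described above can be sketched as a simple hill climber. The code below is only an illustration under several assumptions: the front is the convex quarter-circle centered at $(1,1)$, it is discretized with 200 points instead of 1000 to keep the run short, the archive is initialized uniformly in the unit square, one point is perturbed with Gaussian noise per step, and a move is kept only if it lowers $\Delta_{p,q}$. The final "evenly spaced" refinement is not included, and all names and parameter values (`hill_climb`, `sigma`, `steps`, `seed`) are hypothetical.

```python
import math
import random

def power_mean(values, s):
    """Power mean of non-negative numbers for a finite exponent s != 0."""
    values = list(values)
    if s < 0 and min(values) == 0.0:
        return 0.0  # limit convention when some distance vanishes
    return (sum(v ** s for v in values) / len(values)) ** (1.0 / s)

def gd(X, Y, p, q):
    return power_mean((power_mean((math.dist(x, y) for y in Y), q) for x in X), p)

def delta(X, Y, p, q):
    return max(gd(X, Y, p, q), gd(Y, X, p, q))

def hill_climb(front, M, p, q, steps=300, sigma=0.05, seed=0):
    """Random-walk search for an M-point archive with small Delta_{p,q}."""
    rng = random.Random(seed)
    archive = [(rng.random(), rng.random()) for _ in range(M)]
    best = delta(archive, front, p, q)
    history = [best]
    for _ in range(steps):
        i = rng.randrange(M)                      # move one point at a time
        x, y = archive[i]
        candidate = list(archive)
        candidate[i] = (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
        value = delta(candidate, front, p, q)
        if value < best:                          # keep improving moves only
            archive, best = candidate, value
        history.append(best)
    return archive, history

n = 200  # equal angular steps give equidistant points on the arc
angles = [-math.pi + k * (math.pi / 2) / (n - 1) for k in range(n)]
front = [(math.cos(t) + 1.0, math.sin(t) + 1.0) for t in angles]

archive, history = hill_climb(front, M=5, p=1.0, q=-1.0)
print(history[0], history[-1])
```

Such a greedy walk can stall in local minima, which is one reason a subsequent refinement step is useful in practice.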
When finding optimal $\Delta_{p,q}$ archives, our numerical experiments suggest a clear geometrical influence of the parameters $p$ and $q$. For values of $p$ in $(-\infty, 1)$, the optimal archive sets are basically the same. When $q \in [-1, 1]$ increases, the optimal archive tends to lose dispersion, converging to one point. When $q \ge 1$, the optimal archive collapses to one point, and when $q \in (-\infty, -1]$, the corresponding optimal archives are basically the same (see Figure 10). When $p \ge 1$ increases, the optimal archive moves away from the Pareto set (see Figure 11).
Figure 10 and Figure 11 show certain “optimal” archives $A$ for the Pareto front $P_1$ in Equation (12), where optimality means that the distance $\Delta_{p,q}(X, P_1)$ is minimal when $X = A$. Because of the choice of the parameters $p$ and $q$, this solution is clearly different from the one shown in Figure 8.

7.3. Optimal Archives for Disconnected and Discretized Pareto Sets

In this section, we present the optimal $\Delta_{p,q}$ archives for a disconnected step Pareto front:
$$P_3(s, \gamma) = \left\{ \left( t,\; 1 - \gamma t + (\gamma - 1) \frac{\lfloor s t \rfloor}{s} \right) : 0 \le t \le 1 \right\},$$
where $s$ is the number of steps, $\gamma > 0$ is a small constant responsible for the step’s twist, and $\lfloor \cdot \rfloor$ stands for the integer part function.
Figure 12 shows numerical optimal Δ 1 , 1 archives of sizes 20. The archive coordinates reveal that
$$A \cap P_3\!\left(5, \tfrac{1}{10}\right) = \emptyset,$$
i.e., the optimal archive points do not lie on the Pareto front, but they are so close to it that this is hardly noticeable. It is also evident that the archives are evenly distributed along the Pareto front.
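To experiment with the step front, it can be sampled as follows. This sketch assumes the parametrization $t \mapsto \left(t,\, 1 - \gamma t + (\gamma - 1)\lfloor s t \rfloor / s\right)$ for $0 \le t \le 1$, which is our reading of the formula above; the function names and the sample size are illustrative.

```python
import math

def step_front_point(t, s, gamma):
    """Point of the step front for the parameter value t in [0, 1]."""
    return (t, 1.0 - gamma * t + (gamma - 1.0) * math.floor(s * t) / s)

def step_front(s, gamma, n=1001):
    """Uniform sampling in t of the step front with s steps."""
    return [step_front_point(k / (n - 1), s, gamma) for k in range(n)]

front = step_front(s=5, gamma=0.1)

# Within a step the front descends with slope -gamma; between steps it jumps.
print(step_front_point(0.0, 5, 0.1))
print(step_front_point(0.3, 5, 0.1))
print(step_front_point(1.0, 5, 0.1))
```

With this reading the front starts at $(0,1)$ and ends at $(1,0)$, and each of the $s$ jumps has height $(1-\gamma)/s$.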

7.4. General Example for Continuous Sets

In this first example, we construct simple and illustrative continuous sets $A$ and $B$. Let $A$ be the straight segment in $\mathbb{R}^2$ from $a = (-1, 0)$ to $b = (1, 0)$, that is
$$A = \overline{ab}.$$
For a small $\varepsilon > 0$ and a variable $\delta > 0$, let $B_\delta \subset \mathbb{R}^2$ be the set given by the following union of straight segments
$$B_\delta = \overline{c\,d_\delta} \cup \overline{e_\delta f_\delta} \cup \overline{g_\delta h},$$
where $c = (-1, \varepsilon)$, $d_\delta = (-\delta, \varepsilon)$, $e_\delta = (-\delta, 1)$, $f_\delta = (\delta, 1)$, $g_\delta = (\delta, \varepsilon)$, and $h = (1, \varepsilon)$. We can regard the set $B_\delta$ as a continuous approximation of $A$, where the central segment $\overline{e_\delta f_\delta}$ can be seen as the outlier.
In Figure 13, we can see the sets $A$ and $B_\delta$ for $\varepsilon = 0.10$ and $\delta = 0.10, 0.20$. According to Table 4, as $\delta$ decreases, the $\Delta_{p,q}$ distance between the approximation $B_\delta$ and the set $A$ also decreases.
We remark that the classical Hausdorff distance between $A$ and $B_\delta$ produces the value $1$ for any $\delta > 0$. Thus, by working with the $(p,q)$-distance instead of $d_H$, we can detect “better” approximations.
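The insensitivity of the classical Hausdorff distance to $\delta$ can be confirmed on discretized versions of $A$ and $B_\delta$. The following sketch assumes the endpoints $a = (-1,0)$, $b = (1,0)$, $c = (-1,\varepsilon)$, $d_\delta = (-\delta,\varepsilon)$, $e_\delta = (-\delta,1)$, $f_\delta = (\delta,1)$, $g_\delta = (\delta,\varepsilon)$, and $h = (1,\varepsilon)$; the sampling density and helper names are arbitrary choices, and the result is only an approximation of the continuous value.

```python
import math

def segment(u, v, n):
    """n equally spaced points on the straight segment from u to v."""
    return [(u[0] + (v[0] - u[0]) * k / (n - 1),
             u[1] + (v[1] - u[1]) * k / (n - 1)) for k in range(n)]

def hausdorff(A, B):
    """Standard Hausdorff distance between finite point sets."""
    d_ab = max(min(math.dist(a, b) for b in B) for a in A)
    d_ba = max(min(math.dist(a, b) for a in A) for b in B)
    return max(d_ab, d_ba)

def build_sets(eps, delta, n=101):
    """Discretizations of the segment A and of the three pieces of B_delta."""
    A = segment((-1.0, 0.0), (1.0, 0.0), 2 * n)
    B = (segment((-1.0, eps), (-delta, eps), n)
         + segment((-delta, 1.0), (delta, 1.0), n)
         + segment((delta, eps), (1.0, eps), n))
    return A, B

results = {}
for delta in (0.10, 0.20):
    A, B = build_sets(eps=0.10, delta=delta)
    results[delta] = hausdorff(A, B)
    print(delta, results[delta])
```

Both values stay close to $1$, because the outlier segment at height $1$ determines the maximum regardless of its length.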

7.5. Approximating Pareto Set and Front of a MOP

Finally, we address the problem of approximating the Pareto sets and fronts of given MOPs. As an example, we will consider the bi-objective Lamé super-sphere problem [32], which is defined as follows:
$$\min_{x} F : \mathbb{R}^n \to \mathbb{R}^2,$$
where $F(x) = (f_1(x), f_2(x))$ is defined as
$$f_1(x) = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right)^{\frac{\gamma}{2}} \quad \text{and} \quad f_2(x) = \left( \frac{1}{n} \sum_{i=1}^{n} (x_i - 1)^2 \right)^{\frac{\gamma}{2}},$$
where $x \in \mathbb{R}^n$ and $\gamma \in \mathbb{R}$. For $n = 2$, the Pareto sets and fronts of this problem are shown in Figure 14 and Figure 15 for $\gamma = 2$ and $\gamma = 1/2$, respectively.
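For readers who wish to reproduce the fronts, the two objectives can be evaluated as in the following sketch. It assumes the reading $f_1(x) = \left(\tfrac{1}{n}\sum_i x_i^2\right)^{\gamma/2}$ and $f_2(x) = \left(\tfrac{1}{n}\sum_i (x_i-1)^2\right)^{\gamma/2}$ of the formulas above; the function name and the sampled diagonal are illustrative.

```python
def lame_supersphere(x, gamma):
    """Bi-objective Lame super-sphere function F(x) = (f1(x), f2(x))."""
    n = len(x)
    f1 = (sum(xi ** 2 for xi in x) / n) ** (gamma / 2.0)
    f2 = (sum((xi - 1.0) ** 2 for xi in x) / n) ** (gamma / 2.0)
    return (f1, f2)

# Images of points on the diagonal from (0, 0) to (1, 1) for n = 2.
diagonal = [(k / 10, k / 10) for k in range(11)]
front_gamma_2 = [lame_supersphere(x, 2.0) for x in diagonal]
front_gamma_half = [lame_supersphere(x, 0.5) for x in diagonal]

print(front_gamma_2[5], front_gamma_half[5])
```

On the diagonal $x = (t, \dots, t)$ the image is $(t^{\gamma}, (1-t)^{\gamma})$, so the exponent $\gamma$ controls the curvature of the sampled curve.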
In a first step, we discuss the principal difference between discrete and continuous archives when approximating the Pareto set/front on a hypothetical example. For this, we assume that we are given the 5-element archive $A = \{x_1, \dots, x_5\} \subset \mathbb{R}^2$; those elements are given by
$$x_1 = (0.02, 0.03), \quad x_2 = (0.29, 0.20), \quad x_3 = (0.41, 0.49), \quad x_4 = (0.70, 0.62), \quad x_5 = (1.02, 0.98).$$
Hence, we can see $A$ as a 5-element approximation of the Pareto set and its image $F(A)$ as a 5-element approximation of the Pareto front. Now, instead of $A$, we may use a polygonal curve that is defined by $A$:
$$B := \overline{x_1 x_2} \cup \cdots \cup \overline{x_4 x_5}.$$
In the following, we will call $A$ a discrete archive, while we call the polygonal approximation $B$ the continuous archive. Figure 16 and Figure 17 show the approximations $A$ and $B$ as well as their images $F(A)$ and $F(B)$. Visibly, the approximation quality is much better for the linear interpolants. This impression is confirmed by the values of $\Delta_{p,q}$ for this problem that are shown in Table 5. We can observe the following two behaviors: (i) the distances are much better for the continuous archives and the differences are even larger in objective space, and (ii) the distances decrease with decreasing $q$ (which is in accordance with the result of Theorem 4).
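In numerical work, the continuous archive $B$ is typically handled through a dense sampling of the polygonal curve through $x_1, \dots, x_5$. The sketch below shows one simple way to do this; the function name `sample_polyline` and the number of samples per segment are assumptions made for the illustration.

```python
def sample_polyline(vertices, per_segment=50):
    """Points on the polygonal curve through the given vertices."""
    points = [tuple(vertices[0])]
    for u, v in zip(vertices, vertices[1:]):
        for k in range(1, per_segment + 1):
            s = k / per_segment
            points.append(tuple(ui + s * (vi - ui) for ui, vi in zip(u, v)))
    return points

# Discrete archive A from the text.
A = [(0.02, 0.03), (0.29, 0.20), (0.41, 0.49), (0.70, 0.62), (1.02, 0.98)]

# Sampled version of the continuous archive B.
B = sample_polyline(A, per_segment=50)
print(len(A), len(B))
```

Using the same number of samples on every segment is not uniform in arc length; when the sample is meant to approximate an integral average over $B$, the number of points per segment should be chosen proportional to the segment length.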
In a next step, we consider discrete archives that have been generated from multi-objective evolutionary algorithms together with their resulting continuous archives. For multi-objective evolutionary algorithms, we have chosen the widely used methods NSGA-II [65] and MOEA/D [66]. We stress, however, that any other MOEA could be chosen and that the conclusions we draw out by our results apply in principle to any other such algorithm. Table 6 shows the parameter setting we have used for our studies.
Figure 18 and Figure 19 and Table 7 show the results of NSGA-II, where we have used 500 generations and a population size of 12. Figure 20 and Figure 21 and Table 8 show the respective results for MOEA/D, where we have also used 500 generations and a population size of 12. We can see that, for both algorithms, the $\Delta_{p,q}$ values are significantly better for the continuous archives. We can also make another observation: the $\Delta_{p,q}$ values oscillate for the results of the dominance-based algorithm NSGA-II, which is indeed typical. For the continuous archives, these oscillations are less pronounced, which indicates that the use of continuous archives may have a smoothing effect on the approximations, which is highly desirable.
We want to investigate the last statement further. To this end, we consider the following convex bi-objective problem: the objectives are given by $f_1, f_2 : \mathbb{R}^3 \to \mathbb{R}$, where
$$f_1(x) = (x_1 + 1)^2 + x_2^2 + x_3^2, \qquad f_2(x) = (x_1 - 1)^2 + x_2^2 + x_3^2. \tag{17}$$
Figure 22 shows the Pareto set and front of MOP (Equation (17)). The Pareto set is given by the straight segment joining ( 0 , 0 , 0 ) and ( 1 , 0 , 0 ) .
The values of Δ p , q obtained by NSGA-II for the discrete archives (using population size 20) as well as for the respective continuous archives can be seen in Figure 23 and Table 9. Also for this example, the values for the continuous archives are much better and the oscillations are significantly reduced compared to the discrete archives.
Figure 24, Figure 25 and Figure 26 show the results of both kinds of archives after 300, 400, and 500 generations, which confirm this observation. The results show that NSGA-II is indeed capable of computing points near the Pareto front, while the distribution of the points varies. This is a known fact since there exists no “limit archive” for this algorithm (as it is, e.g., not based on the averaged Hausdorff distance or any other performance indicator). When considering the respective results of the continuous archives, however, NSGA-II computes (at least visually) nearly perfect approximations of the Pareto front. The $\Delta_{p,q}$ values reflect this.

8. Conclusions and Perspectives

In this paper, we have presented a comprehensive overview of the averaged Hausdorff distances that have recently appeared in connection with the study of MOPs.
Among the averaged Hausdorff distances studied here, the generalized Δ p , q as defined for arbitrary measurable sets was shown to provide a general and robust definition for applications that carries good metric properties, is adequate for use with continuous approximations of the Pareto set of a MOP, and even reduces to the previously introduced definition for discrete approximations.
The additional parameter $q$ in the definition of $\Delta_{p,q}$ could give the impression of an overly complicated expression. It is important to highlight, as observed in Remark 7, that it provides the possibility of choosing a suitable value of $q$ in order to make $\mathrm{GD}_{p,q}$ as Pareto compliant as possible for the MOP under consideration. This is an argument in favor of the flexibility provided by the generalized version $\mathrm{GD}_{p,q}$, which is not available for the $\mathrm{GD}_p$ distance, and this particular aspect deserves further investigation.
Nevertheless, since the freedom provided by the two parameters $p$ and $q$ may appear excessive and perhaps undesirable in many applications, a practical recipe to determine and fix these parameters according to the characteristics of the problem under consideration remains to be found. Certainly, the desired spreads of the optimal archives, the distance of an approximation to the Pareto front, and the convexity of these fronts need to be taken into account in order to determine an appropriate set of preferred values for these parameters depending on the situation.
To achieve these aims, more theoretical as well as numerical studies of optimal solutions associated with Pareto fronts with different convexities must be carried out and experiments evaluating how the Pareto compliance can be enhanced in each situation by the choice of parameters need to be performed.
Finally, we stress that the results presented in Section 7 show the advantage of a new performance indicator that is able to measure the performance of a continuous approximation of the solution set. Continuous approximations, e.g., of the Pareto set/front of multi-objective optimization problems, have not been considered so far, though both Pareto set and front typically form continuous sets in case the objectives are continuous. The examples have indicated that the consideration of continuous archives (via the use of interpolation on the populations generated by the evolutionary algorithms) could allow a reduction in population sizes and, hence, a significant reduction of the computational effort of the evolutionary algorithms. This is because the time complexity for all existing multi-objective evolutionary algorithms is quadratic in the population size in each generation of the algorithm. To verify this statement, more computations are needed, which is left for future work.

Author Contributions

J.M.B. and A.V. obtained the theoretical results concerning the ( p , q ) -averaged Hausdorff distance, and O.S. conceived and designed the experiments; J.M.B. and A.V. performed the experiments and provided the related figures and tables; O.S. analyzed the data and contributed with the text. J.M.B. and A.V. wrote the paper.

Funding

The first two authors were partially supported by Vicerrectoría de Investigación, Pontificia Universidad Javeriana, Bogotá D.C., Colombia. The third author was supported by Conacyt Basic Science project No. 285599 and SEP Cinvestav project no. 231.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Heinonen, J. Lectures on Analysis on Metric Spaces; Springer: New York, NY, USA, 2001. [Google Scholar]
  2. de Carvalho, F.; de Souza, R.; Chavent, M.; Lechevallier, Y. Adaptive Hausdorff distances and dynamic clustering of symbolic interval data. Pattern Recogn. Lett. 2006, 27, 167–179. [Google Scholar] [CrossRef]
  3. Huttenlocher, D.P.; Klanderman, G.A.; Rucklidge, W.A. Comparing images using the Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15, 850–863. [Google Scholar] [CrossRef]
  4. Yi, X.; Camps, O.I. Line-based recognition using a multidimensional Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 901–916. [Google Scholar]
  5. Falconer, K. Fractal Geometry: Mathematical Foundations and Applications, 2nd ed.; Mathematical foundations and applications; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003. [Google Scholar]
  6. Aulbach, B.; Rasmussen, M.; Siegmund, S. Approximation of attractors of nonautonomous dynamical systems. Discrete Contin. Dyn. Syst. Ser. B 2005, 5, 215–238. [Google Scholar]
  7. Dellnitz, M.; Hohmann, A. A subdivision algorithm for the computation of unstable manifolds and global attractors. Numerische Mathematik 1997, 75, 293–317. [Google Scholar] [CrossRef]
  8. Emmerich, M.; Deutz, A.H. Test problems based on Lamé superspheres. In Proceedings of the 4th International Conference on Evolutionary Multi-criterion Optimization EMO’07, Matsushima, Japan, 5–8 March 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 922–936. [Google Scholar]
  9. Dellnitz, M.; Schütze, O.; Hestermeyer, T. Covering Pareto sets by multilevel subdivision techniques. J. Optim. Theory Appl. 2005, 124, 113–155. [Google Scholar] [CrossRef]
  10. Dilettoso, E.; Rizzo, S.A.; Salerno, N. A weakly Pareto compliant quality indicator. Math. Comput. Appl. 2017, 22, 25. [Google Scholar] [CrossRef]
  11. Padberg, K. Numerical Analysis of Transport in Dynamical Systems. Ph.D. Thesis, University of Paderborn, Paderborn, Germany, 2005. [Google Scholar]
  12. Peitz, S.; Dellnitz, M. A survey of recent trends in multiobjective optimal control—Surrogate models, feedback control and objective reduction. Math. Comput. Appl. 2018, 23, 30. [Google Scholar] [CrossRef]
  13. Schütze, O. Set Oriented Methods for Global Optimization. Ph.D. Thesis, University of Paderborn, Paderborn, Germany, 2004. [Google Scholar]
  14. Schütze, O.; Coello Coello, C.A.; Mostaghim, S.; Talbi, E.G.; Dellnitz, M. Hybridizing evolutionary strategies with continuation methods for solving multi-objective problems. Eng. Optim. 2008, 40, 383–402. [Google Scholar] [CrossRef]
  15. Schütze, O.; Laumanns, M.; Coello Coello, C.A.; Dellnitz, M.; Talbi, E.G. Convergence of stochastic search algorithms to finite size Pareto set approximations. J. Glob. Optim. 2008, 41, 559–577. [Google Scholar] [CrossRef]
  16. Schütze, O.; Esquivel, X.; Lara, A.; Coello Coello, C.A. Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Trans. Evol. Comput. 2012, 16, 504–522. [Google Scholar] [CrossRef]
  17. Vargas, A.; Bogoya, J.M. A generalization of the averaged Hausdorff distance. Computación y Sistemas 2018, 22, 331–345. [Google Scholar] [CrossRef]
  18. Bogoya, J.M.; Vargas, A.; Cuate, O.; Schütze, O. A (p,q)-averaged Hausdorff distance for arbitrary measurable sets. Math. Comput. Appl. 2018, 23, 51. [Google Scholar] [CrossRef]
  19. Cai, X.; Li, Y.; Fan, Z.; Zhang, Q. An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization. IEEE Trans. Evolut. Comput. 2015, 19, 508–523. [Google Scholar]
  20. Shang, R.; Wang, Y.; Wang, J.; Jiao, L.; Wang, S.; Qi, L. A multi-population cooperative coevolutionary algorithm for multi-objective capacitated arc routing problem. Inf. Sci. 2014, 277, 609–642. [Google Scholar] [CrossRef]
  21. Zhang, J.; Tang, Q.; Li, P.; Deng, D.; Chen, Y. A modified MOEA/D approach to the solution of multi-objective optimal power flow problem. Appl. Soft Comput. 2016, 47, 494–514. [Google Scholar] [CrossRef]
  22. Dhiman, G.; Kumar, V. Multi-objective spotted hyena optimizer: A Multi-objective optimization algorithm for engineering problems. Knowl. Based Syst. 2018, 150, 175–197. [Google Scholar] [CrossRef]
  23. López-Rubio, F.J.; López-Rubio, E. Features for stochastic approximation based foreground detection. Comput. Vision Image Underst. 2015, 133, 30–50. [Google Scholar] [CrossRef]
  24. Kerkhove, L.P.; Vanhoucke, M. Incentive contract design for projects: The owner’s perspective. Omega 2016, 62, 93–114. [Google Scholar] [CrossRef]
  25. Hansen, M.P.; Jaszkiewicz, A. Evaluating the Quality of Approximations to the Non-Dominated Set; IMM, Department of Mathematical Modelling, Technical University of Denmark: Kongens Lyngby, Denmark, 1998. [Google Scholar]
  26. Zitzler, E.; Thiele, L. Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 1999, 3, 257–271. [Google Scholar] [CrossRef]
  27. Siwel, J.; Yew-Soon, O.; Jie, Z.; Liang, F. Consistencies and contradictions of performance metrics in multiobjective optimization. IEEE Trans. Evol. Comput. 2014, 44, 2329–2404. [Google Scholar]
  28. Vargas, A. On the Pareto compliance of the averaged Hausdorff distance as a performance indicator. Universitas Scientiarum 2018, 23, 333–354. [Google Scholar] [CrossRef]
  29. Miettinen, K. Nonlinear Multiobjective Optimization; Kluwer Academic Publishers: Tranbjerg, Denmark, 1999. [Google Scholar]
  30. Ehrgott, M.; Wiecek, M.M. Multiobjective programming. In Multiple Criteria Decision Analysis: State of the Art Surveys; Springer: New York, NY, USA, 2005; pp. 667–722. [Google Scholar]
  31. Pareto, V. Manual of Political Economy; The Macmillan Press: London, UK, 1971. [Google Scholar]
  32. Hillermeier, C. Nonlinear Multiobjective Optimization: A Generalized Homotopy Approach; Springer Science & Business Media: Berlin, Germany, 2001; Volume 135. [Google Scholar]
  33. Köppen, M.; Yoshida, K. Many-objective particle swarm optimization by gradual leader selection. In Proceedings of the 8th international conference on adaptive and natural computing algorithms (ICANNGA 2007), Warsaw, Poland, 11–14 April 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 323–331. [Google Scholar]
  34. Schütze, O.; Lara, A.; Coello Coello, C.A. On the influence of the number of objectives on the hardness of a multiobjective optimization problem. IEEE Trans. Evolut. Comput. 2011, 15, 444–455. [Google Scholar] [CrossRef]
  35. Schaffer, J.D. Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. Ph.D. Thesis, Vanderbilt University, Nashville, TN, USA, 1984. [Google Scholar]
  36. Amini, A.; Tavakkoli-Moghaddam, R. A bi-objective truck scheduling problem in a cross-docking center with probability of breakdown for trucks. Comput. Ind. Eng. 2016, 96, 180–191. [Google Scholar] [CrossRef]
  37. Li, M.W.; Hong, W.C.; Geng, J.; Wang, J. Berth and quay crane coordinated scheduling using multi-objective chaos cloud particle swarm optimization algorithm. Neural Comput. Appl. 2017, 28, 3163–3182. [Google Scholar] [CrossRef]
  38. Dulebenets, M.A. A comprehensive multi-objective optimization model for the vessel scheduling problem in liner shipping. Int. J. Prod. Econ. 2018, 196, 293–318. [Google Scholar] [CrossRef]
  39. Goodarzi, A.H.; Nahavandi, N.; Hessameddin, S. A multi-objective imperialist competitive algorithm for vehicle routing problem in cross-docking networks with time windows. J. Ind. Syst. Eng. 2018, 11, 1–23. [Google Scholar]
  40. Venturini, G.; Iris, C.; Kontovas, C.A.; Larsen, A. The multi-port berth allocation problem with speed optimization and emission considerations. Transp. Res.Part D Transp. Environ. 2017, 54, 142–159. [Google Scholar] [CrossRef]
  41. Chargui, T.; Bekrar, A.; Reghioui, M.; Trentesaux, D. Multi-objective sustainable truck scheduling in a rail-road physical internet cross-docking hub considering energy consumption. Sustainability 2019, 11, 3127. [Google Scholar] [CrossRef]
  42. Fliege, J.; Graña, L.M.; Svaiter, B.F. Newton’s method for multiobjective optimization. SIAM J. Opt. 2009, 20, 602–626. [Google Scholar] [CrossRef]
  43. Das, I.; Dennis, J. Normal-boundary intersection: A new method for generating the Pareto surface in nonlinear multicriteria optimization problems. SIAM J. Opt. 1998, 8, 631–657. [Google Scholar] [CrossRef]
  44. Eichfelder, G. Adaptive Scalarization Methods in Multiobjective Optimization; Springer: Berlin Heidelberg, Germany, 2008. [Google Scholar]
  45. Fliege, J. Gap-free computation of Pareto-points by quadratic scalarizations. Math. Methods Operat. Res. 2004, 59, 69–89. [Google Scholar] [CrossRef]
  46. Pereyra, V. Fast computation of equispaced Pareto manifolds and Pareto fronts for multiobjective optimization problems. Math. Comput. Simul. 2009, 79, 1935–1947. [Google Scholar] [CrossRef]
  47. Wang, H. Zigzag search for continuous multiobjective optimization. INFORMS J. Comp. 2013, 25, 654–665. [Google Scholar] [CrossRef]
  48. Martin, B.; Goldsztejn, A.; Granvilliers, L.; Jermann, C. Certified parallelotope continuation for one-manifolds. SIAM J. Numer. Anal. 2013, 51, 3373–3401. [Google Scholar] [CrossRef]
  49. Pereyra, V.; Saunders, M.; Castillo, J. Equispaced Pareto front construction for constrained bi-objective optimization. Math. Comput. Model 2013, 57, 2122–2131. [Google Scholar] [CrossRef]
  50. Martin, B.; Goldsztejn, A.; Granvilliers, L.; Jermann, C. On continuation methods for non-linear bi-objective optimization: Towards a certified interval-based approach. J. Glob. Optim. 2014, 64, 1–14. [Google Scholar] [CrossRef]
  51. Schütze, O.; Martín, A.; Lara, A.; Alvarado, S.; Salinas, E.; Coello Coello, C.A. The directed search method for multiobjective memetic algorithms. J. Comput. Optim. Appl. 2016, 63, 305–332. [Google Scholar] [CrossRef]
  52. Martín, A.; Schütze, O. Pareto Tracer: A predictor-corrector method for multi-objective optimization problems. Eng. Optim. 2018, 50, 516–536. [Google Scholar] [CrossRef]
  53. Jahn, J. Multiobjective search algorithm with subdivision technique. Comput. Optim. Appl. 2006, 35, 161–175. [Google Scholar] [CrossRef]
  54. Sun, J.Q.; Xiong, F.R.; Schütze, O.; Hernández, C. Cell Mapping Methods-Algorithmic Approaches and Applications; Springer: Singapore, 2019. [Google Scholar]
  55. Deb, K. Multi-Objective Optimization Using Evolutionary Algorithms; John Wiley & Sons: Chichester, UK, 2001. [Google Scholar]
  56. Coello Coello, C.A.; Lamont, G.B.; Van Veldhuizen, D.A. Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd ed.; Springer: New York, NY, USA, 2007. [Google Scholar]
  57. Sun, Y.; Gao, Y.; Shi, X. Chaotic multi-objective particle swarm optimization algorithm incorporating clone immunity. Mathematics 2019, 7, 146. [Google Scholar] [CrossRef]
  58. Wang, P.; Xue, F.; Li, H.; Cui, Z.; Xie, L.; Chen, J. A multi-objective DV-hop localization algorithm based on NSGA-II in internet of things. Mathematics 2019, 7, 184. [Google Scholar] [CrossRef]
  59. Pei, Y.; Yu, J.; Takagi, H. Search acceleration of evolutionary multi-objective optimization using an estimated convergence point. Mathematics 2019, 7, 129. [Google Scholar] [CrossRef]
  60. Bullen, P.S. Handbook of Means and Their Inequalities; Vol. 560, Mathematics and its Applications; Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 2003; p. xxviii+537. [Google Scholar]
  61. Van Veldhuizen, D.A.; Lamont, G.B. Multiobjective evolutionary algorithm test suites. In Proceedings of the 1999 ACM symposium on Applied Computing, San Antonio, TX, USA, 28 February–2 March 1999; ACM: New York, NY, USA, 1999; pp. 351–357. [Google Scholar]
  62. Coello Coello, C.A.; Cruz Cortés, N. Solving multiobjective optimization problems using an artificial immune system. Genet. Program. Evol. Mach. 2005, 6, 163–190. [Google Scholar] [CrossRef]
  63. Rudolph, G.; Schütze, O.; Grimme, C.; Domínguez-Medina, C.; Trautmann, H. Optimal averaged Hausdorff archives for bi-objective problems: Theoretical and numerical results. Comput. Optim. Appl. 2016, 64, 589–618. [Google Scholar] [CrossRef]
  64. Goldberg, M. Equivalence constants for p norms of matrices. Linear Multilinear Algebra 1987, 21, 173–179. [Google Scholar] [CrossRef]
  65. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. [Google Scholar] [CrossRef]
  66. Zhang, Q.; Li, H. MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 2007, 11, 712–731. [Google Scholar] [CrossRef]
Figure 1. Left: the objectives $f_1(x) = x^2$ and $f_2(x) = (x-2)^2$ from a multi-objective optimization problem (MOP; Equation (1)). Right: the corresponding Pareto set over the interval $[0, 2]$.
Figure 2. According to Corollary 1, when acting on disjoint subsets, $\Delta_{p,q}$ behaves as a proper metric if $(p, q)$ lies in the blue sector, and according to Corollary 2, it behaves like an inframetric if $(p, q)$ lies in the orange sectors.
Figure 3. Different scenarios where the $\mathrm{GD}_p$ value of archive $B$ is better (smaller) than the $\mathrm{GD}_p$ value of archive $A$ independently of the Pareto set and where the additional assumptions made in Theorem 5 are easily verifiable.
Figure 4. Two situations where $\mathrm{IGD}_{p,q}(B)$ is better (smaller) than $\mathrm{IGD}_{p,q}(A)$ for sufficiently negative $q$. Here, the hypotheses of Proposition 3 hold true.
Figure 5. Four examples where $\mathrm{IGD}_{p}(B)$ is smaller (better) than $\mathrm{IGD}_{p,q}(A)$ for sufficiently negative $q$. In each case, at least one of the requirements of Theorem 6 is satisfied.
Figure 6. (Left) A situation where the Pareto front F ( P ) and the images F ( X ) and F ( Y ) of continuous archives satisfy I p , q GD ( X ) I p , q GD ( Y ) and condition 1(a) of Theorem 8 holds true. (Right) A modification of the previous situation where conditions ( a ) and ( b ) of Remark 6 are satisfied but I p , q GD ( X ) I p , q GD ( Y ) . Here, there are no possible partitions of the archives satisfying part 1(a) of Theorem 8.
Figure 7. A hypothetical Pareto front discretization $P$ (black circles) and two different archives: $X_1$ (blue dots) and $X_2$ (orange squares).
Figure 8. Optimal $\Delta_{1,1}$ archive $A$ with 10 elements (blue circles) for the connected Pareto front $P_1$ given by Equation (12). The respective archive coordinates and the $\Delta_{1,1}$ distance are shown on the right.
Figure 9. Optimal $\Delta_{1,1}$ archive $A$ with 10 elements (blue circles) for the connected Pareto front $P_2$ given by Equation (12). The respective archive coordinates and the $\Delta_{1,1}$ distance are shown on the right.
Figure 10. Optimal $\Delta_{1,q}$ five-point archives $A$ for the connected Pareto front $P_1$ given by Equation (12) with $p = 1$ and $q = \pm 1/2$.
Figure 11. Optimal $\Delta_{p,1}$ one-point archives $A$ for the connected Pareto front $P_1$ given by Equation (12) with $q = 1$ and different values of $p$. In all cases, the archives are located on the line $x = y$.
Figure 12. Numerically optimal $\Delta_{1,1}$ archive $A$ with 20 elements for the disconnected step Pareto front $P_3(5)$ given by Equation (13). Here, we obtain $\Delta_{1,1}\left(A, P_3\left(5, \tfrac{1}{10}\right)\right) = 0.111132$.
Figure 13. The black horizontal segment is the set $A$ from Equation (14), and the blue piecewise map is the respective approximation given by the set $B_\delta$ from Equation (15) for two values of $\delta$ and $\varepsilon = 0.10$.
Figure 14. (Left) Pareto set. (Right) Pareto front of MOP (Equation (16)) for $n = 2$ and $\gamma = 2$.
Figure 15. The same as in Figure 14 but for $\gamma = 1/2$.
Figure 16. (Left) The blue dots $A$ and the blue polygonal line $B$ are the discrete and continuous approximations, respectively, of the Pareto set (thick orange segment) of MOP (Equation (16)) for $n = 2$. (Right) The respective sets $F(A)$ and $F(B)$ approximating the Pareto front for $\gamma = 2$.
Figure 17. The same as in Figure 16 but for $\gamma = 1/2$.
Figure 18. (Left) The blue dots $A$ and the blue polygonal line are the discrete and continuous approximations, respectively, for the Pareto set, and the orange thick segment is for the 410th generation of the NSGA-II algorithm of MOP (Equation (16)) for $n = 2$. (Right) The corresponding sets $F(A)$ and $F(B)$ of the Pareto front for $\gamma = 2$.
Figure 19. The same as in Figure 18 but for $\gamma = 1/2$.
Figure 20. (Left) The blue dots $A$ and the blue polygonal line are the discrete and continuous approximations, respectively, for the Pareto set, and the orange thick segment is for the 410th generation of the MOEA/D algorithm of MOP (Equation (17)) for $n = 2$. (Right) The corresponding sets $F(A)$ and $F(B)$ of the Pareto front for $\gamma = 2$.
Figure 21. The same as in Figure 20 but for $\gamma = 1/2$.
Figure 22. (Left) Pareto set. (Right) Pareto front of MOP (Equation (17)).
Figure 23. The black curve is the $\Delta_{p,q}$ value for the discrete approximation, and the blue one is the respective curve for the continuous approximation of NSGA-II for MOP (Equation (17)).
Figure 24. (Left) The blue dots and the blue polygonal line are the discrete and continuous approximations, respectively, of the Pareto set of MOP (Equation (17)) in the 300th generation. (Right) The respective sets $F(A)$ and $F(B)$ of the Pareto front.
Figure 25. The same as in Figure 24 but for the 400th generation.
Figure 26. The same as in Figure 24 but for the 500th generation.
Table 1. $\Delta_{p,q}(P, X_1)$ for several values of $p$ and $q$.

| $q$ \ $p$ | 1 | 2 | 5 | 10 | 20 |
|---|---|---|---|---|---|
| $-\infty$ | 0.9091 | 2.7153 | 5.5714 | 7.0811 | 7.9831 |
| $-100$ | 0.9272 | 2.7701 | 5.6839 | 7.2241 | 8.1443 |
| $-20$ | 0.9537 | 2.8367 | 5.8202 | 7.3974 | 8.3396 |
| $-5$ | 0.9895 | 2.8624 | 5.8705 | 7.4613 | 8.4117 |
| $-1$ | 1.1131 | 2.8782 | 5.8848 | 7.4795 | 8.4322 |
| 1 | 1.3243 | 2.9112 | 5.8920 | 7.4886 | 8.4425 |
| 2 | 2.9277 | 2.9295 | 5.8956 | 7.4932 | 8.4476 |
| 5 | 5.8920 | 5.8956 | 5.9063 | 7.5068 | 8.4630 |
| 10 | 7.4886 | 7.4932 | 7.5068 | 7.5292 | 8.4882 |
Table 2. $\Delta_{p,q}(P, X_2)$ for several values of $p$ and $q$.

| $q$ \ $p$ | 1 | 2 | 5 | 10 | 20 |
|---|---|---|---|---|---|
| $-\infty$ | 4.5412 | 4.5497 | 4.5751 | 4.6160 | 4.6867 |
| $-100$ | 4.6442 | 4.6529 | 4.6790 | 4.7209 | 4.7933 |
| $-20$ | 4.8425 | 4.8518 | 4.8795 | 4.9239 | 5.0003 |
| $-5$ | 4.9624 | 4.9720 | 5.0007 | 5.0465 | 5.1250 |
| $-1$ | 5.0008 | 5.0105 | 5.0394 | 5.0856 | 5.1646 |
| 1 | 5.0203 | 5.0301 | 5.0591 | 5.1055 | 5.1848 |
| 2 | 5.0301 | 5.0398 | 5.0690 | 5.1154 | 5.1949 |
| 5 | 5.0591 | 5.0690 | 5.0983 | 5.1450 | 5.2248 |
| 10 | 5.1055 | 5.1154 | 5.1450 | 5.1921 | 5.2725 |
Table 3. Triangle inequality violations, in percentage, for several values of $p$ and $q$. Here, we randomly chose 80 sets, each one containing 2 points in $[0, 10]^2$, and verified the triangle inequality for all possible set permutations (that is, 492,960).

| $q$ \ $p$ | 1 | 2 | 5 | 10 |
|---|---|---|---|---|
| $-1$ | 0.05396 | 0 | 0 | 0 |
| $-2$ | 0.10265 | 0.00041 | 0 | 0 |
| $-5$ | 0.28815 | 0.01217 | 0 | 0 |
| $-10$ | 0.35622 | 0.05031 | 0.00041 | 0 |
| $-20$ | 0.43046 | 0.08439 | 0.00446 | 0.00041 |
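
The count quoted in the caption of Table 3 can be checked directly: 80 sets give 80 · 79 · 78 = 492,960 ordered triples of distinct sets. The following is a minimal sketch of such an experiment, not the authors' code. It assumes the finite-set form of $\mathrm{GD}_{p,q}(A, B)$ as the $p$-power mean over $a \in A$ of the $q$-power mean over $b \in B$ of the Euclidean distances, and $\Delta_{p,q}$ as the larger of the two directed values for disjoint sets. All function names, the seed, and the reduced sample size in the usage are hypothetical choices.

```python
import itertools
import math
import random


def power_mean(values, r):
    """Power mean with exponent r != 0 (values must be positive if r < 0)."""
    return (sum(v ** r for v in values) / len(values)) ** (1.0 / r)


def gd_pq(a_set, b_set, p, q):
    """Assumed finite-set form: p-mean over a of the q-mean over b of d(a, b)."""
    inner = [power_mean([math.dist(a, b) for b in b_set], q) for a in a_set]
    return power_mean(inner, p)


def delta_pq(a_set, b_set, p, q):
    """Assumed form for disjoint sets: the larger of the two directed values."""
    return max(gd_pq(a_set, b_set, p, q), gd_pq(b_set, a_set, p, q))


def count_ordered_triples(n_sets):
    """Number of ordered triples of distinct sets, n (n - 1) (n - 2)."""
    return n_sets * (n_sets - 1) * (n_sets - 2)


def random_sets(n_sets, rng):
    """n_sets random sets, each with 2 points drawn uniformly from [0, 10]^2."""
    return [
        [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(2)]
        for _ in range(n_sets)
    ]


def violation_percentage(sets, p, q, tol=1e-12):
    """Percentage of ordered triples (A, B, C) with d(A, C) > d(A, B) + d(B, C)."""
    n = len(sets)
    # Precompute all pairwise distances once; the diagonal is never used.
    d = [
        [delta_pq(sets[i], sets[j], p, q) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    violations = sum(
        1
        for i, j, k in itertools.permutations(range(n), 3)
        if d[i][k] > d[i][j] + d[j][k] + tol
    )
    return 100.0 * violations / count_ordered_triples(n)


if __name__ == "__main__":
    print(count_ordered_triples(80))
    sample = random_sets(20, random.Random(0))  # smaller than 80 to keep it quick
    for p, q in [(1, 1), (1, -1), (2, -5)]:
        print(p, q, violation_percentage(sample, p, q))
```

Because the assumed $\Delta_{p,q}$ is symmetric, each violating configuration is counted twice among the ordered triples, which is consistent with the smallest nonzero entry of Table 3: 2 out of 492,960 is about 0.00041%.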
Table 4. $\Delta_{p,q}$ results between the sets $A$ and $B_\delta$ in Equations (14) and (15) for $\varepsilon = 0.10$ and some parameter values of $p$, $q$, and $\delta$.

| $p$ | $q$ | $\Delta_{p,q}(A, B_{0.05})$ | $\Delta_{p,q}(A, B_{0.10})$ | $\Delta_{p,q}(A, B_{0.20})$ | $\Delta_{p,q}(A, B_{0.40})$ |
|---|---|---|---|---|---|
| 1 | 1 | 0.7149 | 0.7464 | 0.8091 | 0.9324 |
| 1 | $-1$ | 0.4105 | 0.4506 | 0.5311 | 0.6945 |
| 1 | $-100$ | 0.1503 | 0.1961 | 0.2878 | 0.4711 |
| 1 | $-200$ | 0.1479 | 0.1934 | 0.2844 | 0.4663 |
| 1 | $-10{,}000$ | 0.1451 | 0.1901 | 0.2802 | 0.4602 |
Table 5. $\Delta_{p,q}$ results for the approximations of the Pareto set and front for MOP (Equation (16)).

| $\gamma$ | $p$ | $q$ | Decision space, finite arch. | Decision space, cont. arch. | Objective space, finite arch. | Objective space, cont. arch. |
|---|---|---|---|---|---|---|
| 2 | 1 | 1 | 0.5262 | 0.4775 | 0.4377 | 0.3851 |
| 2 | 1 | $-1$ | 0.2710 | 0.2017 | 0.2051 | 0.1070 |
| 2 | 1 | $-100$ | 0.1121 | 0.0341 | 0.0862 | 0.0040 |
| 2 | 1 | $-200$ | 0.1112 | 0.0333 | 0.0855 | 0.0039 |
| 2 | 1 | $-10{,}000$ | 0.1103 | 0.0324 | 0.0848 | 0.0038 |
| 1/2 | 1 | 1 | 0.5262 | 0.4775 | 0.5520 | 0.4965 |
| 1/2 | 1 | $-1$ | 0.2710 | 0.2017 | 0.2587 | 0.1120 |
| 1/2 | 1 | $-100$ | 0.1121 | 0.0341 | 0.1079 | 0.0012 |
| 1/2 | 1 | $-200$ | 0.1112 | 0.0333 | 0.1071 | 0.0012 |
| 1/2 | 1 | $-10{,}000$ | 0.1103 | 0.0324 | 0.1062 | 0.0011 |
Table 6. Parameter setting for NSGA-II and MOEA/D. Here, $n$ denotes the dimension of the decision variable space.

| Algorithm | Parameter | Value |
|---|---|---|
| NSGA-II | Population size | 12 |
| NSGA-II | Number of generations | 500 |
| NSGA-II | Crossover probability | 0.8 |
| NSGA-II | Mutation probability | $1/n$ |
| NSGA-II | Distribution index for crossover | 20 |
| NSGA-II | Distribution index for mutation | 20 |
| MOEA/D | Population size | 12 |
| MOEA/D | Number of weight vectors | 12 |
| MOEA/D | Number of generations | 500 |
| MOEA/D | Crossover probability | 1 |
| MOEA/D | Mutation probability | $1/n$ |
| MOEA/D | Distribution index for crossover | 30 |
| MOEA/D | Distribution index for mutation | 20 |
| MOEA/D | Aggregation function | Tchebycheff |
| MOEA/D | Neighborhood size | 3 |
Table 7. $\Delta_{p,q}$ results for the finite and continuous Pareto front approximations of MOP (Equation (16)), using the NSGA-II generated archives for $p = 1$ and $q = 10$.

| Generation | $\gamma = 1/2$, finite arch. | $\gamma = 1/2$, cont. arch. | $\gamma = 2$, finite arch. | $\gamma = 2$, cont. arch. |
|---|---|---|---|---|
| 50 | 0.0439 | 0.0147 | 0.0696 | 0.0160 |
| 100 | 0.0498 | 0.0109 | 0.0540 | 0.0102 |
| 200 | 0.0613 | 0.0118 | 0.0716 | 0.0207 |
| 250 | 0.0651 | 0.0265 | 0.0572 | 0.0061 |
| 400 | 0.0602 | 0.0102 | 0.0723 | 0.0276 |
| 450 | 0.0630 | 0.0154 | 0.0584 | 0.0088 |
| 460 | 0.0612 | 0.0154 | 0.0658 | 0.0098 |
| 470 | 0.0523 | 0.0102 | 0.0566 | 0.0083 |
| 480 | 0.0754 | 0.0269 | 0.0684 | 0.0241 |
| 490 | 0.0510 | 0.0091 | 0.0584 | 0.0118 |
| 500 | 0.0722 | 0.0097 | 0.0560 | 0.0103 |
Table 8. $\Delta_{p,q}$ results for the finite and continuous Pareto front approximations of MOP (Equation (16)), using the MOEA/D generated archives for $p = 1$ and $q = 10$.

| Generation | $\gamma = 1/2$, finite arch. | $\gamma = 1/2$, cont. arch. | $\gamma = 2$, finite arch. | $\gamma = 2$, cont. arch. |
|---|---|---|---|---|
| 50 | 0.0610 | 0.0171 | 0.0648 | 0.0119 |
| 100 | 0.0519 | 0.0051 | 0.1093 | 0.0016 |
| 200 | 0.0536 | 0.0037 | 0.0781 | 0.0009 |
| 250 | 0.0522 | 0.0037 | 0.0790 | 0.0008 |
| 400 | 0.0511 | 0.0017 | 0.0784 | 0.0009 |
| 450 | 0.0511 | 0.0017 | 0.0784 | 0.0009 |
| 460 | 0.0509 | 0.0012 | 0.0784 | 0.0009 |
| 470 | 0.0509 | 0.0012 | 0.0784 | 0.0009 |
| 480 | 0.0509 | 0.0010 | 0.0783 | 0.0009 |
| 490 | 0.0509 | 0.0010 | 0.0783 | 0.0009 |
| 500 | 0.0509 | 0.0010 | 0.0783 | 0.0009 |
Table 9. $\Delta_{p,q}$ results between the Pareto front and its respective discrete and continuous approximations of NSGA-II for MOP (Equation (17)). The data shown are averaged over the 20 independent runs described above.

| Generation | Continuous archive | Finite archive |
|---|---|---|
| 20 | 0.1333 | 0.2401 |
| 40 | 0.0176 | 0.1451 |
| 60 | 0.0090 | 0.1561 |
| 80 | 0.0088 | 0.1355 |
| 100 | 0.0065 | 0.1472 |
| 120 | 0.0074 | 0.1412 |
| 140 | 0.0081 | 0.1395 |
| 160 | 0.0075 | 0.1549 |
| 180 | 0.0092 | 0.1468 |
| 200 | 0.0074 | 0.1429 |
| 220 | 0.0066 | 0.1408 |
| 240 | 0.0075 | 0.1397 |
| 260 | 0.0066 | 0.1460 |
| 280 | 0.0074 | 0.1439 |
| 300 | 0.0084 | 0.1421 |
| 320 | 0.0070 | 0.1352 |
| 340 | 0.0070 | 0.1373 |
| 360 | 0.0081 | 0.1454 |
| 380 | 0.0079 | 0.1413 |
| 400 | 0.0066 | 0.1388 |
| 420 | 0.0063 | 0.1400 |
| 440 | 0.0097 | 0.1384 |
| 460 | 0.0067 | 0.1418 |
| 480 | 0.0067 | 0.1421 |
| 500 | 0.0076 | 0.1426 |