The Averaged Hausdorff Distances in Multi-Objective Optimization: A Review

Bogoya, Johan M.; Vargas, Andrés; Schütze, Oliver

doi:10.3390/math7100894

by Johan M. Bogoya ¹, Andrés Vargas ¹ and Oliver Schütze ²,³,*

¹ Departamento de Matemáticas, Pontificia Universidad Javeriana, Cra. 7 N. 40-62, Bogotá D.C. 111321, Colombia

² Computer Science Department, CINVESTAV-IPN, Av. IPN 2508, Col. San Pedro Zacatenco, Mexico City 07360, Mexico

³ Dr. Rodolfo Quintero Ramirez Chair, UAM Cuajimalpa, Mexico City 05348, Mexico

* Author to whom correspondence should be addressed.

Mathematics 2019, 7(10), 894; https://doi.org/10.3390/math7100894

Submission received: 27 August 2019 / Revised: 10 September 2019 / Accepted: 17 September 2019 / Published: 24 September 2019

(This article belongs to the Special Issue Recent Trends in Multiobjective Optimization and Optimal Control)

Abstract

A brief but comprehensive review of the averaged Hausdorff distances that have recently been introduced as quality indicators in multi-objective optimization problems (MOPs) is presented. First, we introduce all the necessary preliminaries, definitions, and known properties of these distances in order to provide a state-of-the-art overview of their behavior from a theoretical point of view. The presentation treats separately the definitions of the

(p, q)

-distances

{GD}_{p, q}

,

{IGD}_{p, q}

, and

Δ_{p, q}

for finite sets and their generalization for arbitrary measurable sets that covers as an important example the case of continuous sets. Among the presented results, we highlight the rigorous consideration of metric properties of these definitions, including a proof of the triangle inequality for distances between disjoint subsets when

p, q ⩾ 1

, and the study of the behavior of the associated indicators with respect to the notion of compliance to Pareto optimality. Illustrations of these results in particular situations are also provided. Finally, we discuss a collection of examples and numerical results obtained for the discrete and continuous incarnations of these distances, which allow us to evaluate their usefulness in concrete situations and to draw some conclusions that justify their use and further study.

Keywords:

Averaged Hausdorff distance; evolutionary multi-objective optimization; Pareto compliance; performance indicator; power means

1. Introduction

In many real-world applications, the problem of concurrent or simultaneous optimization of several objectives is an essential task known as a multi-objective optimization problem (MOP). One important problem in multi-objective optimization is to compute a suitable finite size approximation of the solution set of a given MOP, the so-called Pareto set and its image, the Pareto front.

The Hausdorff distance

d_{H}

(e.g., Reference [1]) measures how far two subsets of a metric space are from each other. Due to its properties, it is frequently used in many research areas such as computer vision [2,3,4], fractal geometry [5], the numerical computation of attractors in dynamical systems [6,7,8], or the convergence of multi-objective algorithms to the Pareto set/front of a given multi-objective optimization problem [9,10,11,12,13,14,15]. One possible drawback of the classical Hausdorff distance, however, is that it punishes single outliers, which leads to inequitable performance evaluations in some cases. As one example, we mention here multi-objective evolutionary algorithms. On the one hand, such algorithms are known to be very effective in the (global) approximation of the Pareto set/front. On the other hand, it is also known that the final approximations (populations) may contain some outliers (e.g., Reference [16]). For such cases, the Hausdorff distance may indicate a “bad” match of population and Pareto set/front, while the approximation quality may indeed be “good”. To avoid exactly this problem, Schütze et al. introduced the averaged Hausdorff distance

Δ_{p}

in Reference [16], but the initial definition only works for finite approximations of the solution set and does not behave as a proper metric in the formal mathematical sense. In Reference [17], the indicator

Δ_{p, q}

has been proposed by the first two authors of this paper.

Δ_{p, q}

is an averaged Hausdorff distance that fixes the metric behavior of

Δ_{p}

. Later, in Reference [18], a broader definition was given on metric measure spaces, suitable for the consideration of continuous approximations of the solution set. Moreover, this generalized indicator

Δ_{p, q}

preserves the nice metric properties of the initial finite case and reduces to it when using the standard discrete measure.

While the averaged Hausdorff distance has so far mostly been used for performance assessment of multi-objective evolutionary algorithms (using benchmark functions), it has also been used on MOPs coming from real-world problems including the multi-objective software next release problem [19], arc routing problems [20], power flow problems [21], engineering design problems [22], foreground detection [23], and contract design [24]. Several other indicators have also been proposed in the literature, like the hypervolume indicator or R indicators, each one with its own advantages and drawbacks, but their consideration is beyond the scope of this work. Information concerning other indicators can be found, e.g., in References [25,26,27].

The material reviewed in this work is based on recently published works [17,18,28]. The remainder of the document is organized as follows: in Section 2, we will briefly state the required background for MOPs and power means. In Section 3, we review the p-averaged Hausdorff distance

Δ_{p}

. In Section 4, we will discuss its generalization, the

(p, q)

-averaged Hausdorff distance

Δ_{p, q}

, explaining individually the finite and continuous cases. In Section 5, we will consider some aspects of the metric properties of

Δ_{p}

and

Δ_{p, q}

. In Section 6, we study the Pareto compliance of the performance indicators related to

Δ_{p}

and

Δ_{p, q}

. In Section 7, we will present some examples and numerical experiments. Finally, in Section 8, we will draw our conclusions and will discuss possible paths for future research in this direction.

2. Preliminaries

In this review, we introduce tools from a metric perspective that deal with two related contexts: distances between finite subsets of a metric space and distances between general measurable subsets of a metric measure space. The second context actually contains the first, but we deal separately with both of them, starting with the simpler setting of finite subsets before passing to the more general situation of arbitrary measurable sets that also contains the important special case of continuous sets. To emphasize each context, we use the convention that general sets will be denoted by X, Y, and Z, but when they are finite, the labels A, B, and C will be used.

2.1. Multi-Objective Optimization

First, we briefly present some basic aspects of multi-objective optimization problems (MOPs) required for the understanding of this paper. For a more thorough discussion, we refer the interested reader, e.g., to References [12,29,30].

A continuous MOP is rigorously formalized as the minimization of an appropriate function:

min_{x \in Q} F (x),

where F denotes a vector-valued function with components

f_{i} : Q \to R

, for

i = 1, \dots, k

, called objective functions. Explicitly,

\begin{matrix} F : Q & ⟶ R^{k}, \\ x & ⟼ F (x) : = (f_{1} (x), \dots, f_{k} (x)) . \end{matrix}

The optimality of a candidate solution to a MOP depends on a dominance relation [31] given in terms of the partial order introduced below.

Definition 1.

For

x, y \in Q

, the partial Pareto ordering ⪯ associated with the MOP determined by F is defined as

x ⪯ y \quad \text{if and only if} \quad f_{i} (x) ⩽ f_{i} (y) \quad \text{for all } i = 1, \dots, k .

For

x, y, z \in Q

and

X, Y \subset Q

, the following notions of dominance (≺) and non-dominance (⊀) are standard in this context:

\begin{matrix} x \text{ is dominated by } y, \text{ written } y ≺ x, \text{ if } y ⪯ x \text{ and } F (x) \neq F (y) . \\ z \text{ is dominated by } X, \text{ written } X ⪯ z, \text{ if } x ⪯ z \text{ for some } x \in X, \text{ otherwise } X ⋠ z . \\ X \text{ is dominated by } Y, \text{ written } Y ⪯ X, \text{ if } \forall x \in X \ \exists y \in Y \text{ such that } y ⪯ x, \text{ otherwise } Y ⋠ X . \end{matrix}

In addition,

x \in Q

is called a Pareto-optimal point if it is nondominated, i.e.,

∄ y \in Q

with

y ≺ x

. Finally, the Pareto set

P \subset Q

consists of all Pareto-optimal points and the Pareto front is defined as its image

F (P) \subset R^{k}

.
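The dominance relation and the notion of nondominated points can be sketched in a few lines of Python. This is only an illustration for a finite list of objective vectors (minimization assumed); the helper names `dominates` and `nondominated` and the sample data are hypothetical and do not come from the reviewed works.

```python
import numpy as np

def dominates(fy, fx):
    """True if y dominates x: F(y) <= F(x) componentwise and F(y) != F(x)."""
    fy, fx = np.asarray(fy, float), np.asarray(fx, float)
    return bool(np.all(fy <= fx) and np.any(fy < fx))

def nondominated(F):
    """Rows of F (objective vectors) that are not dominated by any other row."""
    F = np.asarray(F, float)
    keep = [i for i, fx in enumerate(F)
            if not any(dominates(fy, fx) for fy in F)]
    return F[keep]

# Made-up objective vectors with two objectives.
F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
front = nondominated(F)   # (3, 3) is dominated by (2, 2) and is filtered out
```

The filter compares every pair of vectors, so it is meant for small illustrative sets rather than large archives.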

MOPs commonly possess the important characteristic that, when mild smoothness conditions are fulfilled, the solution (or Pareto) set P and its image the Pareto front

F (P) \subset R^{k}

consist of d dimensional subsets for

d = k - 1

(or even less) when the problem involves k objective functions ([32]).

As an example, let us describe a simple unconstrained MOP [33,34] given by

\begin{matrix} f_{1}, \dots, f_{k} & : R^{n} ⟶ R \\ f_{i} (x) & = \sum_{j = 1}^{n} {(x_{j} - a_{j}^{i})}^{2}, \end{matrix}

where

a^{i} = (a_{1}^{i}, \dots, a_{n}^{i}) \in R^{n}

and

i = 1, \dots, k

. The

a^{i}

’s correspond to the minimizers of each quadratic objective

f_{i}

, and the Pareto set of this problem consists of a

(k - 1)

simplex containing all the

a^{i}

’s as vertices, i.e.,

{simp}_{k - 1} : = simp (a^{1}, \dots, a^{k}) = \{\sum_{i = 1}^{k} μ_{i} a^{i} : μ_{1}, \dots, μ_{k} ⩾ 0 and \sum_{i = 1}^{k} μ_{i} = 1\} .

In the particular case when

n = 1

,

k = 2

,

a^{1} = 0

, and

a^{2} = 2

, the problem becomes

\begin{matrix} F & : R ⟶ R^{2} \\ F (x) & = (x^{2}, {(x - 2)}^{2}) . \end{matrix}

(1)

This is known as Schaffer's problem [35]. Figure 1 illustrates the objectives

f_{1}

and

f_{2}

, and the Pareto front

F (P)

for this MOP. In this case, the Pareto set corresponds to

P = [0, 2]

and the Pareto front is a continuous convex curve in

R^{2}

joining

(0, 4)

with

(4, 0)

.
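As a minimal sketch of this example, the following Python snippet evaluates the map of Equation (1) on a uniform sample of the Pareto set P = [0, 2]. The sample size and the variable names are arbitrary choices made here for illustration.

```python
import numpy as np

def F(x):
    """Objective map of Equation (1): F(x) = (x^2, (x - 2)^2)."""
    x = np.asarray(x, float)
    return np.stack([x**2, (x - 2.0)**2], axis=-1)

# Uniform sample of the Pareto set P = [0, 2] (11 points, arbitrary choice).
xs = np.linspace(0.0, 2.0, 11)
front = F(xs)                 # finite approximation of the Pareto front F(P)

# On the front, f2 = (sqrt(f1) - 2)^2, a convex curve joining (0, 4) and (4, 0).
f1, f2 = front[:, 0], front[:, 1]
```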

In many real-world applications, MOPs arise naturally. As one example, in almost all scheduling problems (e.g., References [36,37,38,39,40,41]), the total execution time (make-span) is of primary interest. However, the consideration of this objective is in many cases not enough since other quantities such as the tardiness or the energy consumption also play an important role and can consequently, according to the given problem, also add objectives to the resulting multi-objective problem.

For the numerical treatment of MOPs, there already exist many established approaches. For instance, mathematical programming techniques [29,42] are point-wise iterative methods that are capable of detecting single local solutions of a given MOP. By solving a suitably chosen sequence of the resulting scalar optimization problems, a suitable finite size approximation of the entire Pareto front can be computed in certain cases [43,44,45,46]. Multi-objective continuation methods take advantage of the fact that the Pareto set at least locally forms a manifold [47,48,49,50,51,52]. Starting with an initial (local) solution, further candidates are computed along the Pareto set of the given MOP. All of these methods typically yield high convergence rates but are, in turn, of local nature. A possible alternative is given by set oriented methods such as subdivision and cell mapping techniques [53,54] and evolutionary algorithms [55,56,57,58,59] that are of global nature and are capable of computing a finite size approximation of the Pareto front in one single run.

2.2. Finite Power Means

A comprehensive reference on the theory and properties of means is given in Reference [60], where proofs of the statements presented here for finite power means and for integral power means in the following subsection can be found (see also Reference [18] for integral means).

For a finite set

A \subset [0, \infty)

and a nonzero real p, the p-average or the p power mean of A is given by

\underset{a \in A}{M^{p}} (a) : = {(\frac{1}{| A |} \sum_{a \in A} a^{p})}^{\frac{1}{p}},

where

| A |

denotes the cardinality of A. The simpler notation

\underset{}{M^{p}} (A) : = \underset{a \in A}{M^{p}} (a)

will also be employed. Moreover, in order to simplify the forthcoming expressions, we introduce the abbreviation

\overline{\sum}_{a \in A} a : = \frac{1}{| A |} \sum_{a \in A} a

to denote the arithmetic mean of the elements of a finite set

A \subset R_{⩾ 0}

.

It is well known that limit cases of power means recover familiar quantities, for example,

lim_{p \to 0} \underset{}{M^{p}} (A) = {(\prod_{a \in A} a)}^{\frac{1}{| A |}},

is the standard geometric mean of the elements of A. The special case

p = - 1

corresponds to the harmonic mean,

harm (A) : = \underset{}{M^{- 1}} (A) .

Moreover, the p-average of any finite set can also be defined for any p in the extended real line

\bar{R} : = [- \infty, \infty]

by taking appropriate limits; in particular,

lim_{p \to \infty} \underset{}{M^{p}} (A) = max (A), and lim_{p \to - \infty} \underset{}{M^{p}} (A) = min (A) .

Proposition 1.

Let A and B be finite subsets of

[0, \infty)

and

p, q \in \bar{R}

be arbitrary constants. Then, the following properties hold for finite power means:

1.: $\underset{}{M^{p}} (A) ⩽ \underset{}{M^{p}} (B)$ .
2.: For $p ⩽ q$ : $\underset{}{M^{p}} (A) ⩽ \underset{}{M^{q}} (A)$ .
3.: For a matrix of nonnegative elements $D = (d_{a, b})$ with $a \in A$ and $b \in B$ :

$\underset{}{M^{p}} (D) : = \underset{a \in A}{M^{p}} (\underset{b \in B}{M^{p}} (d_{a, b})) = \underset{b \in B}{M^{p}} (\underset{a \in A}{M^{p}} (d_{a, b})) .$
4.: For $p ⩾ 1$ : $\underset{}{M^{p}} ({a + b ∣ a \in A, b \in B}) ⩽ \underset{}{M^{p}} (A) + \underset{}{M^{p}} (B) .$
5.: For the harmonic mean: $harm (A) ⩽ | A | min (A)$ .
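The finite power mean and its limit cases can be sketched as follows. The function name `power_mean` and the sample set are hypothetical, and the treatment of sets containing zero for p ⩽ 0 (returning the limit value 0) is an assumption made here to keep the sketch total.

```python
import math

def power_mean(values, p):
    """p-average of a finite collection of nonnegative numbers, p in [-inf, inf]."""
    a = [float(v) for v in values]
    if p == math.inf:
        return max(a)
    if p == -math.inf:
        return min(a)
    if min(a) == 0.0 and p <= 0:
        return 0.0                       # limit value when some element vanishes
    if p == 0:
        return math.exp(sum(math.log(v) for v in a) / len(a))   # geometric mean
    return (sum(v**p for v in a) / len(a)) ** (1.0 / p)

A = [1.0, 2.0, 4.0]
# min, harmonic, geometric, arithmetic, quadratic, max
means = [power_mean(A, p) for p in (-math.inf, -1, 0, 1, 2, math.inf)]
```

For this sample the list is nondecreasing, in line with property 2, and the harmonic mean stays below |A| min(A), in line with property 5.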

2.3. Integral Power Means in Measure Spaces

In order to present this part with sufficient generality, let us denote by

(S, μ)

a measure space. Let

M (S)

be the

σ

algebra of measurable subsets of

S

and

M_{< \infty} (S)

be the collection of those subsets with finite measure.

Now, we recall some fundamental properties of integral power means in this setting needed for the forthcoming sections. For

p \in R \ {0}

and a measurable function

f : X \subset S \to [0, \infty)

defined on a subset

X \in M_{< \infty} (S)

, the p power mean or p-average of f over X is given by

\underset{x \in X}{M^{p}} (f (x)) : = {(\frac{1}{μ (X)} \int_{X} f {(x)}^{p} d μ)}^{\frac{1}{p}} .

(2)

For convenience, the normalized integral on the rhs of Equation (2) will be denoted simply as

\fint_{X} f^{p} d μ : = \frac{1}{| X |} \int_{X} f {(x)}^{p} d μ,

where

| X | : = μ (X)

refers in this context to the measure of X and not to its cardinality as in the finite case. For brevity, when the measure

μ

employed is clear,

d μ

will be abbreviated by

d x

to highlight the variable being integrated. The shorthand

\underset{}{M^{p}} (f (X)) : = \underset{x \in X}{M^{p}} (f (x))

will also be employed.

For

p ⩾ 1

, the integral p mean corresponds to

\underset{}{M^{p}} (f (X)) = {| X |}^{- \frac{1}{p}} {∥ f ∥}_{p}

, where

{∥ \cdot ∥}_{p}

is the standard p norm of the Lebesgue space

L^{p} (X, μ)

. The cases

p = \pm \infty

can also be included by taking the limits

p \to \pm \infty

. In fact, since the essential supremum of the function f on X is

{∥ f ∥}_{\infty} = {ess sup}_{x \in X} f (x)

, and when f is not identically zero its essential infimum is precisely

{∥ 1 / f ∥}_{\infty}^{- 1} = {ess inf}_{x \in X} f (x)

; by calculating the limits, we obtain that

\begin{matrix} \underset{x \in X}{M^{\infty}} (f (x)) : = & lim_{p \to \infty} {(\fint_{X} f^{p} d μ)}^{\frac{1}{p}} = {∥ f ∥}_{\infty}, \end{matrix}

and similarly,

\begin{matrix} \underset{x \in X}{M^{- \infty}} (f (x)) : = & lim_{p \to - \infty} {(\fint_{X} f^{p} d μ)}^{\frac{1}{p}} = ∥ \frac{1}{f} ∥_{\infty}^{- 1} . \end{matrix}

Note that

{∥ \cdot ∥}_{\infty}

corresponds to the norm of the space

L^{\infty} (X, μ)

. For

p = 0

, it is possible to define

\underset{}{M^{p}}

as the integral generalization of the notion of geometric mean, and it is given explicitly by

\underset{x \in X}{M^{0}} (f (x)) : = exp (\fint_{X} log f d μ) .

Proposition 2.

For subsets

X, Y \in M_{< \infty} (R^{k})

, nonnegative measurable functions

f, g : X \to [0, \infty)

, and any product-measurable function

d : X \times Y \to [0, \infty)

, the integral power mean

\underset{}{M^{p}}

satisfies that

1.: For $p \in \bar{R}$ , $k \in [0, \infty)$ : $\underset{x \in X}{M^{p}} (k) = k$ and $\underset{x \in X}{M^{p}} (k f (x)) = k \underset{x \in X}{M^{p}} (f (x))$ .
2.: For $p \in \bar{R}$ : $\underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{p}} (d (x, y))) = \underset{y \in Y}{M^{p}} (\underset{x \in X}{M^{p}} (d (x, y)))$ .
3.: For $p \in [1, \infty]$ : $\underset{x \in X}{M^{p}} (f (x) + g (x)) ⩽ \underset{x \in X}{M^{p}} (f (x)) + \underset{x \in X}{M^{p}} (g (x))$ .
4.: For $p \in \bar{R}$ and $f ⩽ g$ : $\underset{x \in X}{M^{p}} (f (x)) ⩽ \underset{x \in X}{M^{p}} (g (x))$ .
5.: For $p, q \in [0, \infty]$ with $p ⩽ q$ : $\underset{x \in X}{M^{p}} (f (x)) ⩽ \underset{x \in X}{M^{q}} (f (x))$ .
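As a worked illustration of these definitions (an example chosen here, not taken from the reviewed works), consider f(x) = x on X = [0, 1] with the Lebesgue measure, so that |X| = 1. The integral power means can then be computed in closed form:

```latex
\begin{aligned}
\underset{x \in [0,1]}{M^{p}}(x)
  &= \left( \int_0^1 x^{p}\, dx \right)^{1/p}
   = (p+1)^{-1/p}, \qquad p > -1,\ p \neq 0, \\
\underset{x \in [0,1]}{M^{0}}(x)
  &= \exp\left( \int_0^1 \log x \, dx \right)
   = e^{-1}
   = \lim_{p \to 0} (p+1)^{-1/p}, \\
\underset{x \in [0,1]}{M^{\infty}}(x)
  &= \lim_{p \to \infty} (p+1)^{-1/p}
   = 1 = \| f \|_{\infty}.
\end{aligned}
```

For instance, p = 1 gives 1/2 and p = 2 gives 3^{-1/2} ≈ 0.577, so the values increase with p and lie between e^{-1} and 1, which agrees with property 5.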

3. The p-Averaged Hausdorff Distance

When trying to measure the distance between subsets of Euclidean space or even an arbitrary metric space, a natural choice is the well-known Hausdorff distance

d_{H}

that is extensively employed in many different contexts. However, it is of limited practical value for measuring the distance to the Pareto set/front of typical MOPs when the approximations are produced by stochastic search methods such as evolutionary algorithms. This is because such algorithms may produce a few outliers, which are heavily punished by

d_{H}

. As a partial remedy, the use of an averaged Hausdorff distance

Δ_{p}

was first proposed in Reference [16] to replace

d_{H}

.

Let

d : S \times S \to [0, \infty)

denote a distance function on a metric space

S

for which the standard properties of the identity of indiscernibles, nonnegativity, symmetry, and subadditivity (more commonly known as the triangle inequality) are satisfied.

Definition 2.

Given a point

x_{0} \in S

and subsets

X, Y \subset S

, we have

1.: A pointwise distance to sets: $d (x_{0}, X) : = inf {d (x_{0}, x) ∣ x \in X}$ .
2.: A pre-distance between sets: $d (Y, X) : = sup {d (y, X) ∣ y \in Y}$ .
3.: The Hausdorff distance between sets: $d_{H} (X, Y) : = max {d (X, Y), d (Y, X)}$ .

For simplicity, throughout the text, the metric d can be assumed to be the standard Euclidean distance

d (x, y) : = ∥ x - y ∥

induced on some

S \subset R^{k}

by the Euclidean 2 norm of

R^{k}

, but the theory carries over to any general metric space

(S, d)

.
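For finite point sets and the Euclidean metric, the three notions of Definition 2 can be sketched with NumPy as follows. The function names and the sample data are hypothetical; for finite sets the infimum and supremum reduce to a minimum and a maximum.

```python
import numpy as np

def dist_matrix(X, Y):
    """Euclidean distances d(x, y) for rows x of X and rows y of Y."""
    X, Y = np.atleast_2d(X).astype(float), np.atleast_2d(Y).astype(float)
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)

def pre_distance(Y, X):
    """d(Y, X): largest value over y in Y of d(y, X) = min over x in X of d(y, x)."""
    return dist_matrix(Y, X).min(axis=1).max()

def hausdorff(X, Y):
    """d_H(X, Y) = max{d(X, Y), d(Y, X)}."""
    return max(pre_distance(X, Y), pre_distance(Y, X))

X = np.array([[0.0, 0.0], [1.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 3.0]])   # X plus one outlier at (1, 3)
d_H = hausdorff(X, Y)   # determined entirely by the single outlier
```

In this made-up example, d(X, Y) vanishes because X is contained in Y, while d(Y, X) equals the distance of the outlier to X.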

Definition 3.

Let

p \in N

. For finite subsets

A, B \subset S

, their (modified) p generational distance is

{GD}_{p} (A, B) : = {(\frac{1}{| A |} \sum_{a \in A} d {(a, B)}^{p})}^{\frac{1}{p}},

and their (modified) p inverted generational distance is

{IGD}_{p} (A, B) : = {(\frac{1}{| B |} \sum_{b \in B} d {(b, A)}^{p})}^{\frac{1}{p}} .

From them, the p averaged Hausdorff distance is obtained by taking the maximum

Δ_{p} (A, B) : = max {{GD}_{p} (A, B), {IGD}_{p} (A, B)} .

The indicators

{GD}_{p}

and

{IGD}_{p}

in Definition 3 correspond to simple adjustments to the definitions of the generational distance [61] and the inverted generational distance [62].

The standard Hausdorff distance is recoverable from

Δ_{p}

by taking the limit

{lim}_{p \to \infty} Δ_{p} = d_{H}

, but for any finite value of p, the distance

Δ_{p}

is obtained from standard p power means of all the distances employed to calculate the supremum in part 2 of Definition 2, which is needed to define

d_{H}

.
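The indicators of Definition 3 and their behavior for growing p can be sketched as follows, assuming the Euclidean metric. The helper names and the data (a sampled segment B and an approximation A containing one outlier) are made up for illustration.

```python
import numpy as np

def _min_dists(A, B):
    """Vector of d(a, B) for each row a of A (Euclidean metric)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return D.min(axis=1)

def gd_p(A, B, p):
    return np.mean(_min_dists(A, B) ** p) ** (1.0 / p)

def igd_p(A, B, p):
    return np.mean(_min_dists(B, A) ** p) ** (1.0 / p)

def delta_p(A, B, p):
    return max(gd_p(A, B, p), igd_p(A, B, p))

# B samples a segment; A follows it at height 0.1 except for one outlier at (1, 3).
B = np.array([[t, 0.0] for t in np.linspace(0.0, 1.0, 10)])
A = np.vstack([B[:9] + [0.0, 0.1], [[1.0, 3.0]]])

d_H = max(_min_dists(A, B).max(), _min_dists(B, A).max())   # Hausdorff distance
values = {p: delta_p(A, B, p) for p in (1, 2, 10, 100)}
```

In this example the values of Δ_p grow with p and approach d_H from below, so small p weights the outlier far less than the Hausdorff distance does.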

The advantage of using

Δ_{p}

as an indicator is that it does not immediately disqualify a few outliers in a candidate set, contrary to what

d_{H}

does and that, among the possible configurations of (finite) candidate solutions to a MOP, it assigns lesser distances to the Pareto front to those solutions appearing evenly spread along its whole domain (see, e.g., Reference [63]). The behavior of

Δ_{p}

as a quality indicator is studied, e.g., in References [16,28], and it corresponds to the particular case

q \to - \infty

of the results for general

(p, q)

-indicators presented in Section 6.

Concerning its metric properties,

Δ_{p}

has the drawback of not being a proper metric in the usual sense because for any non-unit set

A \subset S

the distance

Δ_{p} (A, A) > 0

. This problem will be fixed in the following section with a simple modification. Nevertheless, independently from that, for a positive number p, the distance

Δ_{p}

does not satisfy the triangle inequality but only a weaker version of it. Indeed, as a consequence of Corollary 3, we have that

Δ_{p} (A, C) ⩽ N^{α} (Δ_{p} (A, B) + Δ_{p} (B, C)),

where

N = max {| A |, | B |, | C |} ⩾ 1

and

α = 1 / p

.

For further details concerning

Δ_{p}

, its properties, and its relation to other indicators, the reader can consult, e.g., References [16,63].

4. The (p,q)-Averaged Hausdorff Distance

To better evaluate the optimality of a certain candidate set to approximate the Pareto set/front of a MOP, several generalizations of the averaged Hausdorff distance

Δ_{p}

have been recently introduced.

4.1. (p,q)-Distances between Finite Sets

Definition 4.

For

p, q \in R \ {0}

, the generational

(p, q)

-distance

{GD}_{p, q} (A, B)

between two finite subsets

A, B \subset S

is given by

{GD}_{p, q} (A, B) : = {(\frac{1}{| A |} \sum_{a \in A} {(\frac{1}{| B |} \sum_{b \in B} d {(a, b)}^{q})}^{\frac{p}{q}})}^{\frac{1}{p}} .

The distance

{GD}_{p, q} (A, B)

can be extended for values of

p = 0

or

q = 0

, by taking the limits

p \to 0

or

q \to 0

, respectively. In such cases, properties of finite power means suggest the following definitions:

\begin{matrix} {GD}_{p, 0} (A, B) : = & {(\frac{1}{| A |} \sum_{a \in A} {(\prod_{b \in B} d (a, b))}^{\frac{p}{| B |}})}^{\frac{1}{p}}, \text{ when } p \neq 0, \\ {GD}_{0, q} (A, B) : = & {(\prod_{a \in A} {(\frac{1}{| B |} \sum_{b \in B} d {(a, b)}^{q})}^{\frac{1}{q}})}^{\frac{1}{| A |}}, \text{ when } q \neq 0, \text{ and} \\ {GD}_{0, 0} (A, B) : = & {(\prod_{a \in A} {(\prod_{b \in B} d (a, b))}^{\frac{1}{| B |}})}^{\frac{1}{| A |}}, \text{ if } p = q = 0 . \end{matrix}

We can also calculate

{GD}_{p, q}

when

p \to \pm \infty

or

q \to \pm \infty

by changing the corresponding sum with a minimum or a maximum according to the case. In particular, we have the nice relation

lim_{q \to - \infty} {GD}_{p, q} (A, B) = {GD}_{p} (A, B) .

(3)

Note that the definition of

{GD}_{p, q}

has two drawbacks, namely

{GD}_{p, q} (A, B)

does not necessarily vanish if

A = B

and in general

{GD}_{p, q} (A, B) \neq {GD}_{p, q} (B, A)

, hence it does not define a proper metric. In order to get one, a slight modification is needed.
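The generational (p, q)-distance, read as a p-average over A of q-averages over B, can be sketched as follows. The names `pmean` and `gd_pq` and the two small point sets are hypothetical, the Euclidean metric is assumed, and the limit cases p, q ∈ {0, ±∞} are implemented through the limits described above.

```python
import numpy as np

def pmean(v, p, axis=None):
    """Power mean of positive entries; p = -inf, 0, inf are treated as limit cases."""
    v = np.asarray(v, float)
    if p == np.inf:
        return v.max(axis=axis)
    if p == -np.inf:
        return v.min(axis=axis)
    if p == 0:
        return np.exp(np.log(v).mean(axis=axis))
    return np.mean(v ** p, axis=axis) ** (1.0 / p)

def gd_pq(A, B, p, q):
    """p-average over a in A of the q-average over b in B of d(a, b)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return pmean(pmean(D, q, axis=1), p)

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])

self_dist = gd_pq(A, A, 1, 1)                    # does not vanish although A = A
forward, backward = gd_pq(A, B, 1, 2), gd_pq(B, A, 1, 2)   # not symmetric
limit = gd_pq(A, B, 1, -np.inf)                  # equals GD_1(A, B), cf. Equation (3)
```

For these sets, a strongly negative q such as q = -200 already gives a value within one percent of the limit case.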

Definition 5.

Let

p, q \in R \ {0}

. For finite subsets

A, B \subset S

, their

(p, q)

-averaged Hausdorff distance is

Δ_{p, q} (A, B) : = max {{GD}_{p, q} (A, B \ A), {GD}_{p, q} (B, A \ B)} .

Notice that

{GD}_{p} (A, B) = {GD}_{p} (A, B \ A)

when

A \cap B = ⌀

, thus using Equation (3) and Definition 5, we easily obtain

lim_{q \to - \infty} Δ_{p, q} (A, B) = Δ_{p} (A, B) .

In this way, for finite and disjoint sets, the indicator

Δ_{p, q}

is a generalization of

Δ_{p}

. Similarly to the relation

{GD}_{p} (A, B) = {| A |}^{- \frac{1}{p}} {∥ D_{A B} ∥}_{p},

between the

{GD}_{p} (A, B)

and the matrix

ℓ_{p}

norm

∥ D_{A B} ∥_{p}

of the distance matrix

D_{A B} : = {[d (a, b)]}_{a, b}

for

a \in A

and

b \in B

, we also have the following relation between the

(p, q)

-generational distance

{GD}_{p, q} (A, B)

and the matrix

ℓ_{p, q}

norm

∥ D_{A B} ∥_{p, q}

, where the definition of the latter is precisely that of

{GD}_{p, q}

but replacing all the normalized sums (those carrying the factors $1 / | A |$ and $1 / | B |$) by standard ones ∑ (see, e.g., Reference [64]):

\begin{matrix} {GD}_{p, q} (A, B) & = \underset{a \in A}{M^{p}} (\underset{b \in B}{M^{q}} (d (a, b))) = {| A |}^{- \frac{1}{p}} {| B |}^{- \frac{1}{q}} {∥ D_{A B} ∥}_{p, q} . \end{matrix}

A useful property of the distance

Δ_{p, q}

is that its two parameters can be adjusted independently: an appropriate choice of q yields a desired spread of the archives, while an adequate choice of p controls how close they are located to the Pareto front of a MOP.
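A minimal sketch of the finite (p, q)-averaged Hausdorff distance and of the matrix-norm relation is given next. It assumes the Euclidean metric and finite nonzero p and q; the helper names are hypothetical, and the value 0 returned for an empty argument is a convention assumed here so that the distance of a set to itself vanishes.

```python
import numpy as np

def gd_pq(A, B, p, q):
    """Finite GD_{p,q}(A, B) for finite p, q != 0; returns 0 if either set is empty."""
    if len(A) == 0 or len(B) == 0:
        return 0.0
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.mean(np.mean(D ** q, axis=1) ** (p / q)) ** (1.0 / p)

def set_minus(A, B):
    """Rows of A that do not occur exactly among the rows of B."""
    rows_B = {tuple(b) for b in B}
    kept = [a for a in A if tuple(a) not in rows_B]
    return np.array(kept, dtype=float).reshape(-1, A.shape[1])

def delta_pq(A, B, p, q):
    """Maximum of GD_{p,q}(A, B minus A) and GD_{p,q}(B, A minus B)."""
    return max(gd_pq(A, set_minus(B, A), p, q), gd_pq(B, set_minus(A, B), p, q))

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
p, q = 2.0, 3.0

# Matrix-norm relation: GD_{p,q} = |A|^(-1/p) |B|^(-1/q) ||D_AB||_{p,q}.
D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
norm_pq = np.sum(np.sum(D ** q, axis=1) ** (p / q)) ** (1.0 / p)
scaled = len(A) ** (-1.0 / p) * len(B) ** (-1.0 / q) * norm_pq
```

Exact row comparison is used for the set differences, which is adequate for this toy data but would need a tolerance for floating-point archives.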

4.2. $(p, q)$ -Distances between Measurable Sets

With the aid of Proposition 2, the results of the previous section can be generalized to subsets of a metric space

(S, d)

endowed with an appropriate measure

μ

. For concreteness,

S

can be taken to be a subset of

R^{k}

carrying the metric induced from the Euclidean metric of

R^{k}

and endowed with an appropriate non-null measure

μ

. Notice that, in our intended applications,

μ

will not be the restriction of the standard Lebesgue measure of

R^{k}

to

S

for the simple reason that it can easily vanish as it happens on any hypersurface or lower dimensional subsets of

R^{k}

. In this case, a lower dimensional measure is needed and alternatives like the Hausdorff measure on

S

can be used, since it gives rise to the standard notion of d dimensional volume for d submanifolds of

R^{k}

. When these submanifolds are parametrized by functions from subsets of

R^{d}

, the same volume is obtained via the change-of-variables formula from the standard Lebesgue measure on those subsets of

R^{d}

.

A very important observation in this context is that any set-theoretic relation obtained from measure-related calculations needs to be understood to hold almost everywhere (a.e.). Therefore, for

X, Y \in M_{< \infty} (S)

, the statements

X = Y

or

X \subset Y

mean that the relations hold a.e., i.e.,

μ {X \neq Y} = 0

or

μ {X ⊈ Y} = 0

, respectively. In other words, in this setting, we will always identify

X \in M_{< \infty} (S)

with its equivalence class

[X] : = \{Y ∣ X = Y \text{ a.e.}\}

. This means that those classes will be regarded as the elements of

M_{< \infty} (S)

, removing the need to carry the abbreviation a.e. all the time. Henceforth, to simplify complicated formulae,

d (x, y)

will be shortened to

d_{x, y}

.

Definition 6.

Let

p, q \in R \ {0}

. For finite-measure subsets

X, Y \in M_{< \infty} (S)

, their generational

(p, q)

-distance is given by

{GD}_{p, q} (X, Y) : = \underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{q}} (d_{x, y})) = {(\fint_{X} {(\fint_{Y} d_{x, y}^{q} d y)}^{\frac{p}{q}} d x)}^{\frac{1}{p}} .

The cases

p < 0

or

q < 0

are well defined only if X and Y are disjoint subsets.

Similarly to the finite case,

{GD}_{p, q}

can be extended to values of

p, q \in \bar{R}

, but there are two drawbacks:

{GD}_{p, q} (X, X) = 0

only if X is a unit-set or singleton, and

{GD}_{p, q} (X, Y)

can differ from

{GD}_{p, q} (Y, X)

. To fix this undesirable behavior, we repeat the strategy used in the finite case as follows.

Definition 7.

Let

p, q \in R \ {0}

. For finite-measure subsets

X, Y \in M_{< \infty} (S)

, their

(p, q)

-averaged Hausdorff distance is given by

Δ_{p, q} (X, Y) : = max {{GD}_{p, q} (X, Y \ X), {GD}_{p, q} (Y, X \ Y)} .

Remark 1.

In general, the

(p, q)

-distances are maps:

M_{< \infty} (S) \times M_{< \infty} (S) \to [0, \infty)

. On the collection of finite subsets of

S

, the standard counting measure can be taken as the underlying one needed for these measure-theoretic notions of

{GD}_{p, q}

and

Δ_{p, q}

, and in this case, these distances become precisely the finite-case distances given in Definitions 4 and 5.

Remark 2.

For disjoint subsets X and Y, Definition 5 in the finite case and Definition 7 above in the measurable case reduce to the simpler form

Δ_{p, q} (X, Y) : = max {{GD}_{p, q} (X, Y), {GD}_{p, q} (Y, X)},

which is the one we will actually use in most situations. The more general definition for non-disjoint subsets is given with the purpose that the distance so-defined changes continuously as one set approaches the other until their distance vanishes. In other words, the general definition allows the distance to become a continuous function with respect to the metric topology that it determines. Nevertheless, for practical purposes dealing with applications and for most of the results presented below, the simpler definition between disjoint subsets suffices.
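For continuous sets, the normalized integrals can be approximated by averages over sample points. The sketch below does this for two parallel unit segments at distance h, carrying the arc-length measure, using the midpoints of a uniform partition. The sets, the sample size, and the function name are choices made here for illustration; for p = q = 2 a direct computation gives the closed form sqrt(h² + 1/6), which serves as a reference value.

```python
import numpy as np

def gd_pq_sampled(X, Y, p, q):
    """Finite GD_{p,q} on sample points; averages approximate the normalized integrals."""
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    inner = np.mean(D ** q, axis=1) ** (1.0 / q)
    return np.mean(inner ** p) ** (1.0 / p)

n, h = 200, 1.0                               # sample size and gap: arbitrary choices
t = (np.arange(n) + 0.5) / n                  # midpoints of a uniform partition of [0, 1]
X = np.column_stack([t, np.zeros(n)])         # segment [0, 1] x {0}
Y = np.column_stack([t, np.full(n, h)])       # segment [0, 1] x {h}

approx = gd_pq_sampled(X, Y, 2, 2)
exact = np.sqrt(h**2 + 1.0 / 6.0)             # closed form for p = q = 2
delta = max(approx, gd_pq_sampled(Y, X, 2, 2))   # simplified form for disjoint sets
```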

5. Metric Properties

To explain some of the terminology used in this section, we recall to the reader that the standard triangle inequality for a distance function

d : S \times S \to [0, \infty)

is usually weakened in two different but related ways by postulating the existence of a constant

C > 0

such that, for any points

x, y, z \in S

, one of the following conditions holds:

The C relaxed triangle inequality: $d (x, z) ⩽ C (d (x, y) + d (y, z))$ .
The C inframetric inequality: $d (x, z) ⩽ C max {d (x, y), d (y, z)}$ .

Since the second condition implies the first one by using the very same constant

C > 0

and, reciprocally, the C relaxed triangle inequality implies the

2 C

inframetric one, both conditions are equivalent for an appropriate choice of constants. A semimetric satisfying any one of these conditions will be simply called an inframetric.
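The equivalence rests on an elementary two-sided bound for nonnegative numbers, which can be written out as follows (the abbreviations a and b are introduced here only for this derivation):

```latex
\begin{aligned}
&\text{With } a := d(x, y) \ge 0 \text{ and } b := d(y, z) \ge 0: \qquad
  \max\{a, b\} \le a + b \le 2 \max\{a, b\}. \\
&\text{Inframetric} \Rightarrow \text{relaxed:} \quad
  d(x, z) \le C \max\{a, b\} \le C\,(a + b). \\
&\text{Relaxed} \Rightarrow \text{inframetric:} \quad
  d(x, z) \le C\,(a + b) \le 2C \max\{a, b\}.
\end{aligned}
```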

For arbitrary measurable sets in

S

, the following results summarize the metric properties of

{GD}_{p, q}

and

Δ_{p, q}

. Using the counting measure, these properties also apply to finite sets. For more details, see Reference [17] in the finite case and Reference [18] in the generalized measure-theoretic context.

Theorem 1.

For

p, q \in [1, \infty]

, the generational

(p, q)

-distance

{GD}_{p, q}

is subadditive in

M_{< \infty} (S)

, i.e., for any

X, Y, Z \in M_{< \infty} (S)

, the triangle inequality holds true:

{GD}_{p, q} (X, Z) ⩽ {GD}_{p, q} (X, Y) + {GD}_{p, q} (Y, Z) .

Proof.

The proof follows easily by simple steps using the properties in Proposition 2. We start from the standard triangle inequality for

d (\cdot, \cdot)

:

d_{x, z} ⩽ d_{x, y} + d_{y, z} (x \in X, y \in Y, z \in Z),

taking at both sides the q-average over Z and using 1–3 of Proposition 2 to arrive at

\begin{matrix} \underset{z \in Z}{M^{q}} (d_{x, z}) ⩽ \underset{z \in Z}{M^{q}} (d_{x, y} + d_{y, z}) ⩽ \underset{z \in Z}{M^{q}} (d_{x, y}) + \underset{z \in Z}{M^{q}} (d_{y, z}) = d_{x, y} + \underset{z \in Z}{M^{q}} (d_{y, z}) . \end{matrix}

(4)

Now, there are two independent cases for the parameters

p, q \in [1, \infty)

. We explain here only the case

p ⩽ q

, but the case

q < p

follows by similar arguments; see Thm. 2 in Reference [18]. Calculating the p-average over X at both sides of Equation (4) and using 1, 3, and 5 of Proposition 2, we get

\underset{x \in X}{M^{p}} (\underset{z \in Z}{M^{q}} (d_{x, z})) ⩽ \underset{x \in X}{M^{p}} (d_{x, y} + \underset{z \in Z}{M^{q}} (d_{y, z})) ⩽ \underset{x \in X}{M^{p}} (d_{x, y}) + \underset{z \in Z}{M^{q}} (d_{y, z}) .

(5)

Since the lhs of Equation (5) is

{GD}_{p, q} (X, Z)

, after a further p-average over Y at both sides of Equation (5) and parts 1, 3, and 5 of Proposition 2, we obtain

\begin{matrix} {GD}_{p, q} (X, Z) & ⩽ \underset{y \in Y}{M^{p}} (\underset{x \in X}{M^{p}} (d_{x, y}) + \underset{z \in Z}{M^{q}} (d_{y, z})) ⩽ \underset{y \in Y}{M^{p}} (\underset{x \in X}{M^{p}} (d_{x, y})) + {GD}_{p, q} (Y, Z) . \end{matrix}

But from 2, 4, and 5 of Proposition 2, the first term at the rhs above satisfies

\underset{y \in Y}{M^{p}} (\underset{x \in X}{M^{p}} (d_{x, y})) = \underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{p}} (d_{x, y})) ⩽ \underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{q}} (d_{x, y})) = {GD}_{p, q} (X, Y) . □
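The statement of Theorem 1 can be checked numerically on random finite sets, using the counting measure. The following sketch is only a sanity check under assumptions chosen here (Euclidean metric in the plane, set sizes between 1 and 5, p and q drawn from [1, 5], fixed random seed); it is not a substitute for the proof.

```python
import numpy as np

def gd_pq(A, B, p, q):
    """Finite GD_{p,q}(A, B) with the Euclidean metric and finite p, q >= 1."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.mean(np.mean(D ** q, axis=1) ** (p / q)) ** (1.0 / p)

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice
worst = -np.inf
for _ in range(200):
    X, Y, Z = (rng.random((rng.integers(1, 6), 2)) for _ in range(3))
    p, q = rng.uniform(1.0, 5.0, size=2)
    slack = gd_pq(X, Y, p, q) + gd_pq(Y, Z, p, q) - gd_pq(X, Z, p, q)
    worst = max(worst, -slack)          # largest observed violation (should stay <= 0)
```

A positive value of `worst` beyond rounding error would indicate a violation of the triangle inequality.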

Corollary 1.

If

p, q \in R \ {0}

, the

(p, q)

-averaged Hausdorff distance

Δ_{p, q}

is a semimetric on the space

M_{< \infty} (S)

of finite-measure subsets of

S

. Furthermore, if

p, q \in [1, \infty)

the distance

Δ_{p, q}

behaves as a proper metric when it is restricted to disjoint subsets of

M_{< \infty} (S)

.

Proof.

From Definition 7, we obtain the relations

Δ_{p, q} (\cdot, \cdot) ⩾ 0

and

Δ_{p, q} (X, Y) = Δ_{p, q} (Y, X)

for any

X, Y \in M_{< \infty} (S)

and all

p, q \in R \ {0}

. Moreover, from Definition 6, it follows that

{GD}_{p, q} (X, Y \ X) = 0

if and only if

X = ⌀

or

Y \subseteq X

(hence,

Y \ X = ⌀

). Therefore, for

X, Y \neq ⌀

,

Δ_{p, q} (X, Y) = 0 ⟺ X = Y,

i.e.,

Δ_{p, q}

is a semimetric on the collection of finite-measure subsets

M_{< \infty} (S)

. Finally, for disjoint X and Y, it is clear that

{GD}_{p, q} (X, Y \ X) = {GD}_{p, q} (X, Y)

; thus, by Theorem 1, the triangle inequality holds for both arguments inside the maximum that defines

Δ_{p, q}

when

p, q \in [1, \infty)

. This implies that the triangle inequality is also valid for

Δ_{p, q}

. □

Theorem 2.

Let

X, Y, Z \in M_{< \infty} (S)

be subsets admitting positive constants

r < R

such that

r ⩽ d_{u, v} ⩽ R

for any

u \in X \cup Y

and

v \in Y \cup Z

. Then, for all

p, q \in R \ {0}

,

| p |, | q | ⩾ 1

with at least one of them negative, a relaxed triangle inequality holds for

{GD}_{p, q}

, namely

{GD}_{p, q} (X, Z) ⩽ \frac{R^{2}}{r^{2}} ({GD}_{p, q} (X, Y) + {GD}_{p, q} (Y, Z)) .

Proof.

Step 1: Let

p \in R \ {0}

, and suppose that

q < 0

. We will prove that

{GD}_{p, | q |} (X, Y) ⩽ \frac{R}{r} {GD}_{p, q} (X, Y) .

(6)

For all

x \in X

and

y, z \in Y

, we have

\frac{r}{R} ⩽ \frac{d_{x, y}}{d_{x, z}} ⩽ \frac{R}{r}

. Thus,

\frac{R}{r} ⩾ {(- \int_{Y} - \int_{Y} {[\frac{d_{x, y}}{d_{x, z}}]}^{| q |} d y d z)}^{\frac{1}{| q |}} = {(- \int_{Y} d_{x, y}^{| q |} d y)}^{\frac{1}{| q |}} {(- \int_{Y} d_{x, z}^{- | q |} d z)}^{\frac{1}{| q |}} .

Since

q = - | q |

, this means that

\underset{y \in Y}{M^{| q |}} (d_{x, y}) ⩽ \frac{R}{r} \underset{y \in Y}{M^{q}} (d_{x, y})

. Taking the p-average

\underset{x \in X}{M^{p}}

at both sides and from 1 and 4 of Proposition 2, we find

\underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{| q |}} (d_{x, y})) ⩽ \frac{R}{r} \underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{q}} (d_{x, y}))

, which is exactly Equation (6).

Step 2: Now, for

q \in R \ {0}

and

p < 0

, we will prove that

{GD}_{| p |, q} (X, Y) ⩽ \frac{R}{r} {GD}_{p, q} (X, Y) .

(7)

By assumption, we have

\frac{r}{R} ⩽ \frac{d_{x, y}}{d_{u, y}} ⩽ \frac{R}{r}

for any

y \in Y

and all

x, u \in X

. Similarly as before and using 1 and 4 of Proposition 2, we conclude from the rhs part that

\underset{y \in Y}{M^{q}} (d_{x, y}) ⩽ \frac{R}{r} \underset{y \in Y}{M^{q}} (d_{u, y})

. However, since

p = - | p |

, after taking a p-average of the quotient of means, it follows from

\begin{matrix} {({- \int}_{X} {(\underset{y \in Y}{M^{q}} (d_{x, y}))}^{| p |} d x)}^{\frac{1}{| p |}} {({- \int}_{X} {(\underset{y \in Y}{M^{q}} (d_{u, y}))}^{p} d u)}^{\frac{1}{| p |}} = {({- \int}_{X} {- \int}_{X} {[\frac{\underset{y \in Y}{M^{q}} (d_{x, y})}{\underset{y \in Y}{M^{q}} (d_{u, y})}]}^{| p |} d x d u)}^{\frac{1}{| p |}} ⩽ \frac{R}{r}, \end{matrix}

that

\underset{x \in X}{M^{| p |}} (\underset{y \in Y}{M^{q}} (d_{x, y})) ⩽ \frac{R}{r} \underset{x \in X}{M^{p}} (\underset{y \in Y}{M^{q}} (d_{x, y}))

, which is now (7).

Step 3: The previous steps can be summarized in the expression

{GD}_{| p |, | q |} (X, Y) ⩽ \frac{R}{r} {GD}_{| p |, q} (X, Y) ⩽ \frac{R^{2}}{r^{2}} {GD}_{p, q} (X, Y) .

(8)

Using again 4 of Proposition 2 and Definition 6, we get

{GD}_{p, q} (X, Z) ⩽ {GD}_{| p |, | q |} (X, Z)

. From this, the subadditivity for

{GD}_{| p |, | q |}

(Theorem 1), and Equation (8), we conclude

\begin{matrix} {GD}_{p, q} (X, Z) & ⩽ {GD}_{| p |, | q |} (X, Y) + {GD}_{| p |, | q |} (Y, Z) ⩽ \frac{R^{2}}{r^{2}} ({GD}_{p, q} (X, Y) + {GD}_{p, q} (Y, Z)) . \end{matrix}

□
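The relaxed inequality of Theorem 2 can also be probed numerically for finite sets. The sketch below is an illustration under stated assumptions: the constants r and R are taken as the smallest and largest distances over the pairs in X × Y and Y × Z (the pairs actually used in Steps 1 and 2 of the proof), the three sets are well-separated random clusters, and all function names are hypothetical.

```python
import math
import random

def power_mean(values, s):
    """Power mean M^s of positive numbers for a finite exponent s != 0."""
    vals = list(values)
    return (sum(v ** s for v in vals) / len(vals)) ** (1.0 / s)

def gd_pq(A, B, p, q):
    """Finite GD_{p,q}(A, B) as a p-mean of q-means of Euclidean distances."""
    return power_mean((power_mean((math.dist(a, b) for b in B), q) for a in A), p)

def relaxed_triangle_gap(X, Y, Z, p, q):
    """Return (R/r)^2 * (GD(X,Y) + GD(Y,Z)) - GD(X,Z); non-negative if the bound holds."""
    # Assumption of this sketch: r and R bound the distances on X x Y and Y x Z.
    dists = [math.dist(u, v) for u in X for v in Y]
    dists += [math.dist(u, v) for u in Y for v in Z]
    r, R = min(dists), max(dists)
    C = (R / r) ** 2
    return C * (gd_pq(X, Y, p, q) + gd_pq(Y, Z, p, q)) - gd_pq(X, Z, p, q)

def min_gap(p, q, trials=100, seed=0):
    """Smallest gap over random triples of separated clusters."""
    rng = random.Random(seed)

    def cluster(x0):
        # Random points in the unit square shifted to start at abscissa x0.
        return [(x0 + rng.random(), rng.random()) for _ in range(rng.randint(1, 5))]

    return min(relaxed_triangle_gap(cluster(0.0), cluster(2.0), cluster(4.0), p, q)
               for _ in range(trials))

for p, q in [(1.0, -1.0), (-2.0, 3.0), (-1.0, -1.0)]:
    print(p, q, min_gap(p, q) >= 0.0)
```

Since the clusters are separated, all distances are strictly positive, so the power means with negative exponents are well defined.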

Remark 3.

For parameters

(p, q) \in R^{2}

that lie in the orange or blue sectors in Figure 2, the distance

{GD}_{p, q}

fulfills a C-relaxed triangle inequality with constant

C = R^{2} / r^{2}

only if the condition

r ⩽ d_{u, v} ⩽ R

holds for all

u \in X \cup Y

and

v \in Y \cup Z

. On bounded and topologically separated sets (i.e., not having common limit points), this condition always holds, and on them,

Δ_{p, q}

becomes an inframetric as explained below.

Corollary 2.

Under the same hypotheses of Theorem 2, the

(p, q)

-averaged Hausdorff distance

Δ_{p, q}

satisfies

Δ_{p, q} (X, Z) ⩽ \frac{R^{2}}{r^{2}} (Δ_{p, q} (X, Y) + Δ_{p, q} (Y, Z)) .

Proof.

It is immediate using Theorem 2 and Definition 7. □

When the involved sets are finite, a generally sharper inframetric relation holds. For emphasis, we employ in this context the notation

A, B, C

for those subsets of

S

.

Theorem 3.

If

p, q \in R

and

| p |, | q | ⩾ 1

, the

(p, q)

-distance

{GD}_{p, q}

satisfies the relaxed triangle inequality

{GD}_{p, q} (A, C) ⩽ N^{α} ({GD}_{p, q} (A, B) + {GD}_{p, q} (B, C)),

for all finite subsets

A, B, C \subset S

, where

N : = max {| A |, | B |, | C |} ⩾ 1

and

α : = {| p |}^{- 1} + {| q |}^{- 1}

.

Proof.

For arbitrary

p \neq 0

, let us assume that

q < 0

, so that

| q | = - q

. We can write

\begin{matrix} {GD}_{p, | q |} (A, B) = {(\sum_{a \in A} {({[\sum_{b \in B} d {(a, b)}^{- q}]}^{- 1})}^{\frac{p}{q}})}^{\frac{1}{p}} = {(\sum_{a \in A} {({| B |}^{- 2} \underset{b \in B}{harm} \{d {(a, b)}^{q}\})}^{\frac{p}{q}})}^{\frac{1}{p}}, \end{matrix}

which, when combined with property 5 of Proposition 1, yields

\begin{matrix} {GD}_{p, | q |} (A, B) ⩽ {(\sum_{a \in A} {({| B |}^{- 1} min_{b \in B} \{d {(a, b)}^{q}\})}^{\frac{p}{q}})}^{\frac{1}{p}} ⩽ {| B |}^{\frac{1}{| q |}} {(\sum_{a \in A} {(\sum_{b \in B} d {(a, b)}^{q})}^{\frac{p}{q}})}^{\frac{1}{p}} = {| B |}^{\frac{1}{| q |}} {GD}_{p, q} (A, B) . \end{matrix}

A similar relation is true for any

q \neq 0

if

p < 0

. In conclusion,

{GD}_{| p |, | q |} (A, B) ⩽ N^{α} {GD}_{p, q} (A, B)

, where

α : = \begin{cases} {| \min \{p, q\} |}^{- 1} & \text{if } p q < 0, \\ {| p |}^{- 1} + {| q |}^{- 1} & \text{if } p < 0, q < 0 . \end{cases}

If

N^{α}

does not need to be sharp,

α

can always be chosen to take the larger value

{| p |}^{- 1} + {| q |}^{- 1}

.

Now, for

| p |, | q | ⩾ 1

, the final result follows from the triangle inequality for

{GD}_{| p |, | q |}

:

\begin{matrix} {GD}_{p, q} (A, C) ⩽ {GD}_{| p |, | q |} (A, C) ⩽ N^{α} ({GD}_{p, q} (A, B) + {GD}_{p, q} (B, C)) . □ \end{matrix}

Corollary 3.

If

p, q \in R

and

| p |, | q | ⩾ 1

, the

(p, q)

-distance

Δ_{p, q}

satisfies the relaxed triangle inequality:

Δ_{p, q} (A, C) ⩽ N^{α} (Δ_{p, q} (A, B) + Δ_{p, q} (B, C)),

for all finite subsets

A, B, C \subset S

, with

N : = max {| A |, | B |, | C |} ⩾ 1

, and

α : = {| p |}^{- 1} + {| q |}^{- 1}

.

Proof.

The corollary follows immediately from Theorem 3 and Definition 5. □

To conclude this section, we return to the general setting of arbitrary measurable sets to explain the behavior of

Δ_{p, q}

when changing the value of its parameters p and q.

Theorem 4.

Let

X, Y \in M_{< \infty} (S)

and suppose that

p, p^{'}, q, q^{'} \in \bar{R}

satisfy

p ⩽ p^{'}

and

q ⩽ q^{'}

. Then,

Δ_{p, q} (X, Y) ⩽ Δ_{p^{'}, q} (X, Y) \quad and \quad Δ_{p, q} (X, Y) ⩽ Δ_{p, q^{'}} (X, Y) .

Proof.

It follows easily from part 5 of Proposition 2 and Definition 7. □
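The monotonicity stated in Theorem 4 is easy to observe on a small finite example. The sketch below assumes that, for disjoint finite sets, the averaged Hausdorff distance is the larger of the two directed distances (as the proof of Corollary 1 indicates); the function names and the sample points are illustrative only.

```python
import math

def power_mean(values, s):
    """Power mean M^s of positive numbers for a finite exponent s != 0."""
    vals = list(values)
    return (sum(v ** s for v in vals) / len(vals)) ** (1.0 / s)

def gd_pq(A, B, p, q):
    """Finite GD_{p,q}(A, B) as a p-mean of q-means of Euclidean distances."""
    return power_mean((power_mean((math.dist(a, b) for b in B), q) for a in A), p)

def delta_pq(X, Y, p, q):
    """Delta_{p,q} for disjoint finite sets: maximum of the two directed distances."""
    return max(gd_pq(X, Y, p, q), gd_pq(Y, X, p, q))

# Two disjoint planar point sets (illustrative data).
X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
Y = [(0.0, 2.0), (1.5, 3.0)]

exponents = [-3.0, -1.0, 1.0, 2.0, 5.0]
values_in_p = [delta_pq(X, Y, p, -1.0) for p in exponents]  # q fixed, p increasing
values_in_q = [delta_pq(X, Y, 1.0, q) for q in exponents]   # p fixed, q increasing

print(values_in_p)
print(values_in_q)
```

Both printed lists should be non-decreasing, which reflects the monotonicity of power means in their exponent.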

6. The (p,q)-Distances as Quality Indicators

Let Q be a decision space and

F : Q \to R^{k}

be a multi-objective function on it, whose associated MOP consists in the simultaneous minimization of its k component functions

f_{1}, \dots, f_{k}

. A candidate solution to this problem is Pareto-optimal if all elements of its image in

F (Q) \subset R^{k}

are nondominated in the sense of Pareto [31]; see Definition 1. For the forthcoming discussion, let us introduce the following abbreviated and useful notation. For

X, Y \subset Q

and any

z \in Q

, we define the following

\begin{matrix} X_{⪯ z} : = & {x \in X ∣ x ⪯ z}, & X_{⪯ Y} : = & {x \in X ∣ \exists y \in Y : x ⪯ y}, \\ X_{⋠ z} : = & {x \in X ∣ x ⋠ z}, & X_{⋠ Y} : = & {x \in X ∣ ∄ y \in Y : x ⪯ y} . \end{matrix}

From these definitions, it follows that, for arbitrary

z \in Q

and

X, Y \subset Q

, there are partitions:

X = X_{⪯ z} ⊔ X_{⋠ z}, and X = X_{⪯ Y} ⊔ X_{⋠ Y},

where ⊔ stands for the disjoint union of subsets. A similar notation with the subindices ≺, ≻, and ⪰ can also be used in an analogous way. Let us recall that an archive

X \subset Q

is, by definition, a subset of mutually non-dominated points; therefore, for any

x, x^{'} \in X

, the condition

x ⪯ x^{'}

implies

x = x^{'}

. This basic property implies that

F : Q \to R^{k}

is a bijection when restricted to any archive

X \subset Q

and, therefore, the points in

F (X) \subset F (Q)

can be univocally labeled by the elements of X. Moreover, for a finite archive

A \subset Q

, both sets have the same number of elements

| A | = | F (A) |

.

Now, we introduce a couple of strengthened notions of dominance between sets (archives) that are required for the validity of most of the results in this section.

Definition 8.

An archive X is well-dominated by an archive Y if

1.: X is dominated by Y, written $Y ⪯ X$ , i.e., $\forall x \in X$ , $\exists y \in Y$ s.t. $y ⪯ x$ , and
2.: Y consists only of dominating points of X, i.e., $\forall y \in Y$ , $\exists x \in X$ s.t. $y ⪯ x$ .

Moreover, X is said to be strictly well dominated by Y if

3.: $\exists y \in Y \ X$ , $\exists x \in X \ Y$ such that $y ≺ x$ .
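For finite archives, the three conditions of Definition 8 can be checked directly. The following Python sketch works with objective vectors (the images under F) and assumes the usual componentwise dominance relations for minimization; since Definition 1 is not reproduced here, that reading, the function names, and the sample data are assumptions of the sketch.

```python
def weakly_dominates(u, v):
    """u is no worse than v in every objective (minimization)."""
    return all(ui <= vi for ui, vi in zip(u, v))

def dominates(u, v):
    """u weakly dominates v and is strictly better in at least one objective."""
    return weakly_dominates(u, v) and any(ui < vi for ui, vi in zip(u, v))

def is_well_dominated(X, Y):
    """Conditions 1 and 2 of Definition 8: X is well-dominated by Y."""
    cond1 = all(any(weakly_dominates(y, x) for y in Y) for x in X)
    cond2 = all(any(weakly_dominates(y, x) for x in X) for y in Y)
    return cond1 and cond2

def is_strictly_well_dominated(X, Y):
    """Adds condition 3: some y in Y minus X dominates some x in X minus Y."""
    only_Y = [y for y in Y if y not in X]
    only_X = [x for x in X if x not in Y]
    cond3 = any(dominates(y, x) for y in only_Y for x in only_X)
    return is_well_dominated(X, Y) and cond3

# Objective vectors of two small archives (illustrative data).
FX = [(2.0, 3.0), (3.0, 2.0)]
FY = [(1.0, 3.0), (3.0, 1.0)]

print(is_well_dominated(FX, FY))           # FX is well-dominated by FY
print(is_strictly_well_dominated(FX, FY))  # and strictly so
print(is_well_dominated(FY, FX))           # the converse fails
```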

For an archive

X \subset Q

, the

{GD}_{p, q}

,

{IGD}_{p, q}

, and

Δ_{p, q}

quality (or performance) indicators assigned to it will be defined as the distance of its image

F (X)

to the Pareto front

F (P)

, i.e.,

\begin{matrix} I_{p, q}^{GD} (X) : = {GD}_{p, q} (F (X), F (P)), I_{p, q}^{IGD} (X) : = {IGD}_{p, q} (F (X), F (P)), \\ and I_{p, q}^{Δ} (X) : = Δ_{p, q} (F (X), F (P)) . \end{matrix}

In this section, we study the behavior of

I_{p, q}^{GD}

,

I_{p, q}^{IGD}

, and

I_{p, q}^{Δ}

as performance indicators. An example of a weakly Pareto-compliant performance indicator is the Degree of Approximation (DOA; see Reference [10]).

6.1. Pareto Compliance of $(p, q)$ -Indicators in the Finite Case

In order to obtain general conclusions on the features of the averaged Hausdorff distance

Δ_{p, q}

as a quality indicator, we consider first the behavior of

{GD}_{p, q}

. For additional details on the material presented in this section and other related results in the context of the p-averaged Hausdorff Distance

Δ_{p}

, the reader is referred to Reference [28].

For the following statements, we will abbreviate

δ_{q} (a, B) : = {(\sum_{b \in B} d {(a, b)}^{q})}^{\frac{1}{q}}

. Clearly, with this notation,

I_{p, q}^{GD} (A) = {(\sum_{a \in A} δ_{q} {(F (a), F (P))}^{p})}^{\frac{1}{p}}

, where in the averaged sum

\sum

, we are labeling the points in

F (A)

by the elements of the archive A, taking advantage of the fact that

| A | = | F (A) |

, as it also will be done with all the averages in this section.

Theorem 5.

Let

A, B \subset Q

be finite archives with A strictly well dominated by B. For all

a \in A

and

b \in B

,

1.: $b ≺ a$ implies that $δ_{q} (F (b), F (P)) < δ_{q} (F (a), F (P))$ ;
2.: $b ⪯ a$ implies that $\frac{| B_{⪯ a} |}{| B |} ⩽ \frac{| A_{⪰ b} |}{| A |}$ (or, equivalently, a strict equality);

then,

I_{p, q}^{GD} (B) < I_{p, q}^{GD} (A)

.

Proof.

By condition 1, for all

a \in A

and

b \in B_{⪯ a}

, the inequality

δ_{q} {(F (b), F (P))}^{p} ⩽ δ_{q} {(F (a), F (P))}^{p}

holds true. After averaging over all

b \in B_{⪯ a}

at both sides, we have

\sum_{b \in B_{⪯ a}} δ_{q} {(F (b), F (P))}^{p} ⩽ δ_{q} {(F (a), F (P))}^{p},

and averaging once again over all

a \in A

produces

\sum_{a \in A} (\sum_{b \in B_{⪯ a}} δ_{q} {(F (b), F (P))}^{p}) ⩽ \sum_{a \in A} δ_{q} {(F (a), F (P))}^{p} .

(9)

From property 2 and noticing that each

b \in B

appears

| A_{⪰ b} |

times in the initial sum, the lhs becomes

\sum_{a \in A} \sum_{b \in B_{⪯ a}} \frac{1}{| B_{⪯ a} |} δ_{q} {(F (b), F (P))}^{p} \geq \sum_{a \in A} \sum_{b \in B_{⪯ a}} \frac{| A |}{| B |} \frac{1}{| A_{⪰ b} |} δ_{q} {(F (b), F (P))}^{p} = \sum_{b \in B} δ_{q} {(F (b), F (P))}^{p} .

Returning to Equation (9), we conclude that

I_{p, q}^{GD} (B) ⩽ I_{p, q}^{GD} (A)

. Lastly, part 3 of Definition 8 for strictly well-dominated sets guarantees that this is a strict inequality, proving the assertion. □

Remark 4.

Since we are dealing with finite archives, condition 2 of Theorem 5 regarding the relative size of some of their parts is equivalent to the condition

| F (B_{⪯ a}) | / | F (B) | ⩽ | F (A_{⪰ b}) | / | F (A) |

regarding the relative size of their images. This is not necessarily the case in the context of measurable subsets, but see Remark 5.

Figure 3 shows examples where Theorem 5 holds with

q \to - \infty

. In this case, the q-averaged distance

δ_{q} (a, B)

becomes the standard distance

d (a, B)

between a point and a set.

For the inverted generational distance

{IGD}_{p, q}

in the finite case, we provide here two useful results without explicit proofs. The necessary steps are similar to the arguments used to prove the analogous statements for

{IGD}_{p}

in Prop. 3.8 of Reference [28] and Thm. 3.9 in Reference [28]. Those statements correspond here to the limit

q \to - \infty

, and the main difference in the proofs is that the Euclidean distance

d (a, B)

needs to be replaced everywhere by the q-average

δ_{q} (a, B)

, as it was done above for the proof of Theorem 5 that generalizes the proof of Thm. 3.4 in Reference [28]. The reader can also find there additional remarks on similar hypotheses to the ones needed for Theorem 6 below.

Proposition 3.

Let

A, B \subset Q

be finite and strictly well-dominated archives with

B ⪯ A

such that for all

a \in A

,

b \in B

, and

x \in P

:

b ≺ a implies δ_{q} (F (b), F (x)) < δ_{q} (F (a), F (x));

then,

I_{p, q}^{IGD} (B) < I_{p, q}^{IGD} (A)

.

Figure 4 illustrates two situations where the hypotheses of Proposition 3 are satisfied. Now, to state a more general result concerning the Pareto compliance of the

{IGD}_{p, q}

indicator, we will further abbreviate the minimal p-average of distances

δ_{q} (F (a), F (P_{⋠ B}))

by

δ_{A} : = min_{a \in A} \{{(\sum_{x \in P_{⋠ B}} δ_{q} {(F (x), F (a))}^{p})}^{\frac{1}{p}}\} = min_{a \in A} {IGD}_{p, q} (F (a), F (P_{⋠ B})) .

Theorem 6.

Let

A, B \subset Q

be finite and strictly well-dominated archives such that

B ⪯ A

. If at least one of the following conditions is satisfied,

1.: $\forall a \in A$ , $\forall b \in B$ : $b ≺ a implies {IGD}_{p, q} (F (b), F (P_{⋠ B})) < {IGD}_{p, q} (F (a), F (P_{⋠ B}))$ ;
2.: $\exists a_{0} \in A$ such that $\forall x \in P_{⋠ B}$ : $a_{0} \in {arg min}_{a \in A} δ_{q} (F (x), F (a))$ ;
3.: $\forall x \in P_{⋠ B}$ : $δ_{q} (F (A), F (x)) = δ_{A}$ ;

then

I_{p, q}^{IGD} (B) < I_{p, q}^{IGD} (A)

.

Finally, a general statement on the Pareto compliance of the finite case of the

(p, q)

-averaged Hausdorff distance

Δ_{p, q}

follows as a consequence of Theorems 5 and 6.

Theorem 7.

Let

A, B \subset Q

be finite and well-dominated archives such that

B ⪯ A

. If for all

a \in A

,

b \in B

:

b ⪯ a implies \frac{| B_{⪯ a} |}{| B |} = \frac{| A_{⪰ b} |}{| A |},

and at least one of the following conditions is satisfied:

1.: $\forall a \in A$ , $\forall b \in B$ , $\forall x \in P$ : $b ≺ a implies δ_{q} (F (b), F (x)) < δ_{q} (F (a), F (x))$ ;
2.: $\exists a_{0} \in A$ such that $\forall x \in P_{⋠ B}$ : $a_{0} \in {arg min}_{a \in A} δ_{q} (F (x), F (a))$ ;
3.: $\forall x \in P_{⋠ B}$ : $δ_{q} (F (A), F (x)) = δ_{A}$ ;

then,

I_{p, q}^{Δ} (B) < I_{p, q}^{Δ} (A)

.

Figure 5 illustrates four situations where Theorems 6 and 7 apply with very large q. In the first row, the left diagram is a modification of the second case in Figure 4 where condition 1 holds. In the right diagram, the diamond lying at the lower left corner of

F (A)

represents the image

F (a_{0})

of a point

a_{0}

satisfying condition 2. Finally, both diagrams in the second row exhibit cases where the points of

F (P)

are equidistant to corresponding points in

F (A)

, making condition 3 valid, with

δ_{A}

being this distance.

6.2. Pareto Compliance of $(p, q)$ -Indicators in the General Case

We consider now the behavior of the generalized

{GD}_{p, q}

distance with respect to the Pareto-compliance, concentrating on the most important aspects that describe its characteristics and using similar hypotheses to the ones needed in the previous section for

Δ_{p, q}

in the finite case.

Here, we will continue to assume that the decision space Q with objective function

F : Q \to R^{k}

defining the MOP under consideration has a Pareto set

P \subset Q

with corresponding Pareto front

F (P) \subset F (Q)

. Also, we assume that the objective space

F (Q) \subset R^{k}

carries a metric d that, for simplicity, can be taken to be the one inherited from the Euclidean distance

d (\cdot, \cdot)

in

R^{k}

. In addition, to define the

(p, q)

-indicators on MOPs that require general non-finite sets, we need a measure space

(S, μ)

, that here will be taken to be

S : = F (Q)

endowed with a non-null measure

μ

according to the comments at the beginning of Section 4.2. In this context X,

Y \subset Q

will denote arbitrary subsets such that

F (X), F (Y) \subset F (Q)

are measurable with non-null and finite measures.

Remark 5.

Recall that, here,

| F (X) | = μ (F (X))

denotes the measure of

F (X) \subset S

. In this context, Q will not be asked to carry a measure and the notation

| X |

will have no a priori meaning for

X \subset Q

. Nevertheless, it is possible to induce a measure on those subsets of Q where F is bijective by taking the pullback

μ^{*}

of μ to them, making the identity

| X | = μ^{*} (X) : = μ (F (X)) = | F (X) |

trivially true. This can be done for all archives but not for subsets where F is not bijective. When it is Q that carries a measure, a push-forward measure can be always defined on its image

F (Q)

, making this identity true for all sets. This was implicitly assumed in the presentation provided in Reference [18] (Section 3.4). For clarity, we avoid here this identification and state everything from the assumption that the measure μ is defined only on

S : = F (Q)

.

Before stating the complete result, let us recall that a partition of a set X is a collection of disjoint and non-empty subsets of X whose union is the whole of X and a partition of an archive

X \subset Q

induces a partition of

F (X) \subset F (Q)

by the bijectivity of F restricted to X. For convenience, we abbreviate the measure-theoretic q-averaged distance from a point

F (x) \in F (Q)

to a set

F (Z) \subset F (Q)

by

δ_{q} (F (x), F (Z)) : = {(- \int_{v \in F (Z)} d {(F (x), v)}^{q})}^{\frac{1}{q}}

.

Theorem 8.

For

p, q \in \bar{R}

, let

X, Y \subset Q

denote archives of which the images

F (X)

and

F (Y)

are of non-null finite measures in

F (Q)

. Moreover, assume that

1.

there exist finite partitions

X = ⨆_{i = 1}^{m} X_{i}

and

Y = ⨆_{i = 1}^{m} Y_{i}

such that

\forall i \in {1, \dots, m}

:

(a): $F (X_{i}) \subset F (X)$ and $F (Y_{i}) \subset F (Y)$ are subsets of non-null finite measure in $F (Q)$ ;
(b): $\forall x \in X_{i}$ , $\forall y \in Y_{i}$ : $x ⪯ y$ ;

2.

\forall x \in X

,

\forall y \in Y

:

x ⪯ y implies δ_{q} (F (x), F (P)) ⩽ δ_{q} (F (y), F (P))

;

then,

I_{p, q}^{GD} (X) ⩽ I_{p, q}^{GD} (Y)

.

Proof.

By

1 (a)

of Theorem 8, the sets X and Y can be subdivided into the same number m of subsets, and by

1 (b)

, combined with condition 2, if

x \in X_{i}

and

y \in Y_{i}

for any

i \in {1, \dots, m}

, then

δ_{q} (F (x), F (P)) ⩽ δ_{q} (F (y), F (P))

. Therefore, we can take successive integral p-averages over

F (X_{i})

and, afterwards, over

F (Y_{i})

at both sides of this inequality to find that, for each i, we have

a_{i}^{p} : = \frac{1}{| F (X_{i}) |} \int_{F (X_{i})} δ_{q} {(v, F (P))}^{p} d v ⩽ \frac{1}{| F (Y_{i}) |} \int_{F (Y_{i})} δ_{q} {(v, F (P))}^{p} d v = : b_{i}^{p} .

(10)

For those

i \in {1, \dots, m}

violating the inequality

\frac{| F (X_{i}) |}{| F (X) |} ⩽ \frac{| F (Y_{i}) |}{| F (Y) |}

, we subdivide

X_{i}

into a sufficiently large partition of

m_{i}

subsets

X_{i, 1}, X_{i, 2}, \dots, X_{i, m_{i}}

, with images by F of non-null finite measure, so as to guarantee that, for all

j \in {1, \dots, m_{i}}

, we get

w_{i, j} : = \frac{| F (X_{i, j}) |}{| F (X) |} ⩽ \frac{| F (Y_{i}) |}{| F (Y) |} = : {\tilde{w}}_{i} .

(11)

Notice that this is indeed possible because each

F (X_{i})

is of non-null finite measure. Since

\forall x \in X_{i, j}

,

\forall y \in Y_{i}

, we have

x ⪯ y

, an inequality similar to Equation (10) also holds for them, i.e.,

a_{i, j}^{p} : = \frac{1}{| F (X_{i, j}) |} \int_{F (X_{i, j})} δ_{q} {(v, F (P))}^{p} d v ⩽ \frac{1}{| F (Y_{i}) |} \int_{F (Y_{i})} δ_{q} {(v, F (P))}^{p} d v = : b_{i}^{p},

for all

i \in {1, \dots, m}

,

j \in {1, \dots, m_{i}}

. However,

| F (X) | = \sum_{i = 1}^{m} | F (X_{i}) |

, where

| F (X_{i}) | = \sum_{j = 1}^{m_{i}} | F (X_{i, j}) |

and

| F (Y) | = \sum_{i = 1}^{m} | F (Y_{i}) |

. Therefore, with the notation of Equation (11), a simple calculation shows that

\sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} w_{i, j} = \sum_{i = 1}^{m} {\tilde{w}}_{i} = 1

, implying that

w_{i, j}

and

{\tilde{w}}_{i}

are normalized weights useful for weighted averages. Since

0 ⩽ a_{i, j} ⩽ b_{i}

and

0 ⩽ w_{i, j} ⩽ {\tilde{w}}_{i} ⩽ 1

, simple properties of discrete weighted power mean imply the inequality

\sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} w_{i, j} a_{i, j}^{p} ⩽ \sum_{i = 1}^{m} {\tilde{w}}_{i} b_{i}^{p}

. Thus, we can finally write

\begin{matrix} I_{p, q}^{GD} {(X)}^{p} & = \frac{1}{| F (X) |} \sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} \int_{F (X_{i, j})} δ_{q} {(v, F (P))}^{p} d v = \sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} \frac{| F (X_{i, j}) |}{| F (X) |} a_{i, j}^{p} = \sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} w_{i, j} a_{i, j}^{p} \\ & ⩽ \sum_{i = 1}^{m} {\tilde{w}}_{i} b_{i}^{p} = \sum_{i = 1}^{m} \frac{| F (Y_{i}) |}{| F (Y) |} b_{i}^{p} = \frac{1}{| F (Y) |} \sum_{i = 1}^{m} \int_{F (Y_{i})} δ_{q} {(v, F (P))}^{p} d v = I_{p, q}^{GD} {(Y)}^{p} . \end{matrix}

□

Remark 6.

Condition 1 of Theorem 8 implies the following simpler (and somewhat weaker) dominance conditions:

( $a^{'}$ ): $X ⪯ Y$ (i.e., $\forall y \in Y$ , $\exists x \in X$ such that $x ⪯ y$ ), and
( $b^{'}$ ): $\forall x \in X$ , $\exists y \in Y$ such that $x ⪯ y$ .

For simple situations where (

a^{'}

) and (

b^{'}

) are valid, the partitions needed for part 1 of Theorem 8 are not difficult to find; however, this is not always possible as the right side of Figure 6 indicates. Indeed, Figure 6 presents some examples where (

a^{'}

) and (

b^{'}

) hold true, but

I_{p, q}^{GD} (X) ⩽ I_{p, q}^{GD} (Y)

can be both true (left side) and false (right side). Furthermore, it is possible to show that X and Y comply (left side) and do not comply (right side) with condition 1 of Theorem 8, respectively.

Remark 7.

An important advantage of using

{GD}_{p, q}

over

{GD}_{p}

is that condition 2 of Theorem 8 provides the possibility of choosing an appropriate

q \in \bar{R}

for which the condition

δ_{q} (F (x), F (P)) ⩽ δ_{q} (F (y), F (P))

holds when

x ⪯ y

, ensuring in this way the compliance to Pareto optimality for

{GD}_{p, q}

. This freedom is lacking for

{GD}_{p}

because, in the limit

q \to - \infty

, the distance

δ_{q} (F (x), F (P))

becomes the standard distance

d (F (x), F (P))

, which does not allow for any choice.

7. Examples and Numerical Experiments

In this section, we present some numerical experiments involving finite sets first, and afterwards, we study the case of continuous sets.

7.1. Working with $Δ_{p, q}$ over Finite Sets

Let us take a hypothetical Pareto front P given by the line segment from

(0, 1)

to

(1, 0)

in

R^{2}

, i.e., the set of all points

(t, 1 - t) \in R^{2}, for 0 ⩽ t ⩽ 1 .

This is the same example considered in Reference [16] p. 506 and enables us to make a comparison with values of

Δ_{p}

. In order to use the finite version of

Δ_{p, q}

, we discretize P by taking 11 uniformly distributed points; we call this set

P^{'}

. We assume two archives:

X_{1}

is obtained from

P^{'}

by changing

(0, 1)

for

(0, 10)

, including an outlier, and by adding

1 / 10

to the remaining ordinates.

X_{2}

is obtained from

P^{'}

by adding 5 to each ordinate. See Figure 7.

As explained in Section 3, we know that

Δ_{\infty} (A, B) : = lim_{p \to \infty} Δ_{p} (A, B)

coincides with the standard Hausdorff distance

d_{H}

. In this case, we obtained

\begin{matrix} Δ_{1} (P^{'}, X_{1}) & = 0.9091, & Δ_{1} (P^{'}, X_{2}) & = 4.5412, \\ d_{H} (P^{'}, X_{1}) & = 9, & d_{H} (P^{'}, X_{2}) & = 5; \end{matrix}

and according to Theorem 4 and Reference [16] p. 512, these values must increase as p increases.
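The four values quoted above can be reproduced with a few lines of Python. The sketch below builds the three finite sets as described in the text and evaluates the p-averaged Hausdorff distance with the nearest-point distance (the limit q → −∞) together with the standard Hausdorff distance; the function and variable names are illustrative choices, and the comparison for p = 5 is an additional illustration that is not taken from the tables.

```python
import math

def power_mean(values, s):
    """Power mean M^s of non-negative numbers for a positive exponent s."""
    vals = list(values)
    return (sum(v ** s for v in vals) / len(vals)) ** (1.0 / s)

def dist_to_set(a, B):
    """Standard distance from the point a to the finite set B."""
    return min(math.dist(a, b) for b in B)

def gd_p(A, B, p):
    """GD_p(A, B): p-mean over a in A of the distance from a to B."""
    return power_mean((dist_to_set(a, B) for a in A), p)

def delta_p(A, B, p):
    """Delta_p(A, B): maximum of GD_p(A, B) and GD_p(B, A)."""
    return max(gd_p(A, B, p), gd_p(B, A, p))

def hausdorff(A, B):
    """Standard Hausdorff distance between finite sets."""
    return max(max(dist_to_set(a, B) for a in A),
               max(dist_to_set(b, A) for b in B))

ts = [i / 10 for i in range(11)]
P_disc = [(t, 1 - t) for t in ts]                        # 11 points on the segment
X1 = [(0.0, 10.0)] + [(t, 1 - t + 0.1) for t in ts[1:]]  # outlier plus small shift
X2 = [(t, 1 - t + 5) for t in ts]                        # uniform shift by 5

print(round(delta_p(P_disc, X1, 1), 4), round(delta_p(P_disc, X2, 1), 4))
print(hausdorff(P_disc, X1), hausdorff(P_disc, X2))
print(delta_p(P_disc, X1, 5) > delta_p(P_disc, X2, 5))
```

The first two lines should agree with the reported values 0.9091, 4.5412, 9, and 5; the last line shows that already for p = 5 the single outlier makes the first archive appear farther from the discretized front than the second one.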

Table 1 and Table 2 show that we can find values of p and q such that the

(p, q)

-averaged distance does not punish heavily the outliers, for example,

p = q = 1

or

p = 1

and

q = - 1

. We remark that the values of

Δ_{p, q} (P^{'}, X_{1})

do not change significantly under variations of

q ⩽ 1

for a fixed p. Thus, it is possible to work with

q = 1

, in which case

Δ_{p, q}

is a metric according to Corollary 1, and to still obtain values close to the ones given by the inframetric

Δ_{p}

, with the same

p ⩾ 1

.

For large values of p, the behavior of

Δ_{p, q}

presents the same disadvantages as

Δ_{p}

or the standard Hausdorff distance. For example, in Table 1 and Table 2, it can be observed that all distances for

p ⩾ 5

are useless because they imply that the distance from the discrete Pareto front

P^{'}

to the archive

X_{1}

is larger than its distance to the archive

X_{2}

. Figure 7 suggests that this is an undesirable outcome.

Table 3 shows that

Δ_{p, q}

is close to a metric when

q ⩽ - 1

and

p ⩾ 1

. The percentage of triangle inequality violations decreases as p increases or q decreases.

7.2. Optimal Archives for Discretized Spherical Pareto Fronts

We now consider two standard Pareto fronts: the spherical convex and spherical concave quarter-circles; see Figure 8 and Figure 9.

\begin{matrix} P_{1} & = \{(cos (θ) + 1, sin (θ) + 1) : - π ⩽ θ ⩽ - \frac{π}{2}\}, \\ P_{2} & = \{(cos (θ), sin (θ)) : 0 ⩽ θ ⩽ \frac{π}{2}\} . \end{matrix}

(12)

To numerically find the optimal

Δ_{p, q}

archive of size M, we discretized the Pareto front with 1000 equidistant points (which is an acceptable discretization according to Reference [63] p. 603) and randomly chose an initial M-sized archive. Then, we used a random-walk (or step climber) evolutionary algorithm, moving one point at a time. Finally, we refined the optimal archive with the “evenly spaced” construction suggested by Reference [63] p. 607.

When finding optimal

Δ_{p, q}

archives, our numerical experiments suggest a clear geometrical influence of the parameters p and q. For values of p in

(- \infty, - 1)

, the optimal archive sets are basically the same. When

q \in [- 1, 1]

increases, the optimal archive tends to lose dispersion, converging to one point. When

q ⩾ 1

, the optimal archive collapses to one point, and when

q \in (- \infty, - 1]

, the corresponding optimal archives are basically the same (see Figure 10). When

p ⩾ - 1

increases, the optimal archive moves away from the Pareto set (see Figure 11).
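The search procedure described at the beginning of this section can be sketched as a simple accept-if-better random walk. The text does not spell out the details of the algorithm, so the Gaussian proposal, the step size `sigma`, the number of steps, the coarse discretization with 100 points (instead of 1000, to keep the run short), and all function names below are assumptions; the final "evenly spaced" refinement is omitted.

```python
import math
import random

def power_mean(values, s):
    """Power mean M^s; returns the limiting value 0 if s < 0 and some term vanishes."""
    vals = list(values)
    if s < 0 and min(vals) == 0.0:
        return 0.0
    return (sum(v ** s for v in vals) / len(vals)) ** (1.0 / s)

def gd_pq(A, B, p, q):
    """Finite GD_{p,q}(A, B) as a p-mean of q-means of Euclidean distances."""
    return power_mean((power_mean((math.dist(a, b) for b in B), q) for a in A), p)

def delta_pq(A, B, p, q):
    """Delta_{p,q} for disjoint finite sets (maximum of both directed distances)."""
    return max(gd_pq(A, B, p, q), gd_pq(B, A, p, q))

def random_walk_archive(front, M, p, q, steps=150, sigma=0.05, seed=1):
    """Move one archive point at a time; keep the move only if Delta_{p,q} decreases."""
    rng = random.Random(seed)
    archive = [(rng.random(), rng.random()) for _ in range(M)]
    best = delta_pq(archive, front, p, q)
    history = [best]
    for _ in range(steps):
        i = rng.randrange(M)
        x, y = archive[i]
        candidate = list(archive)
        candidate[i] = (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
        value = delta_pq(candidate, front, p, q)
        if value < best:
            archive, best = candidate, value
        history.append(best)
    return archive, history

# Concave quarter-circle P_2 of Equation (12), discretized uniformly in the angle.
n = 100
front = [(math.cos(k * math.pi / (2 * (n - 1))), math.sin(k * math.pi / (2 * (n - 1))))
         for k in range(n)]
archive, history = random_walk_archive(front, M=5, p=1.0, q=-1.0)
print(history[0], history[-1])
```

By construction, the recorded values never increase; how close the final archive comes to an optimal one depends on the step size and on the number of steps.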

The following Figure 10 and Figure 11 show certain “optimal” archives A for the Pareto front

P_{1}

in Equation (12), where the optimality means that the distance

Δ_{p, q} (X, P_{1})

is minimum when

X = A

. Because of the choice of the parameters p and q, this solution is clearly different from the one shown in Figure 8.

7.3. Optimal Archives for Disconnected and Discretized Pareto Sets

In this section, we present the optimal

Δ_{p, q}

archives for a disconnected step Pareto front:

P_{3}^{(s, γ)} = \{(t, 1 - γ t + (γ - 1) \frac{⌊ s t ⌋}{s}) : 0 ⩽ t ⩽ 1\},

(13)

where s is the number of steps,

γ > 0

is a small constant responsible for the step’s twist, and

⌊ \cdot ⌋

stands for the integer part function.
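For experiments, the step front of Equation (13) can be sampled directly from its parametrization. The following sketch uses equally spaced parameter values t, which is a simplification (the points are then not exactly equidistant along the front); the function names and the chosen resolution are illustrative.

```python
import math

def step_front_point(t, s, gamma):
    """Point (t, y) of the step front in Equation (13), for 0 <= t <= 1."""
    y = 1.0 - gamma * t + (gamma - 1.0) * math.floor(s * t) / s
    return (t, y)

def discretize_step_front(s, gamma, n):
    """Sample the front at n equally spaced parameter values t in [0, 1]."""
    return [step_front_point(k / (n - 1), s, gamma) for k in range(n)]

front = discretize_step_front(s=5, gamma=0.1, n=1001)
print(front[0], front[-1])
```

With s = 5 and γ = 1/10, the front starts at (0, 1), descends slowly with slope −γ inside each step, and drops by (1 − γ)/s at every multiple of 1/s.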

Figure 12 shows numerical optimal

Δ_{1, - 1}

archives of size 20. The archive coordinates reveal that

A \cap P_{3}^{(5, \frac{1}{10})} = ⌀,

i.e., the optimal archive points do not lie over the Pareto front but they are so close to it that this is hardly noticeable. It is also evident that the archives are evenly distributed along the Pareto front.

7.4. General Example for Continuous Sets

In this first example, we are going to construct simple and illustrative continuous sets A and B. Let A be the straight segment in

R^{2}

from

a = (- 1, 0)

to

b = (1, 0)

, that is

A = \bar{a b} .

(14)

For a small positive

ε > 0

and a variable

δ > 0

, let

B_{δ} \subset R^{2}

be the set given by the following union of straight segments

B_{δ} = \bar{c d_{δ}} \cup \bar{e_{δ} f_{δ}} \cup \bar{g_{δ} h},

(15)

where

c = (- 1, ε)

,

d_{δ} = (- δ, ε)

,

e_{δ} = (- δ, 1)

,

f_{δ} = (δ, 1)

,

g_{δ} = (δ, ε)

, and

h = (1, ε)

. We can regard the set

B_{δ}

as a continuous approximation of A, where the central segment

\bar{e_{δ} f_{δ}}

can be seen as the outlier.

In the following Figure 13, we can see the sets A and

B_{δ}

for

ε = 0.10

and

δ = 0.10, 0.20

. According to Table 4, as

δ

decreases, the

Δ_{p, q}

distance between the approximation

B_{δ}

and the set A also decreases.

We remark that the classical Hausdorff distance between A and

B_{δ}

produces the value 1 for any

δ > 0

. Thus, by working with the

(p, q)

-distance instead of

d_{H}

, we can detect “better” approximations.
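This comparison can be approximated numerically by replacing the averaged integrals with averages over dense, uniformly spaced samples of the segments. The sketch below does this for ε = 0.10 and δ = 0.10, 0.20 with p = 1 and q = −1; the sampling density, the choice of p and q, and the function names are assumptions, and the printed numbers are approximations that are not meant to reproduce Table 4.

```python
import math

def power_mean(values, s):
    """Power mean M^s of positive numbers for a finite exponent s != 0."""
    vals = list(values)
    return (sum(v ** s for v in vals) / len(vals)) ** (1.0 / s)

def gd_pq(A, B, p, q):
    """Sample-based GD_{p,q}(A, B) as a p-mean of q-means of Euclidean distances."""
    return power_mean((power_mean((math.dist(a, b) for b in B), q) for a in A), p)

def delta_pq(A, B, p, q):
    """Delta_{p,q} for disjoint sets (maximum of both directed distances)."""
    return max(gd_pq(A, B, p, q), gd_pq(B, A, p, q))

def sample_segment(u, v, density):
    """Midpoint samples on the segment uv, about `density` points per unit length."""
    n = max(1, round(density * math.dist(u, v)))
    return [(u[0] + (v[0] - u[0]) * (k + 0.5) / n,
             u[1] + (v[1] - u[1]) * (k + 0.5) / n) for k in range(n)]

def sample_A(density):
    """Samples of the segment A from (-1, 0) to (1, 0), Equation (14)."""
    return sample_segment((-1.0, 0.0), (1.0, 0.0), density)

def sample_B(delta, eps, density):
    """Samples of the three segments forming B_delta, Equation (15)."""
    pieces = [((-1.0, eps), (-delta, eps)),
              ((-delta, 1.0), (delta, 1.0)),
              ((delta, eps), (1.0, eps))]
    return [pt for u, v in pieces for pt in sample_segment(u, v, density)]

def hausdorff(A, B):
    """Standard Hausdorff distance between the two sample sets."""
    d_ab = max(min(math.dist(a, b) for b in B) for a in A)
    d_ba = max(min(math.dist(a, b) for a in A) for b in B)
    return max(d_ab, d_ba)

A = sample_A(100)
results = {}
for delta in (0.1, 0.2):
    B = sample_B(delta, eps=0.1, density=100)
    results[delta] = (delta_pq(A, B, 1.0, -1.0), hausdorff(A, B))
    print(delta, results[delta])
```

In this approximation, the averaged distance is smaller for δ = 0.10 than for δ = 0.20, while the Hausdorff distance stays at (approximately) 1 in both cases.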

7.5. Approximating Pareto Set and Front of a MOP

Finally, we address the problem of approximating the Pareto sets and fronts of given MOPs. As an example, we will consider the bi-objective Lamé super-sphere problem [32], which is defined as follows:

min_{x} F : R^{n} ⟶ R^{2},

(16)

where

F (x) = (f_{1} (x), f_{2} (x))

is defined as

f_{1} (x) = {(\frac{1}{n} \sum_{i = 1}^{n} x_{i}^{2})}^{\frac{γ}{2}} and f_{2} (x) = {(\frac{1}{n} \sum_{i = 1}^{n} {(x_{i} - 1)}^{2})}^{\frac{γ}{2}}

where

x \in R^{n}

and

γ \in R

. For

n = 2

, the Pareto sets and fronts of this problem are shown in Figure 14 and Figure 15 for

γ = 2

and

γ = 1 / 2

, respectively.

In a first step, we discuss the principal difference between discrete and continuous archives when approximating the Pareto set/front on a hypothetical example. For this, we assume that we are given the 5-element archive

A = {x_{1}, \dots, x_{5}} \subset R^{2}

; those elements are given by

x_{1} = (- 0.02, 0.03), x_{2} = (0.29, 0.20), x_{3} = (0.41, 0.49), x_{4} = (0.70, 0.62), x_{5} = (1.02, 0.98) .

Hence, we can see A as a 5-element approximation of the Pareto set and its image

F (A)

as a 5-element approximation of the Pareto front. Now, instead of A, we may use a polygonal curve that is defined by A:

B : = \bar{x_{1} x_{2}} \cup \dots \cup \bar{x_{4} x_{5}} .

In the following, we will call A a discrete archive while we call the polygon approximation B the continuous archive. Figure 16 and Figure 17 show the approximations A and B as well as their images

F (A)

and

F (B)

. Apparently, the approximation qualities are much better for the linear interpolates. This impression gets confirmed by the values of

Δ_{p, q}

for this problem that are shown in Table 5. We can observe the following two behaviors: (i) the distances are much better for the continuous archives and the differences are even larger in objective space, and (ii) the distances decrease with decreasing q (which is in accordance with the result of Theorem 4).
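To make the construction of the two kinds of archives concrete, the following sketch implements the Lamé super-sphere objectives and builds a sampled version of the polygonal (continuous) archive B from the 5-element archive A given above. Taking the same number of samples on every segment is a simplification of this sketch (the samples are then not uniform in arc length), and the function names are illustrative.

```python
import math

def lame_supersphere(x, gamma):
    """Objectives (f1, f2) of the bi-objective Lame super-sphere problem."""
    n = len(x)
    f1 = (sum(xi ** 2 for xi in x) / n) ** (gamma / 2.0)
    f2 = (sum((xi - 1.0) ** 2 for xi in x) / n) ** (gamma / 2.0)
    return (f1, f2)

def polyline_samples(vertices, per_segment):
    """Evenly spaced points on each segment of the polygonal curve through `vertices`."""
    pts = []
    for u, v in zip(vertices, vertices[1:]):
        for k in range(per_segment):
            s = k / per_segment
            pts.append(tuple(ui + s * (vi - ui) for ui, vi in zip(u, v)))
    pts.append(tuple(vertices[-1]))  # close the curve with the last vertex
    return pts

# Discrete archive A from the text and a sampled version of the polygon B.
A = [(-0.02, 0.03), (0.29, 0.20), (0.41, 0.49), (0.70, 0.62), (1.02, 0.98)]
B = polyline_samples(A, per_segment=25)

F_A = [lame_supersphere(x, gamma=2.0) for x in A]  # image of the discrete archive
F_B = [lame_supersphere(x, gamma=2.0) for x in B]  # image of the sampled polygon
print(len(F_A), len(F_B))
```

The images `F_A` and `F_B` can then be compared with a discretization of the Pareto front by means of any sample-based implementation of the averaged Hausdorff distance.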

In the next step, we consider discrete archives generated by multi-objective evolutionary algorithms (MOEAs) together with their resulting continuous archives. As MOEAs, we have chosen the widely used methods NSGA-II [65] and MOEA/D [66]. We stress, however, that any other MOEA could be chosen and that the conclusions we draw from our results apply in principle to any other such algorithm. Table 6 shows the parameter settings we have used for our studies.

Figure 18, Figure 19, and Table 7 show the results of NSGA-II, where we have used 500 generations and a population size of 12. Figure 20, Figure 21, and Table 8 show the respective results for MOEA/D, where we have also used 500 generations and a population size of 12. We can see that, for both algorithms, the $Δ_{p, q}$ values are significantly better for the continuous archives. We can also make another observation: the $Δ_{p, q}$ values oscillate for the results of the dominance-based algorithm NSGA-II, which is indeed typical. For the continuous archives, these oscillations are less pronounced, which indicates that the use of continuous archives may have a smoothing effect on the approximations, which is highly desirable.

We want to investigate the last statement further. To this end, we consider the following convex bi-objective problem: the objectives are given by $f_{1}, f_{2} : \mathbb{R}^{3} \to \mathbb{R}$, where

\begin{aligned} f_{1} (x) &= (x_{1} + 1)^{2} + x_{2}^{2} + x_{3}^{2}, \\ f_{2} (x) &= (x_{1} - 1)^{2} + x_{2}^{2} + x_{3}^{2}. \end{aligned} \qquad (17)

Figure 22 shows the Pareto set and front of MOP (Equation (17)). The Pareto set is given by the straight segment joining $(-1, 0, 0)$, the minimizer of $f_{1}$, and $(1, 0, 0)$, the minimizer of $f_{2}$.
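As a cross-check of the Pareto set of MOP (Equation (17)), one can use the standard first-order condition for unconstrained convex bi-objective problems, which is assumed here without proof: a point is a candidate if some convex combination of the two gradients vanishes. Since both objectives are strictly convex, every such point is Pareto optimal. The weight $\alpha$ is a symbol introduced only for this derivation.

```latex
\begin{aligned}
\nabla f_{1}(x) &= 2\,(x_{1} + 1,\; x_{2},\; x_{3}), \qquad
\nabla f_{2}(x) = 2\,(x_{1} - 1,\; x_{2},\; x_{3}), \\
\alpha \nabla f_{1}(x) + (1 - \alpha) \nabla f_{2}(x)
  &= 2\,(x_{1} + 2\alpha - 1,\; x_{2},\; x_{3}) = 0
  \;\Longleftrightarrow\; x = (1 - 2\alpha,\, 0,\, 0), \quad \alpha \in [0, 1], \\
F(x) &= \bigl(4 (1 - \alpha)^{2},\; 4 \alpha^{2}\bigr)
  \;\Longrightarrow\; \sqrt{f_{1}} + \sqrt{f_{2}} = 2 .
\end{aligned}
```

The weights $\alpha = 1$ and $\alpha = 0$ give the individual minimizers $(-1, 0, 0)$ and $(1, 0, 0)$, and the last line describes the Pareto front as a convex curve joining $(0, 4)$ and $(4, 0)$ in objective space.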

The values of $Δ_{p, q}$ obtained by NSGA-II for the discrete archives (using population size 20) as well as for the respective continuous archives can be seen in Figure 23 and Table 9. For this example as well, the values for the continuous archives are much better and the oscillations are significantly reduced compared to the discrete archives.
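How a continuous archive could be derived from a population can be sketched as follows for the objectives of Equation (17). The construction shown (keep the non-dominated individuals, sort them by $f_{1}$, and join consecutive points by segments) is an assumption about one natural way to build the polygonal curve for a bi-objective problem; the function names and the small example population are hypothetical.

```python
import numpy as np

def objectives(X):
    """Objectives of MOP (17): returns an (N, 2) array of (f1, f2) values."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    tail = np.sum(X[:, 1:] ** 2, axis=1)
    f1 = (X[:, 0] + 1.0) ** 2 + tail
    f2 = (X[:, 0] - 1.0) ** 2 + tail
    return np.column_stack([f1, f2])

def nondominated_mask(F):
    """True for rows of F that no other row dominates (minimization)."""
    F = np.asarray(F, dtype=float)
    leq = np.all(F[:, None, :] <= F[None, :, :], axis=2)  # row i <= row j componentwise
    lt = np.any(F[:, None, :] < F[None, :, :], axis=2)    # row i < row j in some component
    dominates = leq & lt                                   # dominates[i, j]: i dominates j
    return ~np.any(dominates, axis=0)

def polyline_vertices(X):
    """Non-dominated points ordered by f1, to be joined by consecutive segments."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    F = objectives(X)
    keep = nondominated_mask(F)
    order = np.argsort(F[keep, 0])
    return X[keep][order], F[keep][order]

# Hypothetical population: three points on the x1-axis and one dominated point.
population = np.array([[0.0, 0.0, 0.0],
                       [0.5, 0.0, 0.0],
                       [-0.5, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
vertices, images = polyline_vertices(population)
print(vertices)  # vertices of the polygonal curve in decision space
print(images)    # their images in objective space
```

Sorting by the first objective is only meaningful for two objectives; for more objectives, a different interpolation scheme (for example a triangulation) would be needed.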

Figure 24, Figure 25, and Figure 26 show the results of both kinds of archives after 300, 400, and 500 generations, which confirm this observation. The results show that NSGA-II is indeed capable of computing points near the Pareto front, while the distribution of the points varies. This is a known fact since there exists no “limit archive” for this algorithm (as it is, e.g., not based on the averaged Hausdorff distance or any other performance indicator). When considering the respective results of the continuous archives, however, NSGA-II computes (at least visually) nearly perfect approximations of the Pareto front. The $Δ_{p, q}$ values reflect this.

8. Conclusions and Perspectives

In this paper, we have presented a comprehensive overview of the averaged Hausdorff distances that have recently appeared in connection with the study of MOPs.

Among the averaged Hausdorff distances studied here, the generalized $Δ_{p, q}$ as defined for arbitrary measurable sets was shown to provide a general and robust definition for applications that carries good metric properties, is adequate for use with continuous approximations of the Pareto set of a MOP, and even reduces to the previously introduced definition for discrete approximations.

Concerning the additional parameter q in the definition of $Δ_{p, q}$, which could give the impression of an overly complicated expression, it is important to highlight, as observed in Remark 7, that it makes it possible to choose a suitable value of q so that ${GD}_{p, q}$ is as Pareto compliant as possible for the MOP under consideration. This is an argument in favor of the flexibility provided by the generalized version ${GD}_{p, q}$, which is not available for the ${GD}_{p}$ distance, and this particular aspect deserves further investigation.

Nevertheless, since the freedom provided by the two parameters p and q may appear excessive and perhaps undesirable in many applications, it remains to find a practical recipe to determine and fix these parameters according to the characteristics of the problem under consideration. Certainly, the desired spreads of the optimal archives, the distance of an approximation to the Pareto front, and the convexity of these fronts need to be taken into account in order to determine an appropriate set of preferred values for these parameters depending on the situation.

To achieve these aims, more theoretical as well as numerical studies of optimal solutions associated with Pareto fronts with different convexities must be carried out and experiments evaluating how the Pareto compliance can be enhanced in each situation by the choice of parameters need to be performed.

Finally, we stress that the results presented in Section 7 demonstrate the advantage of a new performance indicator that is able to measure the performance of a continuous approximation of the solution set. Continuous approximations, e.g., of the Pareto set/front of multi-objective optimization problems, have not been considered so far, though both Pareto set and front typically form continuous sets in case the objectives are continuous. The examples have indicated that the consideration of continuous archives (via interpolation on the populations generated by the evolutionary algorithms) could allow a reduction in population sizes and, hence, a significant reduction of the computational effort of the evolutionary algorithms. This is because the time complexity of all existing multi-objective evolutionary algorithms is quadratic in the population size in each generation of the algorithm. To verify this statement, more computations are needed, which is left for future work.

Author Contributions

J.M.B. and A.V. obtained the theoretical results concerning the $(p, q)$-averaged Hausdorff distance, and O.S. conceived and designed the experiments; J.M.B. and A.V. performed the experiments and provided the related figures and tables; O.S. analyzed the data and contributed to the text. J.M.B. and A.V. wrote the paper.

Funding

The first two authors were partially supported by Vicerrectoría de Investigación, Pontificia Universidad Javeriana, Bogotá D.C., Colombia. The third author was supported by Conacyt Basic Science project No. 285599 and SEP Cinvestav project no. 231.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Heinonen, J. Lectures on Analysis on Metric Spaces; Springer: New York, NY, USA, 2001.
2. de Carvalho, F.; de Souza, R.; Chavent, M.; Lechevallier, Y. Adaptive Hausdorff distances and dynamic clustering of symbolic interval data. Pattern Recogn. Lett. 2006, 27, 167–179.
3. Huttenlocher, D.P.; Klanderman, G.A.; Rucklidge, W.A. Comparing images using the Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15, 850–863.
4. Yi, X.; Camps, O.I. Line-based recognition using a multidimensional Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 901–916.
5. Falconer, K. Fractal Geometry: Mathematical Foundations and Applications, 2nd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003.
6. Aulbach, B.; Rasmussen, M.; Siegmund, S. Approximation of attractors of nonautonomous dynamical systems. Discrete Contin. Dyn. Syst. Ser. B 2005, 5, 215–238.
7. Dellnitz, M.; Hohmann, A. A subdivision algorithm for the computation of unstable manifolds and global attractors. Numerische Mathematik 1997, 75, 293–317.
8. Emmerich, M.; Deutz, A.H. Test problems based on Lamé superspheres. In Proceedings of the 4th International Conference on Evolutionary Multi-criterion Optimization EMO’07, Matsushima, Japan, 5–8 March 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 922–936.
9. Dellnitz, M.; Schütze, O.; Hestermeyer, T. Covering Pareto sets by multilevel subdivision techniques. J. Optim. Theory Appl. 2005, 124, 113–155.
10. Dilettoso, E.; Rizzo, S.A.; Salerno, N. A weakly Pareto compliant quality indicator. Math. Comput. Appl. 2017, 22, 25.
11. Padberg, K. Numerical Analysis of Transport in Dynamical Systems. Ph.D. Thesis, University of Paderborn, Paderborn, Germany, 2005.
12. Peitz, S.; Dellnitz, M. A survey of recent trends in multiobjective optimal control – Surrogate models, feedback control and objective reduction. Math. Comput. Appl. 2018, 23, 30.
13. Schütze, O. Set Oriented Methods for Global Optimization. Ph.D. Thesis, University of Paderborn, Paderborn, Germany, 2004.
14. Schütze, O.; Coello Coello, C.A.; Mostaghim, S.; Talbi, E.G.; Dellnitz, M. Hybridizing evolutionary strategies with continuation methods for solving multi-objective problems. Eng. Optim. 2008, 40, 383–402.
15. Schütze, O.; Laumanns, M.; Coello Coello, C.A.; Dellnitz, M.; Talbi, E.G. Convergence of stochastic search algorithms to finite size Pareto set approximations. J. Glob. Optim. 2008, 41, 559–577.
16. Schütze, O.; Esquivel, X.; Lara, A.; Coello Coello, C.A. Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Trans. Evol. Comput. 2012, 16, 504–522.
17. Vargas, A.; Bogoya, J.M. A generalization of the averaged Hausdorff distance. Computación y Sistemas 2018, 22, 331–345.
18. Bogoya, J.M.; Vargas, A.; Cuate, O.; Schütze, O. A (p,q)-averaged Hausdorff distance for arbitrary measurable sets. Math. Comput. Appl. 2018, 23, 51.
19. Cai, X.; Li, Y.; Fan, Z.; Zhang, Q. An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization. IEEE Trans. Evolut. Comput. 2015, 19, 508–523.
20. Shang, R.; Wang, Y.; Wang, J.; Jiao, L.; Wang, S.; Qi, L. A multi-population cooperative coevolutionary algorithm for multi-objective capacitated arc routing problem. Inf. Sci. 2014, 277, 609–642.
21. Zhang, J.; Tang, Q.; Li, P.; Deng, D.; Chen, Y. A modified MOEA/D approach to the solution of multi-objective optimal power flow problem. Appl. Soft Comput. 2016, 47, 494–514.
22. Dhiman, G.; Kumar, V. Multi-objective spotted hyena optimizer: A Multi-objective optimization algorithm for engineering problems. Knowl. Based Syst. 2018, 150, 175–197.
23. López-Rubio, F.J.; López-Rubio, E. Features for stochastic approximation based foreground detection. Comput. Vision Image Underst. 2015, 133, 30–50.
24. Kerkhove, L.P.; Vanhoucke, M. Incentive contract design for projects: The owner’s perspective. Omega 2016, 62, 93–114.
25. Hansen, M.P.; Jaszkiewicz, A. Evaluating the Quality of Approximations to the Non-Dominated Set; IMM, Department of Mathematical Modelling, Technical University of Denmark: Kongens Lyngby, Denmark, 1998.
26. Zitzler, E.; Thiele, L. Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 1999, 3, 257–271.
27. Siwel, J.; Yew-Soon, O.; Jie, Z.; Liang, F. Consistencies and contradictions of performance metrics in multiobjective optimization. IEEE Trans. Evol. Comput. 2014, 44, 2329–2404.
28. Vargas, A. On the Pareto compliance of the averaged Hausdorff distance as a performance indicator. Universitas Scientiarum 2018, 23, 333–354.
29. Miettinen, K. Nonlinear Multiobjective Optimization; Kluwer Academic Publishers: Tranbjerg, Denmark, 1999.
30. Ehrgott, M.; Wiecek, M.M. Multiobjective programming. In Multiple Criteria Decision Analysis: State of the Art Surveys; Springer: New York, NY, USA, 2005; pp. 667–722.
31. Pareto, V. Manual of Political Economy; The Macmillan Press: London, UK, 1971.
32. Hillermeier, C. Nonlinear Multiobjective Optimization: A Generalized Homotopy Approach; Springer Science & Business Media: Berlin, Germany, 2001; Volume 135.
33. Köppen, M.; Yoshida, K. Many-objective particle swarm optimization by gradual leader selection. In Proceedings of the 8th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA 2007), Warsaw, Poland, 11–14 April 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 323–331.
34. Schütze, O.; Lara, A.; Coello Coello, C.A. On the influence of the number of objectives on the hardness of a multiobjective optimization problem. IEEE Trans. Evolut. Comput. 2011, 15, 444–455.
35. Schaffer, J.D. Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. Ph.D. Thesis, Vanderbilt University, Nashville, TN, USA, 1984.
36. Amini, A.; Tavakkoli-Moghaddam, R. A bi-objective truck scheduling problem in a cross-docking center with probability of breakdown for trucks. Comput. Ind. Eng. 2016, 96, 180–191.
37. Li, M.W.; Hong, W.C.; Geng, J.; Wang, J. Berth and quay crane coordinated scheduling using multi-objective chaos cloud particle swarm optimization algorithm. Neural Comput. Appl. 2017, 28, 3163–3182.
38. Dulebenets, M.A. A comprehensive multi-objective optimization model for the vessel scheduling problem in liner shipping. Int. J. Prod. Econ. 2018, 196, 293–318.
39. Goodarzi, A.H.; Nahavandi, N.; Hessameddin, S. A multi-objective imperialist competitive algorithm for vehicle routing problem in cross-docking networks with time windows. J. Ind. Syst. Eng. 2018, 11, 1–23.
40. Venturini, G.; Iris, C.; Kontovas, C.A.; Larsen, A. The multi-port berth allocation problem with speed optimization and emission considerations. Transp. Res. Part D Transp. Environ. 2017, 54, 142–159.
41. Chargui, T.; Bekrar, A.; Reghioui, M.; Trentesaux, D. Multi-objective sustainable truck scheduling in a rail-road physical internet cross-docking hub considering energy consumption. Sustainability 2019, 11, 3127.
42. Fliege, J.; Graña, L.M.; Svaiter, B.F. Newton’s method for multiobjective optimization. SIAM J. Opt. 2009, 20, 602–626.
43. Das, I.; Dennis, J. Normal-boundary intersection: A new method for generating the Pareto surface in nonlinear multicriteria optimization problems. SIAM J. Opt. 1998, 8, 631–657.
44. Eichfelder, G. Adaptive Scalarization Methods in Multiobjective Optimization; Springer: Berlin/Heidelberg, Germany, 2008.
45. Fliege, J. Gap-free computation of Pareto-points by quadratic scalarizations. Math. Methods Operat. Res. 2004, 59, 69–89.
46. Pereyra, V. Fast computation of equispaced Pareto manifolds and Pareto fronts for multiobjective optimization problems. Math. Comput. Simul. 2009, 79, 1935–1947.
47. Wang, H. Zigzag search for continuous multiobjective optimization. INFORMS J. Comp. 2013, 25, 654–665.
48. Martin, B.; Goldsztejn, A.; Granvilliers, L.; Jermann, C. Certified parallelotope continuation for one-manifolds. SIAM J. Numer. Anal. 2013, 51, 3373–3401.
49. Pereyra, V.; Saunders, M.; Castillo, J. Equispaced Pareto front construction for constrained bi-objective optimization. Math. Comput. Model 2013, 57, 2122–2131.
50. Martin, B.; Goldsztejn, A.; Granvilliers, L.; Jermann, C. On continuation methods for non-linear bi-objective optimization: Towards a certified interval-based approach. J. Glob. Optim. 2014, 64, 1–14.
51. Schütze, O.; Martín, A.; Lara, A.; Alvarado, S.; Salinas, E.; Coello Coello, C.A. The directed search method for multiobjective memetic algorithms. J. Comput. Optim. Appl. 2016, 63, 305–332.
52. Martín, A.; Schütze, O. Pareto Tracer: A predictor-corrector method for multi-objective optimization problems. Eng. Optim. 2018, 50, 516–536.
53. Jahn, J. Multiobjective search algorithm with subdivision technique. Comput. Optim. Appl. 2006, 35, 161–175.
54. Sun, J.Q.; Xiong, F.R.; Schütze, O.; Hernández, C. Cell Mapping Methods-Algorithmic Approaches and Applications; Springer: Singapore, 2019.
55. Deb, K. Multi-Objective Optimization Using Evolutionary Algorithms; John Wiley & Sons: Chichester, UK, 2001.
56. Coello Coello, C.A.; Lamont, G.B.; Van Veldhuizen, D.A. Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd ed.; Springer: New York, NY, USA, 2007.
57. Sun, Y.; Gao, Y.; Shi, X. Chaotic multi-objective particle swarm optimization algorithm incorporating clone immunity. Mathematics 2019, 7, 146.
58. Wang, P.; Xue, F.; Li, H.; Cui, Z.; Xie, L.; Chen, J. A multi-objective DV-hop localization algorithm based on NSGA-II in internet of things. Mathematics 2019, 7, 184.
59. Pei, Y.; Yu, J.; Takagi, H. Search acceleration of evolutionary multi-objective optimization using an estimated convergence point. Mathematics 2019, 7, 129.
60. Bullen, P.S. Handbook of Means and Their Inequalities; Vol. 560, Mathematics and its Applications; Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 2003; p. xxviii+537.
61. Van Veldhuizen, D.A.; Lamont, G.B. Multiobjective evolutionary algorithm test suites. In Proceedings of the 1999 ACM Symposium on Applied Computing, San Antonio, TX, USA, 28 February–2 March 1999; ACM: New York, NY, USA, 1999; pp. 351–357.
62. Coello Coello, C.A.; Cruz Cortés, N. Solving multiobjective optimization problems using an artificial immune system. Genet. Program. Evol. Mach. 2005, 6, 163–190.
63. Rudolph, G.; Schütze, O.; Grimme, C.; Domínguez-Medina, C.; Trautmann, H. Optimal averaged Hausdorff archives for bi-objective problems: Theoretical and numerical results. Comput. Optim. Appl. 2016, 64, 589–618.
64. Goldberg, M. Equivalence constants for ℓ_p norms of matrices. Linear Multilinear Algebra 1987, 21, 173–179.
65. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197.
66. Zhang, Q.; Li, H. MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 2007, 11, 712–731.

Figure 1. Left: the objectives $f_{1} (x) = x^{2}$ and $f_{2} (x) = (x - 2)^{2}$ from a multi-objective optimization problem (MOP; Equation (1)). Right: the corresponding Pareto set over the interval $[0, 2]$.

Figure 2. According to Corollary 1, when acting on disjoint subsets, $Δ_{p, q}$ behaves as a proper metric if $(p, q)$ lies in the blue sector, and according to Corollary 2, it behaves like an inframetric if $(p, q)$ lies in the orange sectors.

Figure 3. Different scenarios where the ${GD}_{p}$ value of archive B is better (smaller) than the ${GD}_{p}$ value of archive A independently of the Pareto set and where the additional assumptions made in Theorem 5 are easily verifiable.

Figure 4. Two situations where ${IGD}_{p, q} (B)$ is better (smaller) than ${IGD}_{p, q} (A)$ for sufficiently negative q. Here, the hypotheses of Proposition 3 hold true.

Figure 5. Four examples where ${IGD}_{p} (B)$ is smaller (better) than ${IGD}_{p, q} (A)$ for sufficiently negative q. In each case, at least one of the requirements of Theorem 6 is satisfied.

Figure 6. (Left) A situation where the Pareto front $F (P)$ and the images $F (X)$ and $F (Y)$ of continuous archives satisfy $I_{p, q}^{GD} (X) ⩽ I_{p, q}^{GD} (Y)$ and condition 1(a) of Theorem 8 holds true. (Right) A modification of the previous situation where conditions $(a')$ and $(b')$ of Remark 6 are satisfied but $I_{p, q}^{GD} (X) ≰ I_{p, q}^{GD} (Y)$. Here, there are no possible partitions of the archives satisfying part 1(a) of Theorem 8.

Figure 7. A hypothetical Pareto front discretization $P'$ (black circles) and two different archives: $X_{1}$ (blue dots) and $X_{2}$ (orange squares).

Figure 8. Optimal $Δ_{1, -1}$ archive A with 10 elements (blue circles) for the connected Pareto front $P_{1}$ given by Equation (12). The respective archive coordinates and the $Δ_{1, -1}$ distance are given at the right.

Figure 9. Optimal $Δ_{1, -1}$ archive A with 10 elements (blue circles) for the connected Pareto front $P_{2}$ given by Equation (12). The respective archive coordinates and the $Δ_{1, -1}$ distance are given at the right.

Figure 10. Optimal $Δ_{1, q}$ five-point archives A for the connected Pareto front $P_{1}$ given by Equation (12) with $p = 1$ and $q = \pm 1/2$.

Figure 11. Optimal $Δ_{p, -1}$ one-point archives A for the connected Pareto front $P_{1}$ given by Equation (12) with $q = -1$ and different values of p. In all cases, the archives are located on the line $x = y$.

Figure 12. Numerically optimal $Δ_{1, -1}$ archive A with 20 elements for the disconnected step Pareto front $P_{3}^{(5)}$ given by Equation (13). Here, we obtain $Δ_{1, -1} (A, P_{3}^{(5, \frac{1}{10})}) = 0.111132$.

Figure 13. The black horizontal segment is the set A from Equation (14), and the blue piecewise map is the respective approximation given by the set $B_{δ}$ from Equation (15) for two values of $δ$ and $ε = 0.10$.

Figure 14. (Left) Pareto set. (Right) Pareto front of MOP (Equation (16)) for $n = 2$ and $γ = 2$.

Figure 15. The same as in Figure 14 but for $γ = 1/2$.

Figure 16. (Left) The blue dots A and the blue polygonal line B are the discrete and continuous approximations, respectively, of the Pareto set (orange thick segment) of MOP (Equation (16)) for $n = 2$. (Right) The respective images $F (A)$ and $F (B)$ approximating the Pareto front for $γ = 2$.

Figure 17. The same as in Figure 16 but for $γ = 1/2$.

Figure 18. (Left) The blue dots A and the blue polygonal line B are the discrete and continuous approximations, respectively, of the Pareto set (orange thick segment) obtained in the 410th generation of the NSGA-II algorithm for MOP (Equation (16)) with $n = 2$. (Right) The corresponding images $F (A)$ and $F (B)$ approximating the Pareto front for $γ = 2$.

Figure 19. The same as in Figure 18 but for $γ = 1/2$.

Figure 20. (Left) The blue dots A and the blue polygonal line B are the discrete and continuous approximations, respectively, of the Pareto set (orange thick segment) obtained in the 410th generation of the MOEA/D algorithm for MOP (Equation (16)) with $n = 2$. (Right) The corresponding images $F (A)$ and $F (B)$ approximating the Pareto front for $γ = 2$.

Figure 21. The same as in Figure 20 but for $γ = 1/2$.

Figure 22. (Left) Pareto set. (Right) Pareto front of MOP (Equation (17)).

Figure 23. The black curve is the $Δ_{p, q}$ value for the discrete approximation, and the blue one is the respective curve for the continuous approximation of NSGA-II for MOP (Equation (17)).

Figure 24. (Left) The blue dots and the blue polygonal line are the discrete and continuous approximations, respectively, of the Pareto set of MOP (Equation (17)) in the 300th generation. (Right) The respective images $F (A)$ and $F (B)$ approximating the Pareto front.

Figure 25. The same as in Figure 24 but for the 400th generation.

Figure 26. The same as in Figure 24 but for the 500th generation.

Table 1. $Δ_{p, q} (P', X_{1})$ for several values of p (columns) and q (rows).

q \ p	1	2	5	10	20
$- \infty$	$0.9091$	$2.7153$	$5.5714$	$7.0811$	$7.9831$
$- 100$	$0.9272$	$2.7701$	$5.6839$	$7.2241$	$8.1443$
$- 20$	$0.9537$	$2.8367$	$5.8202$	$7.3974$	$8.3396$
$- 5$	$0.9895$	$2.8624$	$5.8705$	$7.4613$	$8.4117$
$- 1$	$1.1131$	$2.8782$	$5.8848$	$7.4795$	$8.4322$
1	$1.3243$	$2.9112$	$5.8920$	$7.4886$	$8.4425$
2	$2.9277$	$2.9295$	$5.8956$	$7.4932$	$8.4476$
5	$5.8920$	$5.8956$	$5.9063$	$7.5068$	$8.4630$
10	$7.4886$	$7.4932$	$7.5068$	$7.5292$	$8.4882$

Table 2. $Δ_{p, q} (P', X_{2})$ for several values of p (columns) and q (rows).

q \ p	1	2	5	10	20
$- \infty$	$4.5412$	$4.5497$	$4.5751$	$4.6160$	$4.6867$
$- 100$	$4.6442$	$4.6529$	$4.6790$	$4.7209$	$4.7933$
$- 20$	$4.8425$	$4.8518$	$4.8795$	$4.9239$	$5.0003$
$- 5$	$4.9624$	$4.9720$	$5.0007$	$5.0465$	$5.1250$
$- 1$	$5.0008$	$5.0105$	$5.0394$	$5.0856$	$5.1646$
1	$5.0203$	$5.0301$	$5.0591$	$5.1055$	$5.1848$
2	$5.0301$	$5.0398$	$5.0690$	$5.1154$	$5.1949$
5	$5.0591$	$5.0690$	$5.0983$	$5.1450$	$5.2248$
10	$5.1055$	$5.1154$	$5.1450$	$5.1921$	$5.2725$

Table 3. Triangle inequality violations, in percent, for several values of p (columns) and q (rows). Here, we randomly chose 80 sets, each one containing 2 points in $[0, 10]^{2}$, and checked the triangle inequality for all ordered triples of distinct sets (that is, $80 \cdot 79 \cdot 78 = 492{,}960$ triples).

q \ p	1	2	5	10
$- 1$	$0.05396$	0	0	0
$- 2$	$0.10265$	$0.00041$	0	0
$- 5$	$0.28815$	$0.01217$	0	0
$- 10$	$0.35622$	$0.05031$	$0.00041$	0
$- 20$	$0.43046$	$0.08439$	$0.00446$	$0.00041$
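
The experiment behind Table 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it assumes the finite-set form of $Δ_{p,q}$ (a maximum of two nested power means of the pairwise Euclidean distances), it uses its own random seed, and the function names are hypothetical, so the percentages it prints will differ from those in the table.

```python
import numpy as np

def power_mean(values, r, axis):
    """Plain power mean of order r (r != 0) along `axis`."""
    values = np.asarray(values, dtype=float)
    return np.mean(values ** r, axis=axis) ** (1.0 / r)

def delta_pq(X, Y, p, q):
    """Assumed finite-set form of Delta_{p,q} for disjoint point sets X and Y."""
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    gd = power_mean(power_mean(D, q, axis=1), p, axis=0)   # GD_{p,q}(X, Y)
    igd = power_mean(power_mean(D, q, axis=0), p, axis=0)  # GD_{p,q}(Y, X)
    return max(gd, igd)

def n_triples(n):
    """Number of ordered triples of distinct sets."""
    return n * (n - 1) * (n - 2)

def violation_percentage(sets, p, q, tol=1e-12):
    """Share of ordered triples (A, B, C) with d(A, C) > d(A, B) + d(B, C)."""
    n = len(sets)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = delta_pq(sets[i], sets[j], p, q)
    # Entry [a, b, c] compares d(a, c) with d(a, b) + d(b, c); triples with
    # repeated indices can never count as violations because dist[i, i] = 0.
    violated = dist[:, None, :] > dist[:, :, None] + dist[None, :, :] + tol
    return 100.0 * np.count_nonzero(violated) / n_triples(n)

rng = np.random.default_rng(0)  # arbitrary seed, chosen here for repeatability
sets = [rng.uniform(0.0, 10.0, size=(2, 2)) for _ in range(80)]
print(n_triples(len(sets)))
for q in (-1, -20):
    print(q, [violation_percentage(sets, p, q) for p in (1, 2, 5, 10)])
```

For $p = q = 1$ the count should be zero, in line with the metric behavior on disjoint sets for $p, q ⩾ 1$, which makes that case a convenient sanity check.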

Table 4. $Δ_{p, q}$ results between the sets A and $B_{δ}$ in Equations (14) and (15) for $ε = 0.10$ and some parameter values of p, q, and $δ$.

p	q	$Δ_{p, q} (A, B_{0.05})$	$Δ_{p, q} (A, B_{0.10})$	$Δ_{p, q} (A, B_{0.20})$	$Δ_{p, q} (A, B_{0.40})$
1	1	$0.7149$	$0.7464$	$0.8091$	$0.9324$
1	$-1$	$0.4105$	$0.4506$	$0.5311$	$0.6945$
1	$-100$	$0.1503$	$0.1961$	$0.2878$	$0.4711$
1	$-200$	$0.1479$	$0.1934$	$0.2844$	$0.4663$
1	$-10,000$	$0.1451$	$0.1901$	$0.2802$	$0.4602$

Table 5. $Δ_{p, q}$ results for the approximations of the Pareto set (decision space) and front (objective space) for MOP (Equation (16)).

γ	p	q	Decision space, finite arch.	Decision space, cont. arch.	Objective space, finite arch.	Objective space, cont. arch.
$γ = 2$	1	1	$0.5262$	$0.4775$	$0.4377$	$0.3851$
	1	$-1$	$0.2710$	$0.2017$	$0.2051$	$0.1070$
	1	$-100$	$0.1121$	$0.0341$	$0.0862$	$0.0040$
	1	$-200$	$0.1112$	$0.0333$	$0.0855$	$0.0039$
	1	$-10,000$	$0.1103$	$0.0324$	$0.0848$	$0.0038$
$γ = \frac{1}{2}$	1	1	$0.5262$	$0.4775$	$0.5520$	$0.4965$
	1	$-1$	$0.2710$	$0.2017$	$0.2587$	$0.1120$
	1	$-100$	$0.1121$	$0.0341$	$0.1079$	$0.0012$
	1	$-200$	$0.1112$	$0.0333$	$0.1071$	$0.0012$
	1	$-10,000$	$0.1103$	$0.0324$	$0.1062$	$0.0011$

Table 6. Parameter setting for NSGA-II and MOEA/D: Here, n denotes the dimension of the decision variable space.

Algorithm	Parameter	Value
NSGA-II	Population size	12
	Number of generations	500
	Crossover probability	0.8
	Mutation probability	$1 / n$
	Distribution index for crossover	20
	Distribution index for mutation	20
MOEA/D	Population size	12
	# weight vectors	12
	Number of generations	500
	Crossover probability	1
	Mutation probability	$1 / n$
	Distribution index for crossover	30
	Distribution index for mutation	20
	Aggregation function	Tchebycheff
	Neighborhood size	3

Table 7. $Δ_{p, q}$ results for the finite and continuous Pareto front approximations of MOP (Equation (16)), computed from the NSGA-II generated archives with $p = 1$ and $q = -10$.

Generation	$γ = 1 / 2$, Finite Arch.	$γ = 1 / 2$, Cont. Arch.	$γ = 2$, Finite Arch.	$γ = 2$, Cont. Arch.
50	$0.0439$	$0.0147$	$0.0696$	$0.0160$
100	$0.0498$	$0.0109$	$0.0540$	$0.0102$
200	$0.0613$	$0.0118$	$0.0716$	$0.0207$
250	$0.0651$	$0.0265$	$0.0572$	$0.0061$
400	$0.0602$	$0.0102$	$0.0723$	$0.0276$
450	$0.0630$	$0.0154$	$0.0584$	$0.0088$
460	$0.0612$	$0.0154$	$0.0658$	$0.0098$
470	$0.0523$	$0.0102$	$0.0566$	$0.0083$
480	$0.0754$	$0.0269$	$0.0684$	$0.0241$
490	$0.0510$	$0.0091$	$0.0584$	$0.0118$
500	$0.0722$	$0.0097$	$0.0560$	$0.0103$

Table 8. $Δ_{p, q}$ results for the finite and continuous Pareto front approximations of MOP (Equation (16)), computed from the MOEA/D generated archives with $p = 1$ and $q = -10$.

Generation	$γ = 1 / 2$, Finite Arch.	$γ = 1 / 2$, Cont. Arch.	$γ = 2$, Finite Arch.	$γ = 2$, Cont. Arch.
50	$0.0610$	$0.0171$	$0.0648$	$0.0119$
100	$0.0519$	$0.0051$	$0.1093$	$0.0016$
200	$0.0536$	$0.0037$	$0.0781$	$0.0009$
250	$0.0522$	$0.0037$	$0.0790$	$0.0008$
400	$0.0511$	$0.0017$	$0.0784$	$0.0009$
450	$0.0511$	$0.0017$	$0.0784$	$0.0009$
460	$0.0509$	$0.0012$	$0.0784$	$0.0009$
470	$0.0509$	$0.0012$	$0.0784$	$0.0009$
480	$0.0509$	$0.0010$	$0.0783$	$0.0009$
490	$0.0509$	$0.0010$	$0.0783$	$0.0009$
500	$0.0509$	$0.0010$	$0.0783$	$0.0009$

Table 9. $Δ_{p, q}$ results between the Pareto front and its respective discrete and continuous approximations obtained by NSGA-II for MOP (Equation (17)). The data shown are averaged over the 20 independent runs mentioned above.

Generation	Continuous Archive	Finite Archive
20	$0.1333$	$0.2401$
40	$0.0176$	$0.1451$
60	$0.0090$	$0.1561$
80	$0.0088$	$0.1355$
100	$0.0065$	$0.1472$
120	$0.0074$	$0.1412$
140	$0.0081$	$0.1395$
160	$0.0075$	$0.1549$
180	$0.0092$	$0.1468$
200	$0.0074$	$0.1429$
220	$0.0066$	$0.1408$
240	$0.0075$	$0.1397$
260	$0.0066$	$0.1460$
280	$0.0074$	$0.1439$
300	$0.0084$	$0.1421$
320	$0.0070$	$0.1352$
340	$0.0070$	$0.1373$
360	$0.0081$	$0.1454$
380	$0.0079$	$0.1413$
400	$0.0066$	$0.1388$
420	$0.0063$	$0.1400$
440	$0.0097$	$0.1384$
460	$0.0067$	$0.1418$
480	$0.0067$	$0.1421$
500	$0.0076$	$0.1426$

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Bogoya, J.M.; Vargas, A.; Schütze, O. The Averaged Hausdorff Distances in Multi-Objective Optimization: A Review. Mathematics 2019, 7, 894. https://doi.org/10.3390/math7100894

