Properties of Statistical Depth with Respect to Compact Convex Random Sets: The Tukey Depth

Luis González-De La Fuente; Alicia Nieto-Reyes; Pedro Terán

doi:10.3390/math10152758


¹

Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, 39005 Santander, Spain

²

Departamento de Estadística e Investigación Operativa y Didáctica de las Matemáticas, Universidad de Oviedo, 33007 Oviedo, Spain

^*

Author to whom correspondence should be addressed.

^†

Current address: Facultad de Ciencias, Avd. de los Castros s/n, 39005 Santander, Spain.

Mathematics 2022, 10(15), 2758; https://doi.org/10.3390/math10152758

This article belongs to the Special Issue Advances in Statistics: Theory, Methodology, Applications and Data Analysis


Abstract

We study a statistical data depth with respect to compact convex random sets, which is consistent with the multivariate Tukey depth and the Tukey depth for fuzzy sets. In addition, it provides a different perspective to the existing halfspace depth with respect to compact convex random sets. In studying this depth function, we provide a series of properties for the statistical data depth with respect to compact convex random sets. These properties are an adaptation of properties that constitute the axiomatic notions of multivariate, functional, and fuzzy depth-functions and other well-known properties of depth.

Keywords:

compact convex set; halfspace depth; statistical depth function; symmetry

MSC:

62G30; 62G35

1. Introduction

In some real cases, statistical data appear in the form of sets, for instance, in the form of compact convex sets. Examples can be found in datasets related to health, such as the range of blood pressure over a day [1], or related to sport measures, such as the range of weights and heights of a soccer team [2]. Thanks to these phenomena having a convex compact set nature, it is possible to use some good properties of convex compact sets, for instance the existence of support functions. This type of statistical data is studied by the theory of random sets, which, from a statistical point of view, models observed phenomena that are sets rather than points in

R^{p}

, as in multivariate statistics, or functions, as in functional data analysis. Thus, a random set is a generalization of a random variable: it is a set-valued random variable. A random set can also be understood as a simplification of a fuzzy random variable, as the

α

-levels of a fuzzy set are nested compact sets. The literature about random sets contains well-established theoretical results [3], some of which are generalizations to random sets of classical statistical results, for instance, the strong law of large numbers [4]. Statistical methods are also part of the development of the area of compact convex random sets, such as proposing linear regression methods [5] or the median of a random interval [6]. Recent literature also includes theoretical results, such as results about the intersection of random sets [7], and applications, such as underwater sonar images [8].

Statistical depth functions have become a very useful tool in non-parametric statistics. Nowadays, depth functions are applied in different fields of statistics, such as clustering and classification [9] or real data analysis [10,11]. Given a distribution

P

in a space, a depth function,

D (\cdot; P)

, orders the elements in the space with respect to

P

. Roughly speaking, statistical depth functions measure how close an element is to a data cloud, in the sense that, if we move the element to the center of the cloud, its depth increases, and, if we move it out of the center, its depth decreases. Assuming it is unique, this center is the center of symmetry if the distribution is symmetric for a particular notion of symmetry. For multivariate spaces, there are notions of symmetry widely used in the literature: central, angular [12], and halfspace symmetry [13]. Notions of symmetry specific for functional [14] and fuzzy spaces [15] are, however, quite recent.

Formally, an axiomatic definition of the depth function for the multivariate case was proposed by Zuo and Serfling [13]. According to this definition, a depth function,

D (\cdot; P)

, satisfies the following properties. To introduce them, let X be a random variable with distribution

P

on

R^{n},

M_{n \times n} (R)

be the space of

n \times n

matrices with entries in

R

, and

∥ \cdot ∥

be the Euclidean norm. With a slight abuse of notation, we write interchangeably

D (\cdot; X)

and

D (\cdot; P) .

M1.: Affine invariance. A depth function does not depend on the coordinate system, that is, for any non-singular $M \in M_{n \times n} (R)$ and $b \in R^{n}$ , $D (x; X) = D (M x + b; M X + b)$ .
M2.: Maximality at center. If the distribution $P$ has a uniquely defined center of symmetry for a certain notion of symmetry, then $D (\cdot; X)$ is maximized at it.
M3.: Monotonicity relative to the deepest point. Let $x_{0} \in R^{n}$ be a point of maximal depth. Then, for any $x \in R^{n}$ , $D ((1 - λ) x_{0} + λ x; X) \geq D (x; X)$ for all $λ \in [0, 1]$ .
M4.: Vanishing at infinity. $D (x; X) \to 0$ as $∥ x ∥ \to \infty$ .

Formal axiomatic definitions of a depth function were later provided in the functional [16] and fuzzy settings [15,17].

The first instance of a depth function was proposed prior to the axiomatic definitions. It is the Tukey depth, an instance provided in 1975 by Tukey [18] for multivariate data, which is still the most well-known depth function. It is also known as halfspace depth, as it computes the infimum of the probabilities of closed halfspaces, which contain the point at which the depth function is evaluated. That is:

H D (x; P) : = inf {P (H) : H is a closed halfspace and x \in H} .

(1)

Zuo and Serfling [13] proved that

H D

, satisfies M1-M4 and is, therefore, a statistical depth function. We emphasize that all the axioms are satisfied because, in the statistical depth community, the axioms are customarily not treated as a strict cut-off: a function may be regarded as a depth function even when it does not satisfy all of them in their entirety.
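To make definition (1) concrete, the following minimal sketch computes the empirical halfspace depth of a point with respect to a finite sample. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the distribution is replaced by the empirical measure of the sample, and in the plane the infimum over all closed halfspaces is approximated on a finite grid of directions, which gives an upper bound on the exact empirical depth.

```python
import math

def halfspace_depth_1d(x, sample):
    """Empirical Tukey depth on the real line: the smaller of the two
    empirical probabilities of (-inf, x] and [x, +inf)."""
    n = len(sample)
    below = sum(1 for s in sample if s <= x) / n
    above = sum(1 for s in sample if s >= x) / n
    return min(below, above)

def halfspace_depth_2d(x, sample, n_directions=360):
    """Approximate empirical Tukey depth in the plane (assumption: the
    infimum over directions is taken on a finite grid only)."""
    depth = 1.0
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        u = (math.cos(theta), math.sin(theta))
        proj_x = u[0] * x[0] + u[1] * x[1]
        projections = [u[0] * p[0] + u[1] * p[1] for p in sample]
        # Halfspaces with normal u whose boundary passes through x.
        depth = min(depth, halfspace_depth_1d(proj_x, projections))
    return depth

# Hypothetical toy sample: four points forming a cross around the origin.
cross = [(1, 0), (-1, 0), (0, 1), (0, -1)]
center_depth = halfspace_depth_2d((0, 0), cross)
outlier_depth = halfspace_depth_2d((10, 0), cross)
```

In this toy sample the origin is deeper than the distant point, which is the ordering behaviour described above.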

Since Tukey coined the term in 1975, many other instances of depth functions have been proposed, and their use in statistics has grown considerably. Some commonly used depth functions are the simplicial depth, proposed by Liu [12]; the spatial depth, proposed by Serfling [19]; and the random Tukey depth, proposed by Cuesta-Albertos and Nieto-Reyes [20], which, being based on random projections, is a computationally effective approximation of the Tukey depth. The spatial and random Tukey depth functions can be applied in both multivariate and functional spaces [21,22]. However, the random Tukey depth does not satisfy the axiomatic definition of a functional depth [16], which, so far, only the metric depth [14] has been proven to satisfy. It is worth noting that the spatial and random Tukey depth functions were introduced before the functional axiomatic definition in [16]. Furthermore, while the Tukey depth has not yet been defined in functional spaces, it has been generalized to the fuzzy setting and proved to satisfy the axiomatic definitions in that setting [15,23].

The aim of this paper is to propose some desirable properties of depth with respect to compact convex random sets, which can be considered to be an axiomatic definition for this setting. Some of these properties are an adaptation for compact convex sets of those proposed in González-De La Fuente et al. [15] for fuzzy data. The properties are also largely inspired by the multivariate definition [13] and, in addition, by the functional one [16], because the set of compact convex sets can be considered to be a metric space by using the Hausdorff distance, for instance. In order to test the viability of those properties, with a generalization of halfspaces suitable for the space of compact convex sets, we present an adaptation of Tukey depth and show that almost all of them are satisfied. These definitions of halfspace and Tukey depth can be regarded as stemming naturally from their corresponding multivariate definitions and, in addition, are a particular case of their fuzzy analogs [15]. Furthermore, we show that the definition of Tukey depth with respect to compact convex random sets coincides with that derived recently in Cascos et al. [24], which does not make an explicit use of halfspaces in its definition. The advantage of using our proposal is that it helps in proving some desirable properties of the Tukey depth, for instance the monotonicity relative to the deepest point (see proof of Proposition 3). In addition, it is clear that our proposal is a natural generalization of the multivariate halfspace depth, because it generalizes the concept of halfspace to the set of subsets of

R^{p}

. Moreover, we also show that the Tukey depth, with respect to compact convex random sets, can be rewritten in terms of the multivariate halfspace depth of the support function of compact convex sets.

The paper is organized as follows. The background about compact convex random sets is contained in Section 2. The definition of the Tukey depth with respect to compact convex random sets is in Section 3, together with its relationships with and equivalences to other definitions. Section 4 presents and studies the properties of depth with respect to compact convex random sets and their satisfaction by the Tukey depth with respect to compact convex random sets. Section 5 includes a real-data analysis of compact convex sets in

R^{3} .

The paper concludes with some final remarks in Section 6.

2. Preliminaries on Compact Convex Random Sets

Let us denote by

K_{c} (R^{p})

the set of non-empty compact convex sets of

R^{p}

. In the case

p = 1

, the elements of

K_{c} (R)

are intervals of the form

[a, b]

with

a \leq b

. For any

K \in K_{c} (R^{p}),

its support function

s_{K} : S^{p - 1} \to R

is defined by

s_{K} (u) : = sup_{k \in K} ⟨ k, u ⟩,

where

⟨ \cdot, \cdot ⟩

denotes the usual dot product,

S^{p - 1} : = {x \in R^{p} : ∥ x ∥ = 1}

is the unit sphere, and

∥ \cdot ∥

is the Euclidean norm.
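As an illustration of the support function, the sketch below evaluates it for a polytope given by finitely many points. The assumption (not made in the text) is that K is the convex hull of those points, so that the supremum of the linear functional is attained at one of them; the name `support_function` is hypothetical.

```python
import math

def support_function(vertices, u):
    """Support function of conv(vertices) in direction u.
    For a polytope, the supremum of <k, u> is attained at a vertex."""
    return max(sum(vi * ui for vi, ui in zip(v, u)) for v in vertices)

# Unit square [0, 1]^2 described by its four corners.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
s_east = support_function(square, (1, 0))
s_west = support_function(square, (-1, 0))
s_diag = support_function(square, (1 / math.sqrt(2), 1 / math.sqrt(2)))

# An interval [2, 5] in dimension p = 1, where the unit sphere is {-1, 1}.
interval = [(2,), (5,)]
s_right = support_function(interval, (1,))   # right end point
s_left = support_function(interval, (-1,))   # minus the left end point
```

For an interval [a, b], the two values are b and -a, which is the form used repeatedly when p = 1.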

Let

(Ω, A, P)

be a probability space. A map

Γ : Ω \to K_{c} (R^{p})

is called a compact convex random set if

{ω \in Ω : Γ (ω) \cap K \neq \emptyset} \in A

for all

K \in K_{c} (R^{p})

[25]. Himmelberg [26] proved the Fundamental Measurability Theorem, which is useful to prove that

s_{Γ} (u)

is a real random variable for all

u \in S^{p - 1}

. As in the Euclidean space, in

K_{c} (R^{p})

there exists a predominant distance, the Hausdorff metric. The Hausdorff distance between

K \in K_{c} (R^{p})

and

L \in K_{c} (R^{p})

is

d_{H} (K, L) : = max {sup_{k \in K} inf_{l \in L} ∥ k - l ∥, sup_{l \in L} inf_{k \in K} ∥ k - l ∥},

which can be expressed in terms of their support functions (e.g., [27]) as

d_{H} (K, L) = sup_{u \in S^{p - 1}} | s_{K} (u) - s_{L} (u) | .

(2)

The Borel measurability with respect to

d_{H}

is equivalent to the above-mentioned definition of compact convex random sets.
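The equivalence between the two expressions of the Hausdorff distance can be checked by hand for intervals. The sketch below is restricted to p = 1 (an assumption made for simplicity), where the unit sphere is {-1, 1}; the function names are hypothetical. The direct computation uses the fact that the distance to a convex set is a convex function, so its supremum over an interval is attained at an end point.

```python
def support(interval, u):
    """Support function of [a, b] in direction u, with u in {+1, -1}."""
    a, b = interval
    return b if u == 1 else -a

def hausdorff_via_support(K, L):
    """Formula (2) specialised to p = 1."""
    return max(abs(support(K, u) - support(L, u)) for u in (1, -1))

def distance_to_interval(x, interval):
    """Euclidean distance from the point x to the interval [a, b]."""
    a, b = interval
    return max(a - x, 0, x - b)

def hausdorff_direct(K, L):
    """Max of the two directed distances, evaluated at the end points."""
    k_to_l = max(distance_to_interval(k, L) for k in K)
    l_to_k = max(distance_to_interval(l, K) for l in L)
    return max(k_to_l, l_to_k)

pairs = [((1, 2), (2, 7)), ((1, 2), (3, 5)), ((2, 7), (3, 5))]
distances = [hausdorff_via_support(K, L) for K, L in pairs]
```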

Some properties of the support functions of the elements of

K_{c} (R^{p})

can be deduced from the properties of the supremum function. For instance, let

K, L \in K_{c} (R^{p})

, taking into account that

K + L = {k + l : k \in K, l \in L} \in K_{c} (R^{p}),

we can conclude that the support function of

K + L

can be expressed as the sum of the support functions of K and L, that is,

s_{K + L} (u) = s_{K} (u) + s_{L} (u)

for all

u \in S^{p - 1}

. It is also possible to define the product of K by a scalar

γ \in R^{+}

, as

γ \cdot K = {γ k : k \in K} .

Then, it is clear that

s_{γ \cdot K} (u) = γ \cdot s_{K} (u)

for all

u \in S^{p - 1}

.
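As a short worked example of these two identities (an illustration added for the case p = 1, where the unit sphere is {-1, 1}), take the intervals K = [a, b] and L = [c, d] and a scalar γ > 0:

```latex
\begin{aligned}
s_{K}(1) &= b, & s_{K}(-1) &= -a, \\
K + L &= [a + c,\; b + d], & s_{K + L}(\pm 1) &= s_{K}(\pm 1) + s_{L}(\pm 1), \\
\gamma \cdot K &= [\gamma a,\; \gamma b], & s_{\gamma \cdot K}(\pm 1) &= \gamma \, s_{K}(\pm 1).
\end{aligned}
```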

3. Halfspaces and Halfspace Depth in $K_{c} (R^{p})$

As can be seen from (1), the Tukey depth of a multivariate point x is the infimum of the probabilities of the closed halfspaces that contain x. However,

K_{c} (R^{p})

is not a linear space. In this section, we define generalized halfspaces (simply called halfspaces in the sequel) for

K_{c} (R^{p})

in a natural way from the multivariate case.

Let S be a halfspace of

R^{n}

. Then,

v \in R^{n}

and

b \in R

exist, such that

S = {y \in R^{n} : v^{T} y \leq b} .

Taking

u = (1 / ∥ v ∥) v \in S^{n - 1}

and

c = b / ∥ v ∥

, it is clear that

S = {y \in R^{n} : u^{T} y \leq c} .

Thus, the halfspaces of

R^{n}

can be viewed as subsets

S_{u, c} \subseteq R^{n}

, such that

S_{u, c} = {y \in R^{n} : u^{T} y \leq c}

with

u \in S^{n - 1}

and

c \in R

. This generalizes naturally to

K_{c} (R^{p})

by using the support function of a set. Thus, we define halfspaces

S_{u, t}^{-}, S_{u, t}^{+} \subseteq K_{c} (R^{p})

as

S_{u, t}^{-} : = {K \in K_{c} (R^{p}) : s_{K} (u) \leq t},

(3)

S_{u, t}^{+} : = {K \in K_{c} (R^{p}) : s_{K} (u) \geq t},

(4)

for all

u \in S^{p - 1}

and

t \in R

. We explicitly consider both halfspaces because, in general,

s_{K} (- u) = - inf_{k \in K} ⟨ u, k ⟩ \neq - s_{K} (u),

so that, instead of equalities, we only have the inclusions

\begin{matrix} S_{u, t}^{+} & \supseteq S_{- u, - t}^{-}, \\ S_{u, t}^{-} & \subseteq S_{- u, - t}^{+}, \end{matrix}

for all

u \in S^{p - 1}

and

t \in R

.
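For intuition, the halfspaces can be written out explicitly when p = 1 (a worked example added for illustration). Using that the support function of an interval [a, b] takes the values b and -a in the directions 1 and -1, definitions (3) and (4) give:

```latex
\begin{aligned}
S_{1, t}^{-} &= \{[a, b] : b \leq t\}, & S_{1, t}^{+} &= \{[a, b] : b \geq t\}, \\
S_{-1, -t}^{-} &= \{[a, b] : a \geq t\}, & S_{-1, -t}^{+} &= \{[a, b] : a \leq t\}.
\end{aligned}
```

Since a ≤ b, every interval with b ≤ t also has a ≤ t, so the first set is contained in the last one. The inclusion is strict: the interval [t - 1, t + 1] has its left end point below t but its right end point above t.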

Making use of both directions of the inequality that defines the halfspaces, the Tukey depth with respect to a compact convex random set can be defined. Let

Γ

be a compact convex random set. The Tukey depth of

K \in K_{c} (R^{p})

with respect to

Γ

is defined by the function

D_{C T} (\cdot; Γ) : K_{c} (R^{p}) \to [0, 1]

given by

D_{C T} (K; Γ) : = min {inf_{\underset{K \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}), inf_{\underset{K \in S_{u, t}^{+}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{+})} .

(5)

We refer to it interchangeably as the Tukey depth for compact convex random sets or the Tukey depth with respect to compact convex random sets. It is worth noting that (5) is a particularization for compact convex sets of the Tukey depth for fuzzy sets proposed in [15]; likewise, (3) and (4) are particularizations of the fuzzy halfspaces proposed there.

In what follows, we operate on (5) to show it coincides with the definition of halfspace depth with respect to compact convex random sets provided in Cascos et al. [24], which does not explicitly use halfspaces. From (3),

K \in S_{u, t}^{-}

means that

(u, t)

is a pair such that

s_{K} (u) \leq t

. Thus,

S_{u, s_{K} (u)}^{-} \subseteq S_{u, t}^{-}

and, consequently,

P (Γ \in S_{u, s_{K} (u)}^{-}) \leq P (Γ \in S_{u, t}^{-}) .

Analogously, from (4),

P (Γ \in S_{u, s_{K} (u)}^{+}) \leq P (Γ \in S_{u, t}^{+}) .

Taking the infimum in (5), we can express

D_{C T}

as

D_{C T} (K; Γ) = min {inf_{u \in S^{p - 1}} P (Γ \in S_{u, s_{K} (u)}^{-}), inf_{u \in S^{p - 1}} P (Γ \in S_{u, s_{K} (u)}^{+})} .

Making use of the definition of the halfspaces in (3) and (4), we have

D_{C T} (K; Γ) = min {inf_{u \in S^{p - 1}} P (s_{Γ} (u) \leq s_{K} (u)), inf_{u \in S^{p - 1}} P (s_{Γ} (u) \geq s_{K} (u))},

(6)

which coincides with the definition of the halfspace depth proposed by Cascos et al. [24].

Interchanging the minimum and infimum in (6),

D_{C T} (K; Γ) = inf_{u \in S^{p - 1}} min {P (s_{Γ} (u) \leq s_{K} (u)), P (s_{Γ} (u) \geq s_{K} (u))} .

(7)

Then, taking into account (1), we can express the Tukey depth for compact convex random sets in terms of the multivariate halfspace depth in the following way

D_{C T} (K; Γ) = inf_{u \in S^{p - 1}} H D (s_{K} (u); s_{Γ} (u)) .

(8)

Sample Halfspace Depth

We define the sample version

D_{C T, n}

of the Tukey depth for compact convex sets. Let

Γ : Ω \to K_{c} (R^{p})

be a compact convex random set associated with the probability space

(Ω, A, P)

and

X_{1}, \dots, X_{n}

independent random sets distributed as

Γ

. We define the sample version of the Tukey depth

D_{C T, n}

as

D_{C T, n} (K; Γ) : = min {inf_{u \in S^{p - 1}} P_{n}^{u} ((- \infty, s_{K} (u)]), inf_{u \in S^{p - 1}} P_{n}^{u} ([s_{K} (u), \infty))},

(9)

for every

K \in K_{c} (R^{p})

, where

\begin{matrix} P_{n}^{u} ((- \infty, x]) & = \frac{1}{n} \cdot \sum_{i = 1}^{n} I (s_{X_{i}} (u) \in (- \infty, x]), \\ P_{n}^{u} ([x, \infty)) & = \frac{1}{n} \cdot \sum_{i = 1}^{n} I (s_{X_{i}} (u) \in [x, \infty)), \end{matrix}

for all

u \in S^{p - 1}

and

x \in R

. The function

D_{C T, n}

coincides with the sample version of the halfspace depth proposed by Cascos et al. [24]. Interchanging the minimum and infimum in (9), we also have that

D_{C T, n} (K; Γ) = inf_{u \in S^{p - 1}} min {P_{n}^{u} ((- \infty, s_{K} (u)]), P_{n}^{u} ([s_{K} (u), \infty))} .

(10)
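Expression (10) translates almost directly into code. The following is a minimal sketch, not the authors' implementation, under these assumptions: the sets live in the plane, each set is a polytope given by a list of points whose convex hull it is, and the infimum over the unit circle is replaced by a minimum over a finite grid of directions, so the returned value is an upper bound on (10). All names are hypothetical.

```python
import math

def support(vertices, u):
    """Support function of conv(vertices) in the planar direction u."""
    return max(v[0] * u[0] + v[1] * u[1] for v in vertices)

def sample_tukey_depth(K, sample, n_directions=360):
    """Grid approximation of the sample Tukey depth of the set K."""
    n = len(sample)
    depth = 1.0
    for j in range(n_directions):
        theta = 2.0 * math.pi * j / n_directions
        u = (math.cos(theta), math.sin(theta))
        s_K = support(K, u)
        s_X = [support(X, u) for X in sample]
        below = sum(1 for s in s_X if s <= s_K) / n  # P_n^u((-inf, s_K(u)])
        above = sum(1 for s in s_X if s >= s_K) / n  # P_n^u([s_K(u), +inf))
        depth = min(depth, below, above)
    return depth

def square(r):
    """Corners of the square [-r, r]^2 (toy data for the example)."""
    return [(-r, -r), (-r, r), (r, -r), (r, r)]

# Hypothetical sample of three nested squares.
sample = [square(1), square(2), square(3)]
depth_middle = sample_tukey_depth(square(2), sample)
depth_inner = sample_tukey_depth(square(1), sample)
depth_far = sample_tukey_depth(square(10), sample)
```

In this nested sample the middle square is the deepest of the three, and a square much larger than every observation has depth zero.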

4. Properties of Depth for Compact Convex Sets

In this section, we propose some desirable properties for the depth for compact convex sets. They are mainly based on the properties that constitute the notion of the depth function for multivariate spaces [13], for functional (metric) spaces [16], and for the fuzzy setting [15]. Furthermore, we study whether

D_{C T}

satisfies them.

Some of these properties parallel the ones considered in [15], and, in certain cases, they follow for a random set

Γ

by applying the corresponding property in [15] to the indicator function

I_{Γ}

. However, this shortcut is simplest precisely for the properties whose direct proof is already very simple, so little is gained by it. In the longer proofs, additional arguments are needed due, for instance, to the subtlety that the deepest point in the (larger) space of fuzzy sets might conceivably be deeper than the deepest non-fuzzy set. Therefore, the properties referring to deepest points are parallel in wording but might potentially have different content. It can be proved that this does not actually happen, but direct proofs make the paper more self-contained. Thus, we opted for proofs that do not require the reader to be familiar with the specifics of fuzzy sets, adapting the arguments in [15]. Still, some other properties in this section were not considered in [15].

4.1. Property 1: Affine Invariance

We focus on the M1. property of the multivariate case reported in the introduction. In the case of

K_{c} (R^{p})

, the product of

M \in M_{p \times p} (R)

times

K \in K_{c} (R^{p})

is defined as the compact convex set

\begin{matrix} M \cdot K = {M \cdot k : k \in K} . \end{matrix}

(11)

The affine invariance property that we propose is the following.

(P1.): Let Γ be a compact convex random set, $D (\cdot; Γ) : K_{c} (R^{p}) \to [0, \infty)$ a function. Then,

$D (M \cdot K + L; M \cdot Γ + L) = D (K; Γ),$

for every non-singular matrix $M \in M_{p \times p} (R)$ and any $K, L \in K_{c} (R^{p})$ .

Thus, this property is analogous to the multivariate case. The property in the fuzzy case is different only in that we need Zadeh’s extension principle [28,29,30] to apply a matrix to a fuzzy set. The property for functional data also differs, since [16] demands isometry invariance. However, note that, in this context, affine invariance actually implies isometry invariance, since, by a result of Gruber and Lettl [31], all isometries of

K_{c} (R^{p})

are of the form

K \mapsto M \cdot K + L

with M orthogonal.

Proposition 1.

The function

D_{C T}

satisfies P1.

The following lemma (cf. [15], Proposition 8.2) is used to prove Proposition 1.

Lemma 1.

Let

K \in K_{c} (R^{p})

and

M \in M_{p \times p} (R)

a non-singular matrix. Then,

s_{M \cdot K} (u) = ∥ M^{T} \cdot u ∥ \cdot s_{K} ((1 / (∥ M^{T} \cdot u ∥)) \cdot M^{T} \cdot u)

for all

u \in S^{p - 1}

.

Proof.

Taking into account (11), it is clear that

s_{M \cdot K} (u) = sup_{v \in M \cdot K} ⟨ u, v ⟩ = sup_{k \in K} ⟨ u, M \cdot k ⟩ = sup_{k \in K} ⟨ M^{T} \cdot u, k ⟩,

for any

u \in S^{p - 1}

. In general,

M^{T} \cdot u

does not belong to

S^{p - 1}

. Thus, normalizing it, we have that

\begin{matrix} s_{M \cdot K} (u) & = sup_{k \in K} ⟨ ∥ M^{T} \cdot u ∥ \cdot \frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u, k ⟩ \\ & = ∥ M^{T} \cdot u ∥ \cdot sup_{k \in K} ⟨ \frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u, k ⟩ \\ & = ∥ M^{T} \cdot u ∥ \cdot s_{K} (\frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u) . \end{matrix}

□
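The identity in Lemma 1 can be checked numerically. The sketch below is an illustration under the assumption that K is a polytope in the plane given by its vertices, so that the image of K under M is the convex hull of the transformed vertices; the matrix, the triangle, and the function names are hypothetical choices made for the example.

```python
import math

def support(vertices, u):
    """Support function of conv(vertices) in the planar direction u."""
    return max(v[0] * u[0] + v[1] * u[1] for v in vertices)

def mat_vec(M, x):
    """Product of a 2x2 matrix (list of rows) with a vector."""
    return (M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1])

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def lemma1_sides(M, vertices, u):
    """Return both sides of the identity of Lemma 1 for the direction u."""
    lhs = support([mat_vec(M, v) for v in vertices], u)
    w = mat_vec(transpose(M), u)          # M^T u, in general not a unit vector
    norm = math.hypot(w[0], w[1])         # positive because M is non-singular
    rhs = norm * support(vertices, (w[0] / norm, w[1] / norm))
    return lhs, rhs

M = [[2, 1], [0, 3]]                      # determinant 6, hence non-singular
triangle = [(0, 0), (1, 0), (0, 1)]
directions = [(math.cos(2 * math.pi * j / 12), math.sin(2 * math.pi * j / 12))
              for j in range(12)]
gaps = [abs(lhs - rhs)
        for lhs, rhs in (lemma1_sides(M, triangle, u) for u in directions)]
```

Up to floating-point rounding, the two sides agree in every direction tried.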

It is clear that, if

M \in M_{p \times p} (R)

is a non-singular matrix, the map

f : S^{p - 1} \to S^{p - 1}

defined by

f (u) = (1 / ∥ M^{T} \cdot u ∥) \cdot M^{T} \cdot u

is bijective. We make use of this to prove Proposition 1.

Proof of Proposition 1.

Using the properties of the support function and Lemma 1, we obtain

s_{M \cdot K + L} (u) = ∥ M^{T} \cdot u ∥ \cdot s_{K} (\frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u) + s_{L} (u),

for all

u \in S^{p - 1}

. From (6), we have that

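\begin{matrix} inf_{u \in S^{p - 1}} P (s_{M \cdot Γ + L} (u) \leq s_{M \cdot K + L} (u)) & = inf_{u \in S^{p - 1}} P (s_{M \cdot Γ} (u) \leq s_{M \cdot K} (u)) \\ & = inf_{u \in S^{p - 1}} P (s_{Γ} (\frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u) \leq s_{K} (\frac{1}{∥ M^{T} \cdot u ∥} \cdot M^{T} \cdot u)) \\ & = inf_{u \in S^{p - 1}} P (s_{Γ} (u) \leq s_{K} (u)), \end{matrix}

where the first equality holds because the support function of L cancels on both sides of the inequality, the second because the common positive factor given by the norm can be dropped, and the last because f is bijective. The infimum of the probabilities with the reverse inequality is treated analogously. □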
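Affine invariance can be observed on a small example in dimension p = 1, where a non-singular matrix is simply a non-zero scalar m. The sketch below is an illustration with hypothetical names and toy data; the random set is replaced by the empirical distribution of four intervals, and the depth is computed exactly because the unit sphere reduces to {-1, 1}.

```python
def support(interval, u):
    """Support function of [a, b] in direction u, with u in {+1, -1}."""
    a, b = interval
    return b if u == 1 else -a

def sample_depth(K, sample):
    """Exact sample Tukey depth of the interval K (case p = 1)."""
    n = len(sample)
    depth = 1.0
    for u in (1, -1):
        s_K = support(K, u)
        below = sum(1 for X in sample if support(X, u) <= s_K) / n
        above = sum(1 for X in sample if support(X, u) >= s_K) / n
        depth = min(depth, below, above)
    return depth

def affine(m, K, L):
    """The interval m * K + L for a non-zero scalar m and intervals K, L."""
    lo, hi = sorted((m * K[0], m * K[1]))   # a negative m swaps the end points
    return (lo + L[0], hi + L[1])

sample = [(1, 2)] * 3 + [(2, 7)]            # toy data: three copies of [1, 2], one of [2, 7]
shift = (-1, 4)
checks = []
for m in (2, -3):
    moved_sample = [affine(m, X, shift) for X in sample]
    for K in [(1, 2), (2, 7), (3, 5), (0, 9)]:
        checks.append(sample_depth(affine(m, K, shift), moved_sample)
                      == sample_depth(K, sample))
```

Every entry of `checks` compares the depth before and after the transformation; the negative scalar is included because it reverses the order of the end points.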

4.2. Property 2: Maximality at the Center of Symmetry

In this case, the property is the same for multivariate, functional, and fuzzy settings, except that the notion of symmetry applied has to be defined in the corresponding space. In the multivariate case, several notions of symmetry exist, for instance central, angular, and halfspace symmetry [12,13]. In the functional case, there is one notion that has been proved to be topologically valid [10,14], while there have been two proposals in the fuzzy setting [15]. To propose a notion of symmetry in

K_{c} (R^{p}),

we make use of the central symmetry notion and of the support function of compact convex random sets. A random variable X on

R^{p}

is centrally symmetric (or C-symmetric) with respect to

x \in R^{p}

if

X - x

and

x - X

are equally distributed.

Definition 1.

Let Γ be a compact convex random set. We say that Γ is compact-symmetric with respect to K if

s_{Γ} (u)

is C-symmetric with respect to

s_{K} (u)

for all

u \in S^{p - 1}

.

We propose the following property.

(P2.): Let Γ be a compact convex random set which is symmetric (for a certain notion of symmetry) with respect to $K \in K_{c} (R^{p})$ . Let $D (\cdot; Γ) : K_{c} (R^{p}) \to [0, \infty)$ be a function. Then

$D (K; Γ) = sup_{L \in K_{c} (R^{p})} D (L; Γ) .$

Thus, this property is analogous in the multivariate, functional, and fuzzy cases. The only difference is the notion of symmetry defined for each case. Note that the above defined notion of symmetry for compact convex random sets, which makes use of C-symmetry, is also an adaptation of the F-symmetry [15] of the fuzzy case, based on support functions. It is possible to consider another notion of symmetry for random sets by identifying every set with its support function and considering central symmetry in the function space. However, our notion is more general, which makes it a natural choice.

With the above notion of compact-symmetry, we have the following result.

Proposition 2.

The function

D_{C T}

satisfies P2.

Proof.

Suppose that

Γ

is compact-symmetric with respect to K. By definition, we have that the real random variable

s_{Γ} (u)

is C-symmetric with respect to

s_{K} (u)

for all

u \in S^{p - 1}

. This means that

s_{K} (u) \in Med (s_{Γ} (u))

for all

u \in S^{p - 1}

, where

Med (\cdot)

denotes the univariate median. It implies that

P (s_{Γ} (u) \leq s_{K} (u)) \geq 1 / 2 and P (s_{Γ} (u) \geq s_{K} (u)) \geq 1 / 2 .

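Moreover, by C-symmetry, these two probabilities coincide for each u, and replacing

s_{K} (u)

by any other real number cannot increase the smaller of the two. Using the expression of

D_{C T}

in Equation (7), we have that

D_{C T} (\cdot; Γ)

is maximized at K. □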

4.3. Property 3: Monotonicity with Respect to the Center

In the multivariate case [13], this property is understood in an algebraic way, as the convex combinations between the element of maximal depth and another point are considered. As the operations of sum and product by a scalar are defined in

K_{c} (R^{p})

, we can propose the same property.

(P3a.): Let Γ be a compact convex random set and let $K \in K_{c} (R^{p})$ maximize $D (\cdot; Γ)$ . Then,

$D ((1 - λ) \cdot K + λ \cdot L; Γ) \geq D (L; Γ)$

for all $λ \in [0, 1]$ and $L \in K_{c} (R^{p})$ .

Additionally, this property is analogous to property P3a. in the definition of semi-linear depth in the fuzzy setting [15].

In the functional (metric) case, a different property was proposed by Nieto-Reyes and Battey ([16], Property P-3), which explicitly uses the metric in the space. We can see

K_{c} (R^{p})

as a metric space with the Hausdorff metric

d_{H}

. Thus, another possible property is the following.

(P3b.): Let Γ be a compact convex random set, d be a metric in $K_{c} (R^{p})$ , and $K, L, S \in K_{c} (R^{p})$ be three sets such that K maximizes $D (\cdot; Γ)$ and $d (K, S) = d (K, L) + d (L, S)$ . Then,

$D (L; Γ) \geq D (S; Γ) .$

This property is analogous to property P3b. in the definition of geometric depth in the fuzzy setting [15].

For these two possible translations of the multivariate property, we have the following two results.

Proposition 3.

The function

D_{C T}

satisfies P3a.

Proof.

Let

Γ

be a compact convex random set, and let

K, L \in K_{c} (R^{p})

be two sets such that K maximizes

D_{C T} (\cdot; Γ)

. Using the properties of the support function of a set, we have that

s_{(1 - λ) \cdot K + λ \cdot L} (u) = (1 - λ) s_{K} (u) + λ s_{L} (u)

for all

u \in S^{p - 1}

and

λ \in [0, 1]

.

We consider the set

\mathcal{K} = {(u, t) \in S^{p - 1} \times R : (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}} .

Since a convex combination of two real numbers is at most t only if at least one of them is, it can be expressed as

\mathcal{K}_{1} \cup \mathcal{K}_{2} \cup \mathcal{K}_{3}

, where

\begin{matrix} \mathcal{K}_{1} & = {(u, t) \in S^{p - 1} \times R : K \in S_{u, t}^{-}, L \in S_{u, t}^{-}}, \\ \mathcal{K}_{2} & = {(u, t) \in S^{p - 1} \times R : K \in S_{u, t}^{-}, L \notin S_{u, t}^{-}, (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}, \\ \mathcal{K}_{3} & = {(u, t) \in S^{p - 1} \times R : K \notin S_{u, t}^{-}, L \in S_{u, t}^{-}, (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}} . \end{matrix}

It is clear that they are disjoint sets. Thus, we have that

\begin{matrix} inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) & = inf_{(u, t) \in \mathcal{K}} P (Γ \in S_{u, t}^{-}) \\ & = min {inf_{(u, t) \in \mathcal{K}_{1}} P (Γ \in S_{u, t}^{-}), inf_{(u, t) \in \mathcal{K}_{2}} P (Γ \in S_{u, t}^{-}), inf_{(u, t) \in \mathcal{K}_{3}} P (Γ \in S_{u, t}^{-})} . \end{matrix}

(12)

Taking into account that

\mathcal{K}_{1}, \mathcal{K}_{2} \subseteq {(u, t) \in S^{p - 1} \times R : K \in S_{u, t}^{-}}

and

\mathcal{K}_{3} \subseteq {(u, t) \in S^{p - 1} \times R : L \in S_{u, t}^{-}},

it is obtained that

\begin{matrix} inf_{(u, t) \in \mathcal{K}_{1}} P (Γ \in S_{u, t}^{-}) & \geq inf_{\underset{K \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) \geq D_{C T} (K; Γ), \\ inf_{(u, t) \in \mathcal{K}_{2}} P (Γ \in S_{u, t}^{-}) & \geq inf_{\underset{K \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) \geq D_{C T} (K; Γ), \\ inf_{(u, t) \in \mathcal{K}_{3}} P (Γ \in S_{u, t}^{-}) & \geq inf_{\underset{L \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) \geq D_{C T} (L; Γ) . \end{matrix}

(13)

Using (12) and (13) and taking into account that K maximizes

D_{C T}

, we have that

inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) \geq D_{C T} (L; Γ) .

Analogously, we obtain

inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{+}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{+}) \geq D_{C T} (L; Γ) .

Thus,

D_{C T} ((1 - λ) \cdot K + λ \cdot L; Γ) \geq D_{C T} (L; Γ)

, and

D_{C T}

satisfies property P3a. □
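Property P3a. can be observed numerically in dimension p = 1. The sketch below is an illustration with hypothetical names and toy data: the distribution puts mass 3/4 on the interval [1, 2] and mass 1/4 on [2, 7], so that K = [1, 2] has maximal depth 3/4, and the depth is evaluated along the segment joining K with several other intervals L. Exact rational arithmetic is used so that end points are compared without rounding error.

```python
from fractions import Fraction

def support(interval, u):
    """Support function of [a, b] in direction u, with u in {+1, -1}."""
    a, b = interval
    return b if u == 1 else -a

def depth(K, sample):
    """Exact Tukey depth of the interval K w.r.t. the empirical distribution."""
    n = len(sample)
    result = Fraction(1)
    for u in (1, -1):
        s_K = support(K, u)
        below = Fraction(sum(1 for X in sample if support(X, u) <= s_K), n)
        above = Fraction(sum(1 for X in sample if support(X, u) >= s_K), n)
        result = min(result, below, above)
    return result

def convex_combination(lam, K, L):
    """The interval (1 - lam) * K + lam * L for lam in [0, 1]."""
    return ((1 - lam) * K[0] + lam * L[0], (1 - lam) * K[1] + lam * L[1])

sample = [(1, 2)] * 3 + [(2, 7)]     # masses 3/4 and 1/4
K = (1, 2)                           # deepest interval for this distribution
candidates = [(2, 7), (3, 5), (0, 2), (1, 7), (-4, 10)]
violations = [
    (L, lam)
    for L in candidates
    for lam in (Fraction(k, 10) for k in range(11))
    if depth(convex_combination(lam, K, L), sample) < depth(L, sample)
]
```

The list `violations` collects every pair for which the inequality of P3a. would fail; for this example it stays empty, in agreement with Proposition 3.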

Proposition 4.

The function

D_{C T}

does not satisfy P3b with respect to the distance

d_{H}

.

Proof.

The proof is by counterexample. Let

(Ω, \mathcal{P} (Ω), P)

, where Ω = {ω_{1}, ω_{2}} and \mathcal{P} (Ω) is its power set, be a probability space such that

P (ω_{1}) = 3 / 4 and P (ω_{2}) = 1 / 4 .

We consider the compact convex random set

Γ : Ω \to K_{c} (R)

defined by

Γ (ω_{1}) = [1, 2] and Γ (ω_{2}) = [2, 7] .

It is clear that

D_{C T} (Γ (ω_{1}); Γ) = 3 / 4,

and it is the set which maximizes

D_{C T}

. Let us consider

L = [3, 5]

. We have that

5 = d_{H} (Γ (ω_{1}), Γ (ω_{2})) = d_{H} (Γ (ω_{1}), L) + d_{H} (Γ (ω_{2}), L) = 3 + 2 .

Moreover,

D_{C T} (Γ (ω_{2}); Γ) = 1 / 4 and D_{C T} (L; Γ) = P (s_{Γ} (- 1) \leq s_{L} (- 1)) = 0 .

Thus,

D_{C T}

violates property P3b. □
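The numbers in this counterexample are easy to reproduce. In the minimal sketch below (hypothetical function names), the distribution of the random set is encoded as a list with three copies of [1, 2] and one copy of [2, 7], which gives the probabilities 3/4 and 1/4. Because p = 1, the unit sphere is {-1, 1} and both the depth and the Hausdorff distance are computed exactly.

```python
def support(interval, u):
    """Support function of [a, b] in direction u, with u in {+1, -1}."""
    a, b = interval
    return b if u == 1 else -a

def depth(K, outcomes):
    """Tukey depth of K for the uniform distribution on the list `outcomes`."""
    n = len(outcomes)
    result = 1.0
    for u in (1, -1):
        s_K = support(K, u)
        below = sum(1 for X in outcomes if support(X, u) <= s_K) / n
        above = sum(1 for X in outcomes if support(X, u) >= s_K) / n
        result = min(result, below, above)
    return result

def hausdorff(K, L):
    """Hausdorff distance between intervals via formula (2)."""
    return max(abs(support(K, u) - support(L, u)) for u in (1, -1))

outcomes = [(1, 2)] * 3 + [(2, 7)]   # P(omega_1) = 3/4, P(omega_2) = 1/4
K, S, L = (1, 2), (2, 7), (3, 5)     # deepest set, second outcome, set in between

on_a_segment = hausdorff(K, S) == hausdorff(K, L) + hausdorff(L, S)
p3b_holds = depth(L, outcomes) >= depth(S, outcomes)
```

Here `on_a_segment` confirms the metric hypothesis of P3b., while `p3b_holds` is false, since the depth of L is 0 and the depth of the second outcome is 1/4.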

Notice that the Tukey depth may satisfy Property P3b if the distances between the sets are not measured with the Hausdorff metric, e.g., in the

L^{p}

-type metrics introduced by Vitale [32].

4.4. Property 4: Vanishing at Infinity

The property in the multivariate case is understood in a geometrical way, considering a sequence

{x_{n}}_{n}

such that

∥ x_{n} ∥ \to \infty

[13]. We can also consider a sequence

{a + n b}_{n}

with

a, b \in R^{p}

, such that

b \neq 0

, and suppose that the sequence of distances diverges. Thus, in this setting, we also propose two possible properties, the first one from an algebraic point of view and the second one taking into account that the set

K_{c} (R^{p})

can be viewed as a metric space using the Hausdorff distance.

(P4a.): Let Γ be a compact convex random set, and let $K, L \in K_{c} (R^{p})$ be two sets such that K maximizes $D (\cdot; Γ)$ and $L \neq {0}$ . Then,

$lim_{n} D (K + n \cdot L; Γ) = 0 .$
(P4b.): Let Γ be a compact convex random set, d a metric in $K_{c} (R^{p})$ , $K \in K_{c} (R^{p})$ a set that maximizes $D (\cdot; Γ)$ and ${K_{n}}_{n}$ a sequence of elements of $K_{c} (R^{p})$ such that ${lim}_{n} d (K, K_{n}) = \infty$ . Then,

$lim_{n} D (K_{n}; Γ) = 0 .$

Property P4a. parallels the fourth property of the semi-linear depth for fuzzy sets, while P4b. parallels the fourth property of geometric depth for fuzzy sets.

Concerning those properties, we have the following results.

Proposition 5.

The function

D_{C T}

satisfies P4a. and P4b. with respect to the distance

d_{H}

.

The following proposition is used in the proof of Proposition 5 for property P4b.

Proposition 6.

Let

{K_{n}}_{n}

be a sequence of elements of

K_{c} (R^{p})

such that

{lim}_{n} d_{H} (K_{n}, {0}) = \infty

. Then, there exists

u \in S^{p - 1}

such that

lim_{n} s_{K_{n}} (u) = \infty .

Proof.

It is a basic property of the Hausdorff distance that

d_{H} (K_{n}, {0}) = sup {∥ x ∥ : x \in K_{n}}

for all

n \in N

. The function

f_{n} : K_{n} \to R

defined by

f_{n} (x) = ∥ x ∥

is a continuous function defined over a compact convex set, thus it attains its maximum on

K_{n}

, for all

n \in N

. Let us denote by

x_{n}

the point of

K_{n}

where

f_{n}

attains its maximum for every

n \in N

. By hypothesis we have that

lim_{n} ∥ x_{n} ∥ = \infty .

It implies that there exists

u \in S^{p - 1}

such that

lim_{n} ⟨ u, x_{n} ⟩ = \infty .

By definition of the support function of a compact convex set, we have that

⟨ u, x_{n} ⟩ \leq s_{K_{n}} (u) .

Thus,

{lim}_{n} s_{K_{n}} (u) = \infty

. □

Proof of Proposition 5.

Property P4a. Let

L \neq {0}

. There exists

u_{0} \in S^{p - 1}

such that

s_{L} (u_{0}) \neq 0 .

Without loss of generality, we assume

s_{L} (u_{0}) > 0

. Clearly, the sequence

{s_{K} (u_{0}) + n \cdot s_{L} (u_{0})}_{n}

is such that

lim_{n} (s_{K} (u_{0}) + n \cdot s_{L} (u_{0})) = \infty .

We have that

D_{C T} (K + n \cdot L; Γ) \leq P (s_{Γ} (u_{0}) \geq s_{K} (u_{0}) + n \cdot s_{L} (u_{0})) .

If we take limits on both sides

lim_{n \to \infty} D_{C T} (K + n \cdot L; Γ) \leq lim_{n \to \infty} P (s_{Γ} (u_{0}) \geq s_{K} (u_{0}) + n \cdot s_{L} (u_{0})) = 0 .

By the sandwich rule, since the depth is non-negative, we have that

{lim}_{n} D_{C T} (K + n \cdot L; Γ) = 0

.

Property P4b. As the set K is fixed, the condition

lim_{n} d_{H} (K, K_{n}) = \infty

is equivalent to

lim_{n} d_{H} (K_{n}, {0}) = \infty .

From Proposition 6, we have that there exists

u_{0} \in S^{p - 1}

such that

lim_{n} s_{K_{n}} (u_{0}) = \infty .

The rest of the proof is analogous to that of Property P4a. □
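The upper bound used for P4a. can be checked numerically on a toy example. The sketch below assumes axis-aligned boxes in R^2 (whose support function has a simple closed form), a made-up sample of three boxes standing in for Γ, and the hypothetical helper names `support` and `upper_bound`; none of these come from the paper.

```python
from typing import List, Sequence, Tuple

Box = List[Tuple[float, float]]  # one (low, high) interval per coordinate


def support(box: Box, u: Sequence[float]) -> float:
    """Support function s_K(u) = max over x in K of <u, x> for an axis-aligned box."""
    return sum(max(ui * lo, ui * hi) for ui, (lo, hi) in zip(u, box))


def upper_bound(sample: List[Box], K: Box, L: Box, n: int, u0: Sequence[float]) -> float:
    """Empirical version of P(s_Gamma(u0) >= s_K(u0) + n * s_L(u0))."""
    # Support functions are additive: s_{K + n L}(u0) = s_K(u0) + n * s_L(u0).
    threshold = support(K, u0) + n * support(L, u0)
    return sum(support(X, u0) >= threshold for X in sample) / len(sample)


# Made-up sample of three boxes in R^2 (illustration only).
sample = [[(0.0, 1.0), (0.0, 1.0)], [(1.0, 3.0), (0.0, 2.0)], [(2.0, 5.0), (1.0, 4.0)]]
K = [(0.0, 1.0), (0.0, 1.0)]
L = [(0.0, 1.0), (0.0, 0.0)]  # L != {0}, with s_L(u0) = 1 > 0 for u0 = (1, 0)
u0 = (1.0, 0.0)

# The bound is non-increasing in n and reaches 0 once the threshold
# exceeds every sampled value of s_Gamma(u0).
bounds = [upper_bound(sample, K, L, n, u0) for n in range(6)]
```

Because the depth of K + n · L is bounded above by each of these values, it is forced to 0 as n grows.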

4.5. Property 5: Upper Semi-Continuity

This property regards a depth as an upper semi-continuous function at every point of its domain. In the multivariate case it is not considered to be a canonical requirement, but continuity properties are studied in different papers, for instance in [13]. This property is considered in the definition of the depth function for functional (metric) spaces [16]. According to [16], a depth

D,

of a metric space

(E, d)

with respect to a distribution

P

in the space, is upper semi-continuous if, for all

x \in E

and for all

ε > 0

, there exists

δ > 0

such that

sup_{y : d (x, y) < δ} D (y; P) \leq D (x; P) + ε .

This property has not yet been considered in the fuzzy setting.

(P5.): Let Γ be a compact convex random set, and d be a metric defined over $K_{c} (R^{p})$ . The function $D (\cdot; Γ)$ is upper semi-continuous with respect to the distance d in the sense that

$lim sup_{n} D (K_{n}; Γ) \leq D (K; Γ)$

for every set $K \in K_{c} (R^{p})$ and every sequence of sets ${K_{n}}_{n}$ such that $lim_{n} d (K, K_{n}) = 0 .$

Notice that upper semi-continuity implies that the contours of the depth function are closed sets.

Proposition 7.

The function

D_{C T}

satisfies P5. with respect to the distance

d_{H}

.

Proof.

Let

Γ

be a compact convex random set and

K \in K_{c} (R^{p})

be a set, and let

{K_{n}}_{n}

be a sequence of compact convex sets such that

lim_{n \to \infty} d_{H} (K, K_{n}) = 0 .

We need to prove

lim sup_{n} D_{C T} (K_{n}; Γ) \leq D_{C T} (K; Γ) .

From (2),

d_{H} (K, K_{n}) = sup_{u} | s_{K} (u) - s_{K_{n}} (u) |

and then

lim_{n \to \infty} | s_{K} (u) - s_{K_{n}} (u) | = 0

for each

u \in S^{p - 1}

. Thus

lim_{n \to \infty} s_{K_{n}} (u) = s_{K} (u)

for every

u \in S^{p - 1}

. Without loss of generality (the other case is analogous), assume

D_{C T} (K; Γ) = inf_{u} P (s_{K} (u) \leq s_{Γ} (u)) .

Now, we prove that, for all

u \in S^{p - 1}

,

U : = {ω \in Ω : \forall k \in N, \exists n \geq k : ω \in {s_{K_{n}} (u) \leq s_{Γ} (u)}} \subseteq {ω \in Ω : s_{K} (u) \leq s_{Γ (ω)} (u)} .

Let

ω \in U

. There exists a sub-sequence

{K_{n^{'}}}_{n}

of

{K_{n}}_{n}

such that

s_{K_{n^{'}}} (u) \leq s_{Γ (ω)} (u)

for all

n^{'}

. Taking limits,

s_{K} (u) = lim_{n^{'} \to \infty} s_{K_{n^{'}}} (u) \leq s_{Γ (ω)} (u),

therefore

ω \in {ω \in Ω : s_{K} (u) \leq s_{Γ} (u)} .

By definition,

U = {lim sup}_{n} {s_{K_{n}} (u) \leq s_{Γ} (u)}

. Thus

P (s_{K} (u) \leq s_{Γ} (u)) \geq P (\underset{n}{lim sup} {s_{K_{n}} (u) \leq s_{Γ} (u)}) \geq \underset{n \to \infty}{lim sup} P (s_{K_{n}} (u) \leq s_{Γ} (u))

(14)

where the second inequality is due to Fatou’s lemma. Taking the infimum over u on both sides yields

inf_{u} P (s_{K} (u) \leq s_{Γ} (u)) \geq inf_{u} lim sup_{n \to \infty} P (s_{K_{n}} (u) \leq s_{Γ} (u)) .

(15)

Since

lim sup_{n} P (s_{K_{n}} (u) \leq s_{Γ} (u)) = inf_{n} sup_{k \geq n} P (s_{K_{k}} (u) \leq s_{Γ} (u)),

it is clear that

\begin{matrix} inf_{u} inf_{n} sup_{k \geq n} P (s_{K_{k}} (u) \leq s_{Γ} (u)) \geq inf_{n} sup_{k \geq n} inf_{u} P (s_{K_{k}} (u) \leq s_{Γ} (u)) = \\ = lim sup_{n \to \infty} inf_{u} P (s_{K_{n}} (u) \leq s_{Γ} (u)) \geq lim sup_{n \to \infty} D_{C T} (K_{n}; Γ) . \end{matrix}

(16)

From (14)–(16),

D_{C T} (\cdot; Γ)

is upper semi-continuous. □
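The identity d_{H} (K, K_{n}) = sup_{u} | s_{K} (u) - s_{K_{n}} (u) | that drives this proof can be illustrated numerically. The following is a minimal sketch under stated assumptions: sets are axis-aligned boxes in R^2, the supremum is approximated on a finite grid of unit vectors, and K_{n} is K inflated by 1/n in every coordinate. The names `support`, `hausdorff_via_support`, and `inflate` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Box = List[Tuple[float, float]]  # one (low, high) interval per coordinate


def support(box: Box, u: Sequence[float]) -> float:
    """Support function of an axis-aligned box in direction u."""
    return sum(max(ui * lo, ui * hi) for ui, (lo, hi) in zip(u, box))


def hausdorff_via_support(K: Box, L: Box, n_dirs: int = 3600) -> float:
    """Approximate sup_u |s_K(u) - s_L(u)| over a grid of unit vectors in R^2."""
    best = 0.0
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        u = (math.cos(theta), math.sin(theta))
        best = max(best, abs(support(K, u) - support(L, u)))
    return best


def inflate(box: Box, eps: float) -> Box:
    """Enlarge every interval of the box by eps on both sides."""
    return [(lo - eps, hi + eps) for lo, hi in box]


K = [(0.0, 2.0), (1.0, 3.0)]
# K_n = inflate(K, 1/n) converges to K in the Hausdorff metric as n grows.
distances = [hausdorff_via_support(K, inflate(K, 1.0 / n)) for n in (1, 10, 100)]
```

For these boxes the difference of support functions is (|u_1| + |u_2|) / n, so the approximated distances shrink like sqrt(2) / n, and the support functions converge pointwise, as used in the proof.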

4.6. Property 6: Consistency

Another desirable property for depth functions is that the sample version converges to the population counterpart (consistency). This property is a particular case of the weak continuity (as a function of the distribution P) property of the axiomatic functional (metric) notion of depth [16], but it is not part of the axiomatic notions of multivariate and fuzzy depth. However, it is generally studied when an instance of depth function is introduced. To the best of our knowledge, the first time it appeared in the literature for depth functions was in Liu [12].

We propose the following property.

(P6.): Let Γ be a compact convex random set, $D (\cdot; Γ) : K_{c} (R^{p}) \to [0, \infty)$ a function, and $D_{n} (\cdot; Γ) : K_{c} (R^{p}) \to [0, \infty)$ its sample version. Then, D and $D_{n}$ satisfy

$sup_{K \in K_{c} (R^{p})} | D (K; Γ) - D_{n} (K; Γ) | ⟶ 0, a . s . [P] .$

This is a uniform consistency requirement which is satisfied by the Tukey depth, but the uniformity may eventually have to be dropped for other depth functions.

Theorem 2.

The function

D_{C T},

with

D_{C T, n}

in (9), satisfies P6.

Proof.

In terms of measurability, we have that

s_{X_{1}} (u), \dots, s_{X_{n}} (u)

is a random sample of the random variable

s_{Γ} (u)

for all

u \in S^{p - 1}

. Let us fix

K \in K_{c} (R^{p})

. To ease the notation, let us denote

F (s_{K} (u)) : = {P (s_{Γ} (u) \leq s_{K} (u)), P (s_{Γ} (u) \geq s_{K} (u))},

F_{n} (s_{K} (u)) : = {P_{n}^{u} ((- \infty, s_{K} (u)]), P_{n}^{u} ([s_{K} (u), \infty))} .

From (7) and (10) and basic properties of the supremum and infimum functions, we have that

\begin{matrix} | D_{C T} (K; Γ) - D_{C T, n} (K; Γ) | & = | inf_{u \in S^{p - 1}} min F (s_{K} (u)) - inf_{u \in S^{p - 1}} min F_{n} (s_{K} (u)) | \\ \leq sup_{u \in S^{p - 1}} | min F (s_{K} (u)) - min F_{n} (s_{K} (u)) | . \end{matrix}

Step 1. Setting

\begin{matrix} F^{+} (t, u) & : = P (s_{Γ} (u) \leq t) \\ F^{-} (t, u) & : = P (s_{Γ} (u) \geq t) \\ F_{n}^{+} (t, u) & : = P_{n}^{u} ((- \infty, t]) \\ F_{n}^{-} (t, u) & : = P_{n}^{u} ([t, \infty)) \end{matrix}

and applying these basic properties again, we obtain

\begin{matrix} | D_{C T} (K; Γ) - D_{C T, n} (K; Γ) | \leq & sup_{u \in S^{p - 1}} max {| F^{+} (s_{K} (u), u) - F_{n}^{+} (s_{K} (u), u) |, \\ | F^{-} (s_{K} (u), u) - F_{n}^{-} (s_{K} (u), u) |} . \end{matrix}

Then

\begin{matrix} sup_{K \in K_{c} (R^{p})} & | D_{C T} (K; Γ) - D_{C T, n} (K; Γ) | \leq \\ \leq sup_{K \in K_{c} (R^{p})} sup_{u \in S^{p - 1}} max {| F^{+} (s_{K} (u), u) - F_{n}^{+} (s_{K} (u), u) |, \\ | F^{-} (s_{K} (u), u) - F_{n}^{-} (s_{K} (u), u) |} \\ \leq sup_{u \in S^{p - 1}} sup_{t \in R} max {| F^{+} (t, u) - F_{n}^{+} (t, u) |, | F^{-} (t, u) - F_{n}^{-} (t, u) |} . \end{matrix}

The Dvoretzky–Kiefer–Wolfowitz inequality ([33], Corollary 1) gives, for each

u \in S^{p - 1}

and

ε > 0

,

P (sup_{t \in R} | F^{+} (t, u) - F_{n}^{+} (t, u) | > ε) \leq 2 exp {- 2 ε^{2} n}

and there easily follows

P (sup_{t \in R} | F^{-} (t, u) - F_{n}^{-} (t, u) | > ε) \leq 2 exp {- 2 ε^{2} n} .

Since the bound is independent of u, that implies

P (sup_{u \in S^{p - 1}} sup_{t \in R} max {| F^{+} (t, u) - F_{n}^{+} (t, u) |, | F^{-} (t, u) - F_{n}^{-} (t, u) |} > ε) \leq 4 exp {- 2 ε^{2} n}

which, by the arbitrariness of

ε

, establishes

sup_{u \in S^{p - 1}} sup_{t \in R} max {| F^{+} (t, u) - F_{n}^{+} (t, u) |, | F^{-} (t, u) - F_{n}^{-} (t, u) |} \to 0

in probability.

Step 2. To prove almost sure convergence, we rewrite the supremum in terms of an empirical process. Taking

F = {ϕ_{t, u}^{+}, ϕ_{t, u}^{-} ∣ (t, u) \in R \times S^{p - 1}},

where

ϕ_{t, u}^{+}, ϕ_{t, u}^{-} : Ω \to R

are given by

ϕ_{t, u}^{+} (ω) = I_{(- \infty, t]} (s_{Γ (ω)} (u)), ϕ_{t, u}^{-} (ω) = I_{[t, \infty)} (s_{Γ (ω)} (u)),

we have

sup_{u \in S^{p - 1}} sup_{t \in R} max {| F^{+} (t, u) - F_{n}^{+} (t, u) |, | F^{-} (t, u) - F_{n}^{-} (t, u) |} = sup_{ϕ \in F} | E_{P_{n}} (ϕ) - E_{P} (ϕ) |,

where

P_{n}

is the empirical distribution. From ([34], Corollary 3.7.9), the above supremum converges to 0 almost surely because it does so in probability (which was proved in Step 1), and the family

F

has a P-integrable measurable envelope, which is obvious since all functions in

F

take on values in

[0, 1]

. Accordingly, also

sup_{K \in K_{c} (R^{p})} | D_{C T} (K; Γ) - D_{C T, n} (K; Γ) | ⟶ 0, a . s . [P] .

□
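The Dvoretzky–Kiefer–Wolfowitz bound used in Step 1 can be explored by simulation. The sketch below is an illustration only: it replaces the real random variable s_{Γ} (u), for one fixed u, by a uniform variable on [0, 1] (an assumption made purely so that F is known in closed form), and the helper names `sup_deviation` and `exceedance_frequency` are hypothetical.

```python
import math
import random
from typing import List


def sup_deviation(sample: List[float]) -> float:
    """sup_t |F_n(t) - t| for a sample assumed to come from the uniform law on [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum is attained at a sample point, just before or at a jump of F_n.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def exceedance_frequency(n: int, eps: float, reps: int, seed: int = 0) -> float:
    """Fraction of simulated samples of size n whose sup deviation exceeds eps."""
    rng = random.Random(seed)
    hits = sum(
        sup_deviation([rng.random() for _ in range(n)]) > eps for _ in range(reps)
    )
    return hits / reps


n, eps = 200, 0.1
dkw_bound = 2.0 * math.exp(-2.0 * eps ** 2 * n)  # right-hand side of the inequality
frequency = exceedance_frequency(n, eps, reps=500)  # Monte Carlo estimate of the left-hand side
```

Comparing `frequency` with `dkw_bound` for growing n gives a feel for the exponential rate that drives the convergence in probability.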

4.7. Property 7: Convexity of the Contours

This property is not part of any of the existing axiomatic notions of statistical depth. However, it has been commonly studied in the literature since it first appeared in Donoho and Gasko [35]. In addition, Serfling [36], which focuses on multivariate properties, lists it as a desirable property.

The set

K_{c} (R^{p})

is endowed with the operations of sum and product by a scalar. Thus, given

U \subseteq K_{c} (R^{p})

, we can say that U is a convex set if

(1 - λ) \cdot K + λ \cdot L \in U

for every pair of sets

K, L \in U

and for all

λ \in [0, 1]

. We propose the following property.

(P7.): Let Γ be a compact convex random set and $D (\cdot; Γ) : K_{c} (R^{p}) \to [0, \infty)$ a function. Then, the set

$D_{α} : = {K \in K_{c} (R^{p}) : D (K; Γ) \geq α} \subseteq K_{c} (R^{p})$

is convex for every $α \in [0, 1]$ .

The next result states that the function

D_{C T}

satisfies the above property, that is, the

α

-contours of

D_{C T}

are convex subsets of

K_{c} (R^{p})

.

Theorem 3.

The function

D_{C T}

satisfies P7.

Proof.

Let us fix

α \in [0, 1]

,

K, L \in D_{α}

, and

λ \in [0, 1]

. The aim is to prove

(1 - λ) \cdot K + λ \cdot L \in D_{α} .

For that, we follow the same idea as in the proof of Proposition 3. By the definition of the Tukey depth,

D_{C T} ((1 - λ) \cdot K + λ \cdot L; Γ) = min {inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}), inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{+}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{+})} .

We now prove that

inf_{\underset{(1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}{u \in S^{p - 1}, t \in R :}} P (Γ \in S_{u, t}^{-}) \geq α .

As in the proof of Proposition 3, we define the following sets

\begin{matrix} K & : = {(u, t) \in S^{p - 1} \times R : (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}, \\ K_{1} & : = {(u, t) \in S^{p - 1} \times R : K \in S_{u, t}^{-}, L \in S_{u, t}^{-}}, \\ K_{2} & : = {(u, t) \in S^{p - 1} \times R : K \in S_{u, t}^{-}, L \notin S_{u, t}^{-}, (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}}, \\ K_{3} & : = {(u, t) \in S^{p - 1} \times R : K \notin S_{u, t}^{-}, L \in S_{u, t}^{-}, (1 - λ) \cdot K + λ \cdot L \in S_{u, t}^{-}} . \end{matrix}

It is clear that

inf_{(u, t) \in K} P (Γ \in S_{u, t}^{-}) = min {inf_{(u, t) \in K_{1}} P (Γ \in S_{u, t}^{-}), inf_{(u, t) \in K_{2}} P (Γ \in S_{u, t}^{-}), inf_{(u, t) \in K_{3}} P (Γ \in S_{u, t}^{-})} .

Taking into account (13) and the fact that

D_{C T} (K; Γ), D_{C T} (L; Γ) \geq α

, we have that

inf_{(u, t) \in K_{i}} P (Γ \in S_{u, t}^{-}) \geq α

for every

i \in {1, 2, 3}

. The case with

S_{u, t}^{+}

is conducted analogously. Thus,

D_{C T} ((1 - λ) \cdot K + λ \cdot L; Γ) \geq α

and

D_{C T} {(\cdot; Γ)}_{α}

is a convex set. □
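An alternative way to see the same conclusion, given here only as a sketch in terms of support functions and not as a transcription of the argument above, uses the expression of the depth as the infimum over u of the minimum of P (s_{Γ} (u) \leq s_{K} (u)) and P (s_{Γ} (u) \geq s_{K} (u)), as in the proof of Theorem 2. Support functions are additive and positively homogeneous, so the value for the convex combination lies between those for K and L, and both probabilities are monotone in the threshold:

```latex
\begin{align*}
s_{(1-\lambda)\cdot K + \lambda\cdot L}(u)
  &= (1-\lambda)\, s_K(u) + \lambda\, s_L(u)
   \in \bigl[\min\{s_K(u), s_L(u)\},\ \max\{s_K(u), s_L(u)\}\bigr], \\
P\bigl(s_\Gamma(u) \le s_{(1-\lambda)\cdot K + \lambda\cdot L}(u)\bigr)
  &\ge \min\bigl\{P\bigl(s_\Gamma(u) \le s_K(u)\bigr),\ P\bigl(s_\Gamma(u) \le s_L(u)\bigr)\bigr\} \ge \alpha, \\
P\bigl(s_\Gamma(u) \ge s_{(1-\lambda)\cdot K + \lambda\cdot L}(u)\bigr)
  &\ge \min\bigl\{P\bigl(s_\Gamma(u) \ge s_K(u)\bigr),\ P\bigl(s_\Gamma(u) \ge s_L(u)\bigr)\bigr\} \ge \alpha .
\end{align*}
```

Taking the infimum over u in the last two lines bounds the depth of the convex combination below by α.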

5. Real-Data Application

There are many examples of real interval-valued data. We comment here on some examples that are present in different fields of science where the elements of the dataset are in

K_{c} (R^{p})

with

p > 1 .

One of these examples is the Greek wines dataset [37], a real dataset with elements in the space

K_{c} (R^{24}) \times R^{7} .

There, measures of some properties of Greek wines are studied. They include interval-valued variables, such as the mineral ion concentration, the phenol concentrations, or the anthocyanin concentration, and numerical values, such as astringency, sweetness, or acidity.

Another example of compact and convex random sets is about measures related to some tree species [38]. In particular, the maximum and minimum values of the volume of the trunk and of the height of the tree species are measured. Thus, the resulting data are rectangles in

R^{2} .

A third dataset is of compact convex square data related to unemployment in Portugal [39]. It contains measurements of the unemployment period and the period of activity before unemployment for some individuals.

The rest of this section is dedicated to computing the Tukey depth of a real dataset made of compact convex sets in

R^{3},

studying the elements of minimum and maximum depth, and comparing this last one with the Aumann mean and the trimmed Aumann mean.

5.1. Dataset

The dataset studied in what follows is a cardiology dataset comprised of three-dimensional cuboids with the ranges over a day of pulse rate, systolic blood pressure, and diastolic blood pressure of 59 patients. It was collected in 1997 by the Nephrology Unit of the Hospital Valle del Nalón in Langreo, Spain, and it has been applied before in the literature, see, for instance, [40]. For the sake of illustration, the dataset is graphically represented in Figure 1, and part of it is included in Table 1.

Figure 1. Representation of the cardiology three-dimensional cuboid dataset. The x-axis represents, for each patient, the range of the blood pulse over a day, the y-axis the range of the systolic blood pressure over the same day, and the z-axis the range of the diastolic blood pressure over the same day. There are a total of 59 patients, with one cuboid per patient.

Table 1. Cardiology three-dimensional cuboid dataset for some patients. Columns 2 and 6, named Pulse, contain the range of blood pulse over a day for each patient, labelled by an identification number (ID) in columns 1 and 5. Columns 3 and 7, named Systolic, provide the range of systolic blood pressure over the same day per patient. Columns 4 and 8, named Diastolic, display the range of diastolic blood pressure over the same day per patient.

From Table 1 we can observe that the dataset consists of 59 rectangular cuboids, in

R^{3}

; one per patient. We denote each cuboid by

C_{i} : = [m P_{i}, M P_{i}] \times [m S_{i}, M S_{i}] \times [m D_{i}, M D_{i}]

for

i = 1, \dots, 59 .

There,

$[m P_{i}, M P_{i}]$ denotes the range of blood pulse over a day of patient $i,$ with $m P_{i}$ being the minimal value and $M P_{i}$ the largest,
$[m S_{i}, M S_{i}]$ the range of systolic blood pressure over the same day of patient i and
$[m D_{i}, M D_{i}]$ the same but for diastolic blood pressure.

As observable from Table 1,

C_{1} = [58, 90] \times [118, 173] \times [63, 102]

(17)

for instance. Each cuboid

C_{i}

is also represented by its eight vertices, which are points in

R^{3} .

With the above notation, these vertices are

\begin{matrix} (m P_{i}, m S_{i}, m D_{i}), (m P_{i}, m S_{i}, M D_{i}), (m P_{i}, M S_{i}, m D_{i}), (m P_{i}, M S_{i}, M D_{i}), \\ (M P_{i}, m S_{i}, m D_{i}), (M P_{i}, m S_{i}, M D_{i}), (M P_{i}, M S_{i}, m D_{i}) and (M P_{i}, M S_{i}, M D_{i}) . \end{matrix}
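The eight vertices are the Cartesian product of the three pairs of endpoints. A minimal sketch, using the cuboid C_{1} from (17) and a hypothetical helper name `cuboid_vertices`:

```python
from itertools import product
from typing import List, Tuple

Interval = Tuple[float, float]  # (minimum, maximum) of one variable


def cuboid_vertices(
    pulse: Interval, systolic: Interval, diastolic: Interval
) -> List[Tuple[float, float, float]]:
    """Return the eight vertices of a cuboid, in the order listed in the text."""
    return list(product(pulse, systolic, diastolic))


# C_1 = [58, 90] x [118, 173] x [63, 102], as given in (17).
c1_vertices = cuboid_vertices((58, 90), (118, 173), (63, 102))
```

The first vertex produced is (58, 118, 63) and the last is (90, 173, 102), matching the ordering from (m P_{i}, m S_{i}, m D_{i}) to (M P_{i}, M S_{i}, M D_{i}).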

5.2. Tukey Depth Computation

Let us denote by

T

the compact convex random set corresponding to the empirical distribution of

{C_{i}}_{i = 1}^{59}

; that is, each cuboid has the probability given by its relative frequency in the dataset, in our case

1 / 59

. Additionally, let us denote using

V_{1}, \dots, V_{8}

the multivariate random variables corresponding to the empirical distribution associated with

\begin{matrix} {(m P_{i}, m S_{i}, m D_{i})}_{i = 1}^{59}, {(m P_{i}, m S_{i}, M D_{i})}_{i = 1}^{59}, {(m P_{i}, M S_{i}, m D_{i})}_{i = 1}^{59}, \\ {(m P_{i}, M S_{i}, M D_{i})}_{i = 1}^{59}, {(M P_{i}, m S_{i}, m D_{i})}_{i = 1}^{59}, {(M P_{i}, m S_{i}, M D_{i})}_{i = 1}^{59}, \\ {(M P_{i}, M S_{i}, m D_{i})}_{i = 1}^{59} and {(M P_{i}, M S_{i}, M D_{i})}_{i = 1}^{59}, respectively . \end{matrix}

To compute the Tukey depth of each cuboid in the dataset, it suffices to calculate the minimum of the multivariate Tukey depth in

R^{3}

of each vertex of the cuboid. Thus, given a cuboid

C_{i}

, its Tukey depth with respect to

T

is

\begin{matrix} D_{C T} (C_{i}; T) = min { & H D ((m P_{i}, m S_{i}, m D_{i}); V_{1}), H D ((m P_{i}, m S_{i}, M D_{i}); V_{2}), \\ H D ((m P_{i}, M S_{i}, m D_{i}); V_{3}), H D ((m P_{i}, M S_{i}, M D_{i}); V_{4}), \\ H D ((M P_{i}, m S_{i}, m D_{i}); V_{5}), H D ((M P_{i}, m S_{i}, M D_{i}); V_{6}), \\ H D ((M P_{i}, M S_{i}, m D_{i}); V_{7}), H D ((M P_{i}, M S_{i}, M D_{i}); V_{8})}, \end{matrix}

where

H D (x; V)

denotes the multivariate halfspace depth of

x \in R^{3}

with respect to

V .
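The vertex-wise formula above can be sketched in code. This is not the computation used for Table 2: as a simplifying assumption, the multivariate halfspace depth is approximated using only a small fixed set of integer directions, which yields an upper bound on the exact depth. The cuboids are made up, and all function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]
Cuboid = Tuple[Interval, Interval, Interval]
Point = Tuple[int, ...]

# Finite set of directions: all nonzero integer vectors with entries in {-2, ..., 2}.
# Halfspace depth depends only on the direction of u, so no normalization is needed.
DIRECTIONS: List[Point] = [u for u in product(range(-2, 3), repeat=3) if any(u)]


def vertices(c: Cuboid) -> List[Point]:
    """Eight vertices of a cuboid, in the order used in the text."""
    return list(product(*c))


def dot(u: Point, x: Point) -> int:
    return sum(a * b for a, b in zip(u, x))


def halfspace_depth(x: Point, cloud: Sequence[Point]) -> float:
    """Upper approximation of HD(x; cloud) restricted to DIRECTIONS."""
    n = len(cloud)
    depth = 1.0
    for u in DIRECTIONS:
        t = dot(u, x)
        # DIRECTIONS contains -u whenever it contains u, so one closed side suffices.
        depth = min(depth, sum(dot(u, p) >= t for p in cloud) / n)
    return depth


def cuboid_depths(cuboids: Sequence[Cuboid]) -> List[float]:
    """Minimum over the eight vertex types of the depth of each vertex in its own cloud."""
    vs = [vertices(c) for c in cuboids]
    return [
        min(halfspace_depth(vs[i][k], [v[k] for v in vs]) for k in range(8))
        for i in range(len(cuboids))
    ]


# Made-up cuboids (pulse, systolic, diastolic); not taken from the cardiology dataset.
toy = [
    ((55, 90), (110, 170), (60, 100)),
    ((50, 95), (105, 180), (55, 110)),
    ((60, 85), (115, 165), (65, 95)),
    ((45, 100), (100, 190), (50, 115)),
    ((58, 88), (112, 172), (62, 102)),
]
toy_depths = cuboid_depths(toy)
```

Every value is at least 1/n, because each vertex belongs to its own cloud. Adding more directions can only lower the approximation toward the exact depth.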

Table 2 provides the obtained depth values for each element in the dataset, that is, the values

{D_{C T} (C_{i}; T)}_{i = 1}^{59}

. Taking into account these values, we have that the element

C_{1}

in (17) has the maximum depth, that is, it is the deepest one, while the elements in the following set have minimum depth

\begin{matrix} { & C_{2}, C_{3}, C_{4}, C_{6}, C_{9}, C_{10}, C_{12}, C_{13}, C_{15}, C_{17}, C_{19}, C_{20}, C_{23}, C_{24}, C_{25}, C_{27}, C_{28}, C_{29}, \\ C_{30}, C_{31}, C_{34}, C_{35}, C_{38}, C_{39}, C_{40}, C_{41}, C_{42}, C_{44}, C_{49}, C_{50}, C_{51}, C_{53}, C_{55}, C_{56}, C_{58}, C_{59}} . \end{matrix}

(18)

Table 2. Tukey depth value of each element in the cardiology three-dimensional cuboid dataset.

To display this information, Figure 2 represents the sets of maximum and minimum depth. In particular, the left panel of the figure represents the five deepest cuboids, with the deepest one colored in red. Meanwhile, the right panel of the figure represents the sets with minimum depth, that is, those in (18), in blue. In addition, the right panel of the figure also displays

C_{1}

, the cuboid with maximum depth, in red. This is done to visualize that the ordering given by the Tukey depth is natural, and the element

C_{1}

is the deepest set with respect to the cloud of cuboids.

Figure 2. Representation of the sets with maximum and minimum depths. The left panel represents the five sets of maximum depth with the deepest one,

C_{1},

in red. The right panel represents the sets with minimum depth, in (18), and again the set

C_{1}

in red.

One may think that it is possible to compute the Tukey depth of each cuboid by considering the variables Pulse, Systolic, and Diastolic separately. Let

P, S

, and

D

denote the compact convex random sets corresponding to the empirical distribution associated with

{[m P_{i}, M P_{i}]}_{i = 1}^{59}, {[m S_{i}, M S_{i}]}_{i = 1}^{59} and {[m D_{i}, M D_{i}]}_{i = 1}^{59},

respectively. Additionally, let us denote by

m P, M P, m S, M S, m D and M D

the real random variables corresponding to the empirical distribution associated with

{m P_{i}}_{i = 1}^{59}, {M P_{i}}_{i = 1}^{59}, {m S_{i}}_{i = 1}^{59}, {M S_{i}}_{i = 1}^{59}, {m D_{i}}_{i = 1}^{59}, and {M D_{i}}_{i = 1}^{59},

respectively. Given an index

i \in {1, \dots, 59}

, the Tukey depths of the i-th interval element with respect to

P, S

, and

D

are

\begin{matrix} D_{C T} ([m P_{i}, M P_{i}]; P) = min {H D (m P_{i}; m P), H D (M P_{i}; M P)}, \\ D_{C T} ([m S_{i}, M S_{i}]; S) = min {H D (m S_{i}; m S), H D (M S_{i}; M S)}, and \\ D_{C T} ([m D_{i}, M D_{i}]; D) = min {H D (m D_{i}; m D), H D (M D_{i}; M D)} . \end{matrix}
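In one dimension these formulas reduce to counting. A minimal sketch, assuming the empirical univariate halfspace depth HD (x; X) = min {P (X \leq x), P (X \geq x)} and made-up pulse intervals; the names `hd_univariate` and `interval_depths` are hypothetical.

```python
from typing import List, Sequence, Tuple


def hd_univariate(x: float, values: Sequence[float]) -> float:
    """Empirical univariate halfspace depth: min of P(X <= x) and P(X >= x)."""
    n = len(values)
    return min(sum(v <= x for v in values), sum(v >= x for v in values)) / n


def interval_depths(intervals: Sequence[Tuple[float, float]]) -> List[float]:
    """Depth of each interval: min of the depths of its lower and upper endpoints."""
    lows = [lo for lo, _ in intervals]
    highs = [hi for _, hi in intervals]
    return [
        min(hd_univariate(lo, lows), hd_univariate(hi, highs)) for lo, hi in intervals
    ]


# Made-up pulse ranges for five patients (illustration only).
toy_pulse = [(50, 85), (55, 90), (58, 92), (60, 100), (70, 95)]
pulse_depths = interval_depths(toy_pulse)
```

In this toy sample the third interval is the deepest, since both of its endpoints are medians of their respective samples, while the fourth one is shallow because its upper endpoint is the largest.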

The element with the maximum depth with respect to

P

is the 48-th element, which has a depth value of

0.08474

with respect to

T

. The elements with maximum depths with respect to

S

and

D

are the 28-th and 19-th element, respectively, which have minimum depth values with respect to

T

. Thus, it is clear that we must consider all three variables simultaneously.

The calculation of the Tukey depth breaks the dataset into an outer layer of 36 patients with depth

1 / 59

, which envelopes an inner core of 23 patients with higher depth. The depth value

1 / 59

means that, taking the support function in a certain direction in

R^{3}

, the point is separated from the remainder of the data. Since each direction represents a linear combination of all three variables, there is some combination of weights for the variables which distinguishes that patient from all others. That suggests that many different patterns of behavior between the three variables are within the ordinary.

5.3. Aumann Mean

We first compute the Aumann mean,

{\hat{μ}}_{A}

, for the complete dataset. The Aumann mean is a generalization of the real-valued mean that works for compact convex sets. We then compare it with the Aumann mean of the dataset after removing the cuboids with minimum depth,

{\hat{μ}}_{t A} .

The Aumann mean of the complete dataset is

\begin{matrix} {\hat{μ}}_{A} = & [\frac{1}{59} \sum_{i = 1}^{59} m P_{i}, \frac{1}{59} \sum_{i = 1}^{59} M P_{i}] \times [\frac{1}{59} \sum_{i = 1}^{59} m S_{i}, \frac{1}{59} \sum_{i = 1}^{59} M S_{i}] \times [\frac{1}{59} \sum_{i = 1}^{59} m D_{i}, \frac{1}{59} \sum_{i = 1}^{59} M D_{i}] \\ = & [53.97, 95.07] \times [111.83, 181.58] \times [58.64, 108.25] . \end{matrix}

When we consider the inner core of the dataset by removing the set of cuboids with minimal depth (set in Equation (18)), the Aumann mean becomes

{\hat{μ}}_{t A} = [54.34783, 91.47826] \times [112.2174, 178.4348] \times [59.82609, 107.4783] .
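For cuboids, both means reduce to coordinate-wise averages of the interval endpoints. The sketch below uses made-up cuboids and depth values, and the hypothetical names `aumann_mean` and `trimmed_aumann_mean`; trimming is implemented as in the text, by discarding the elements of minimal depth.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Cuboid = Tuple[Interval, ...]


def aumann_mean(cuboids: Sequence[Cuboid]) -> List[Interval]:
    """Aumann mean of cuboids: average the lower and upper endpoints per coordinate."""
    if not cuboids:
        raise ValueError("at least one cuboid is required")
    n = len(cuboids)
    dims = len(cuboids[0])
    return [
        (sum(c[j][0] for c in cuboids) / n, sum(c[j][1] for c in cuboids) / n)
        for j in range(dims)
    ]


def trimmed_aumann_mean(cuboids: Sequence[Cuboid], depths: Sequence[float]) -> List[Interval]:
    """Aumann mean after removing the cuboids whose depth equals the minimum depth."""
    cutoff = min(depths)
    kept = [c for c, d in zip(cuboids, depths) if d > cutoff]
    return aumann_mean(kept)  # raises ValueError if every cuboid has minimal depth


# Made-up cuboids and depth values (illustration only).
toy = [
    ((50.0, 90.0), (110.0, 170.0), (60.0, 100.0)),
    ((60.0, 100.0), (120.0, 180.0), (70.0, 110.0)),
    ((40.0, 80.0), (100.0, 160.0), (50.0, 90.0)),
]
toy_depths = [2 / 3, 1 / 3, 1 / 3]
full_mean = aumann_mean(toy)
core_mean = trimmed_aumann_mean(toy, toy_depths)
```

Here `full_mean` averages all three cuboids, whereas `core_mean` keeps only the first one, which is the only element with depth above the minimum.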

This is conceptually similar to a trimmed mean (but trims more than half of the sample). The mean values are very similar, meaning that data in the outer layer have a similar average behavior to those in the inner core, and their outlier nature exerts little influence. In that situation, one would expect the deepest point to be close to those means, and indeed the maximal depth in the sample is reached at

C_{1} = [58, 90] \times [118, 173] \times [63, 102]

, which is also very similar albeit the intervals are a bit narrower.

We have that both means,

{\hat{μ}}_{A}

and

{\hat{μ}}_{t A},

have similar values in every variable. This can be explained by the fact that some linear combination between the elements with minimal depth exists that distinguishes them from the rest of the dataset, but this does not affect the mean. Note that the set with maximal depth,

C_{1} = [58, 90] \times [118, 173] \times [63, 102]

, is also very similar to the above means.

6. Discussion

Considering the properties studied in the literature for depth functions, we propose nine different properties for depth functions with respect to compact convex random sets. They are:

P1. Affine invariance,
P2. Maximality at the center of symmetry,
P3a. Monotonicity with respect to the center in an algebraic way,
P3b. Monotonicity with respect to the center in relation to the associated distance (in a geometric way),
P4a. Vanishing at infinity in an algebraic way,
P4b. Vanishing at infinity in a geometric way,
P5. Upper semi-continuity,
P6. Consistency, and
P7. Convexity of the contours.

It is clear that all of them are desirable properties for a depth function of compact convex sets. However, not all of them have to be part of an axiomatic definition. For instance, it seems appropriate to have either P3a. and P4a. or P3b. and P4b. At the same time, P7., although important, does not belong to any of the existing axiomatic definitions, and P5. and a general case of P6. only belong to the functional (metric) axiomatic definition of statistical depth.

Taking all of this into account, we propose to consider:

the algebraic depth of compact convex sets, when properties P1., P2., P3a., and P4a. are satisfied;
the restricted algebraic depth of compact convex sets, when properties P1., P2., P3a., P4a., P5., P6., and P7. are satisfied;
the geometric depth of compact convex sets, when properties P1., P2., P3b., and P4b. are satisfied; and
the restricted geometric depth of compact convex sets, when properties P1., P2., P3b., P4b., P5., P6., and P7. are satisfied.

Note that the algebraic depth can be considered to be an adaptation of the notions of multivariate depth and of semi-linear fuzzy depth. Meanwhile, the geometric depth can be seen as a conversion of the geometric fuzzy depth and the restricted geometric depth as a modification of the functional (metric) depth.

We have studied whether the above properties are satisfied by the Tukey depth of compact convex sets, which is an adaptation to this setting of the multivariate Tukey depth and a simplification of the Tukey depth for fuzzy sets. It turns out that this depth function satisfies all of these properties except P3b., for which we have provided a counterexample. Thus, the Tukey depth of compact convex sets is a restricted algebraic depth and, in particular, an algebraic depth. However, it is not a geometric depth, and, consequently, neither is it a restricted geometric depth.

Cascos et al. [24] proposed a notion of depth for random closed sets. They require properties P1, P5 (for the Fell topology instead of the Hausdorff metric), and the property that a degenerate random set should assign depth 1 to its only value and 0 to any other random set. Admitting unbounded sets as values leads to some defining properties of depth being hard to adapt; a situation they solve by opting for a minimal list of properties. It is worth mentioning that, in the case of compact convex values, convergence in the Fell topology and in the Hausdorff metric are equivalent ([41], Corollary 3A). Hence, both upper semi-continuity requirements are equivalent for the Tukey depth, and Proposition 7 provides a proof of upper semi-continuity with respect to the Fell topology. Such a proof is missing in [24] on the grounds of it being ‘easy’ (a direct proof without invoking extra facts does not seem to be that easy).

Author Contributions

Writing—original draft preparation, L.G.-D.L.F., A.N.-R., and P.T.; supervision, A.N.-R.; funding acquisition, L.G.-D.L.F. and A.N.-R. All authors have read and agreed to the published version of the manuscript.

Funding

For L.G.-D.L.F. and A.N.-R., this research was supported by grant MTM2017-86061-C2-2-P funded by MCIN/AEI/10.13039/501100011033 and “ERDF A way of making Europe”. P.T. was supported by the Ministerio de Economía y Competitividad grant MTM2015-63971-P, the Ministerio de Ciencia, Innovación y Universidades grant PID2019-104486GB-I00, and the Consejería de Empleo, Industria y Turismo del Principado de Asturias grant GRUPIN-IDI2018-000132.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The studied dataset is available at http://bellman.ciencias.uniovi.es/SMIRE/Hospital.html.

Acknowledgments

We are grateful to the SMIRE–CODIRE group for making their cardiology dataset publicly available on their website.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gil, M.Á.; Lubiano, M.A.; Montenegro, M.; López, M.T. Least squares fitting of an affine function and strength of association for interval-valued data. Metrika 2002, 56, 97–111.
2. de Lima Neto, E.A.; de Carvalho, F.A.T. Nonlinear regression applied to interval-valued data. Patt. Anal. Appl. 2017, 20, 809–824.
3. Molchanov, I. Theory of Random Sets, 2nd ed.; Springer: London, UK, 2017.
4. Artstein, Z.; Vitale, R.A. A strong law of large numbers for random compact convex sets. Ann. Probab. 1975, 3, 879–882.
5. González-Rodríguez, G.; Blanco, A.; Corral, N.; Colubi, A. Least squares estimation of linear regression models for convex compact random sets. Adv. Data Anal. Classif. 2007, 1, 67–81.
6. Sinova, B.; Casals, M.R.; Colubi, A.; Gil, M.Á. The median of a random interval. In Combining Soft Computing and Statistical Methods in Data Analysis; Springer: Berlin/Heidelberg, Germany, 2010; pp. 575–583.
7. Richey, J.; Sarkar, A. Intersections of random sets. J. Appl. Probab. 2022, 59, 131–151.
8. Shi, P.; Lu, L.; Fan, X.; Xin, Y.; Ni, J. A novel underwater sonar image enhancement algorithm based on approximation spaces of random sets. Multimed. Tools. Appl. 2022, 81, 4569–4584.
9. Jörnsten, R. Clustering and classification based on the L1 data depth. J. Multivar. Anal. 2004, 90, 67–89.
10. Nieto-Reyes, A.; Battey, H.; Francisci, G. Functional Symmetry and Statistical Depth for the Analysis of Movement Patterns in Alzheimer’s Patients. Mathematics 2021, 9, 820.
11. Nieto-Reyes, A.; Duque, R.; Francisci, G. A Method to Automate the Prediction of Student Academic Performance from Early Stages of the Course. Mathematics 2021, 9, 2677.
12. Liu, R.Y. On a notion of data depth based on random simplices. Ann. Stat. 1990, 18, 405–414.
13. Zuo, Y.; Serfling, R. General notions of statistical depth function. Ann. Stat. 2000, 28, 461–482.
14. Nieto-Reyes, A.; Battey, H. A topologically valid construction of depth for functional data. J. Multivar. Anal. 2021, 184, 104738.
15. González-De La Fuente, L.; Nieto-Reyes, A.; Terán, P. Statistical depth for fuzzy sets. Fuzzy Sets Syst. 2022, 443 Pt A, 58–86.
16. Nieto-Reyes, A.; Battey, H. A topologically valid definition of depth for functional data. Stat. Sci. 2016, 31, 61–79.
17. González-De La Fuente, L.; Nieto-Reyes, A.; Terán, P. Two notions of depth in the fuzzy setting. In Building Bridges between Soft and Statistical Methodologies for Data Science; García-Escudero, L., Gordaliza, A., Mayo, A., Gomez, M.A.L., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer Cham: Berlin/Heidelberg, Germany, 2023; to appear.
18. Tukey, J.W. Mathematics and Picturing Data. In Proceedings of the International Congress of Mathematicians, Vancouver, BC, Canada, 21–29 August 1974; Canadian Mathematical Congress: Montreal, QC, Canada, 1975; pp. 523–531.
19. Serfling, R. A depth function and a scale curve based on spatial quantiles. In Statistical Data Analysis Based on L₁-norm and Related Methods; Dodge, Y., Ed.; Birkhäuser: Basel, Switzerland, 2002; pp. 25–38.
20. Cuesta-Albertos, J.A.; Nieto-Reyes, A. The random Tukey depth. Comput. Stat. Data Anal. 2008, 52, 4979–4988.
21. Chakraborty, A.; Chaudhuri, P. The spatial distribution in infinite dimensional spaces and related quantiles and depths. Ann. Stat. 2014, 42, 1203–1231.
22. Cuesta-Albertos, J.A.; Nieto-Reyes, A. Functional classification and the random Tukey depth. Practical issues. In Combining Soft Computing and Statistical Methods in Data Analysis; Borgelt, C., González-Rodríguez, G., Trutschnig, W., Lubiano, M.A., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 77, pp. 123–130.
23. González-De La Fuente, L.; Nieto-Reyes, A.; Terán, P. Tukey depth for fuzzy sets. In Building Bridges between Soft and Statistical Methodologies for Data Science; García-Escudero, L., Gordaliza, A., Mayo, A., Gomez, M.A.L., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer Cham: Berlin/Heidelberg, Germany, 2023; to appear.
24. Cascos, I.; Li, Q.; Molchanov, I. Depth and outliers for samples of sets and random sets distributions. Aust. N. Z. Stat. 2021, 63, 55–82.
25. Matheron, G. Random Sets and Integral Geometry; Wiley: New York, NY, USA, 1975.
26. Himmelberg, C. Measurable relations. Fund. Math. 1974, 87, 53–72.
27. Bonnesen, T.; Fenchel, W. Theorie der Konvexen Körper; Chelsea: New York, NY, USA, 1948.
28. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 1. Inform. Sci. 1975, 8, 199–249.
29. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 2. Inform. Sci. 1975, 8, 301–353.
30. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 3. Inform. Sci. 1975, 9, 43–80.
31. Gruber, P.M.; Lettl, G. Isometries of the Space of Convex Bodies in Euclidean Space. Bull. Lond. Math. Soc. 1980, 12, 455–462.
32. Vitale, R.A. L_p metrics for compact, convex sets. J. Approx. Theory 1985, 45, 280–287.
33. Massart, P. The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab. 1990, 18, 1269–1283.
34. Giné, E.; Nickl, R. Mathematical Foundations of Infinite-Dimensional Statistical Models; Cambridge University Press: Cambridge, UK, 2016.
35. Donoho, D.L.; Gasko, M. Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Stat. 1992, 20, 1803–1827.
36. Serfling, R. Depth Functions in Nonparametric Multivariate Inference. In Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications; DIMACS Series in Discrete Mathematics and Theoretical Computer Science; AMS: New Brunswick, NJ, USA, 2006.
37. Kallithraka, S.; Arvanitoyannis, I.; Kefalas, P.; El-Zajouli, A.; Soufleros, E.; Psarra, E. Instrumental and sensory analysis of Greek wines; implementation of principal component analysis (PCA) for classification according to geographical origin. Food Chem. 2001, 73, 501–514.
38. da Silva, J.A.A.; Cordeiro, G.M.; Ferreira, R.L.C. Modeling the growth of eucalyptus clones using the Chapman-Richards model with different symmetrical error distributions. Ciência Florest. 2012, 22, 777–785.
39. Dias, S.; Brito, P. Off the beaten track: A new linear model for interval data. Eur. J. Oper. Res. 2017, 258, 1118–1130.
40. Lubiano, M.A. Medidas de Variación de Elementos Aleatorios Imprecisos. Ph.D. Thesis, University of Oviedo, Oviedo, Spain, 1999.
41. Salinetti, G.; Wets, R.J.B. On the convergence of sequences of convex sets in finite dimensions. SIAM Rev. 1979, 21, 18–33.

Figure 1. Representation of the cardiology three-dimensional cuboid dataset. For each patient, the x-axis represents the range of the pulse over a day, the y-axis the range of the systolic blood pressure over the same day, and the z-axis the range of the diastolic blood pressure over the same day. There are 59 patients in total, with one cuboid per patient.

Figure 2. Representation of the sets with maximum and minimum depths. The left panel represents the five sets of maximum depth, with the deepest one, $C_{1}$, in red. The right panel represents the sets with minimum depth, in (18), and again the set $C_{1}$ in red.

Table 1. Cardiology three-dimensional cuboid dataset for some patients. Columns 2 and 6, named Pulse, contain the range of blood pulse over a day for each patient, labelled by an identification number (ID) in columns 1 and 5. Columns 3 and 7, named Systolic, provide the range of systolic blood pressure over the same day per patient. Columns 4 and 8, named Diastolic, display the range of diastolic blood pressure over the same day per patient.

ID	Pulse	Systolic	Diastolic	ID	Pulse	Systolic	Diastolic
1	58–90	118–173	63–102	31	52–78	119–212	47–93
2	47–68	104–161	71–118	32	55–84	122–178	73–105
…	…	…	…	…	…	…	…
28	71–121	113–176	57–95	58	56–97	92–173	45–107
29	68–91	114–186	46–103	59	37–86	83–140	45–91
30	62–100	145–210	100–136

Table 2. Tukey depth value of each element in the cardiology three-dimensional cuboid dataset.

ID	Depth Value	ID	Depth Value
1	0.15254	31	0.01694
2	0.01694	32	0.06779
3	0.01694	33	0.05084
4	0.01694	34	0.01694
5	0.03389	35	0.01694
6	0.01694	36	0.03389
7	0.08474	37	0.03389
8	0.03389	38	0.01694
9	0.01694	39	0.01694
10	0.01694	40	0.01694
11	0.05084	41	0.01694
12	0.01694	42	0.01694
13	0.01694	43	0.03389
14	0.13559	44	0.01694
15	0.01694	45	0.08474
16	0.03389	46	0.10169
17	0.01694	47	0.10169
18	0.03389	48	0.08474
19	0.01694	49	0.01694
20	0.01694	50	0.01694
21	0.03389	51	0.01694
22	0.03389	52	0.10169
23	0.01694	53	0.01694
24	0.01694	54	0.03389
25	0.01694	55	0.01694
26	0.10169	56	0.01694
27	0.01694	57	0.03389
28	0.01694	58	0.01694
29	0.01694	59	0.01694
30	0.01694
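
As a reading aid for Table 2, the printed depth values appear to be fractions of the form $k/59$ truncated to five decimals, which is what one would expect if the sample depth is a proportion of the 59 observations. The following minimal sketch checks this arithmetic for the distinct values in the table. The helper names `truncate` and `matching_count` are hypothetical and introduced only for illustration; the truncation convention is an assumption inferred from the printed digits, not the authors' stated procedure.

```python
import math

# Sample size, as stated in the caption of Figure 1 (59 patients).
N_PATIENTS = 59

# Distinct depth values as printed in Table 2.
reported = [0.01694, 0.03389, 0.05084, 0.06779,
            0.08474, 0.10169, 0.13559, 0.15254]


def truncate(x, digits=5):
    """Truncate (not round) x to the given number of decimals."""
    factor = 10 ** digits
    return math.floor(x * factor) / factor


def matching_count(value, n=N_PATIENTS):
    """Return k such that k/n, truncated to 5 decimals, equals value.

    Returns None if no such k exists (hypothetical helper, assumption only).
    """
    for k in range(n + 1):
        if abs(truncate(k / n) - value) < 1e-9:
            return k
    return None


# Map each printed depth value to the count k it corresponds to.
counts = {v: matching_count(v) for v in reported}

for value, k in counts.items():
    print(f"{value:.5f} = {k}/{N_PATIENTS} (truncated)")
```

Under this reading, the smallest printed value, 0.01694, corresponds to $1/59$ and the largest, 0.15254 (patient 1), to $9/59$.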

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

