Article

Properties of Statistical Depth with Respect to Compact Convex Random Sets: The Tukey Depth

by Luis González-De La Fuente 1, Alicia Nieto-Reyes 1,*,† and Pedro Terán 2
1 Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, 39005 Santander, Spain
2 Departamento de Estadística e Investigación Operativa y Didáctica de las Matemáticas, Universidad de Oviedo, 33007 Oviedo, Spain
* Author to whom correspondence should be addressed.
† Current address: Facultad de Ciencias, Avd. de los Castros s/n, 39005 Santander, Spain.
Mathematics 2022, 10(15), 2758; https://doi.org/10.3390/math10152758
Submission received: 27 May 2022 / Revised: 27 July 2022 / Accepted: 28 July 2022 / Published: 3 August 2022

Abstract

We study a statistical data depth with respect to compact convex random sets, which is consistent with the multivariate Tukey depth and the Tukey depth for fuzzy sets. In addition, it provides a different perspective to the existing halfspace depth with respect to compact convex random sets. In studying this depth function, we provide a series of properties for the statistical data depth with respect to compact convex random sets. These properties are an adaptation of properties that constitute the axiomatic notions of multivariate, functional, and fuzzy depth-functions and other well-known properties of depth.

1. Introduction

In some real cases, statistical data appear in the form of sets, for instance, in the form of compact convex sets. Examples can be found in datasets related to health, such as the range of blood pressure over a day [1], or related to sport measures, such as the range of weights and heights of a soccer team [2]. Because these data are compact convex sets, it is possible to exploit useful properties of such sets, for instance the existence of support functions. This type of statistical data is studied by the theory of random sets, which, from a statistical point of view, models observed phenomena that are sets rather than points in $\mathbb{R}^p$, as in multivariate statistics, or functions, as in functional data analysis. Thus, a random set is a generalization of a random variable: it is a set-valued random variable. A random set can also be understood as a simplification of a fuzzy random variable, as the $\alpha$-levels of a fuzzy set are nested compact sets. The literature about random sets contains well-established theoretical results [3], some of which are generalizations to random sets of classical statistical results, for instance, the strong law of large numbers [4]. Statistical methods have also been developed for compact convex random sets, such as linear regression methods [5] or the median of a random interval [6]. Recent literature also includes theoretical results, such as results about the intersection of random sets [7], and applications, such as underwater sonar images [8].
Statistical depth functions have become a very useful tool in non-parametric statistics. Nowadays, depth functions are applied in different fields of statistics, such as clustering and classification [9] or real data analysis [10,11]. Given a distribution $P$ in a space, a depth function, $D(\cdot; P)$, orders the elements in the space with respect to $P$. Roughly speaking, statistical depth functions measure how close an element is to a data cloud, in the sense that, if we move the element to the center of the cloud, its depth increases, and, if we move it out of the center, its depth decreases. Assuming it is unique, this center is the center of symmetry if the distribution is symmetric for a particular notion of symmetry. For multivariate spaces, there are notions of symmetry widely used in the literature: central, angular [12], and halfspace symmetry [13]. Notions of symmetry specific for functional [14] and fuzzy spaces [15] are, however, quite recent.
Formally, an axiomatic definition of the depth function for the multivariate case was proposed by Zuo and Serfling [13]. According to this definition, a depth function, $D(\cdot; P)$, satisfies the following properties. To introduce them, let $X$ be a random variable with distribution $P$ on $\mathbb{R}^n$, $\mathcal{M}_{n \times n}(\mathbb{R})$ be the space of $n \times n$ matrices with entries in $\mathbb{R}$, and $\|\cdot\|$ be the Euclidean norm. Abusing the notation, we write $D(\cdot; X)$ and $D(\cdot; P)$ interchangeably.
M1. Affine invariance. A depth function does not depend on the coordinate system, that is, for any non-singular $M \in \mathcal{M}_{n \times n}(\mathbb{R})$ and $b \in \mathbb{R}^n$, $D(x; X) = D(Mx + b; MX + b)$.
M2. Maximality at center. If the distribution $P$ has a uniquely defined center of symmetry, for a certain notion of symmetry, $D(\cdot; X)$ is maximized at it.
M3. Monotonicity relative to the deepest point. Let $x_0 \in \mathbb{R}^n$ be a point of maximal depth. Then, for any $x \in \mathbb{R}^n$, $D((1 - \lambda) x_0 + \lambda x; X) \geq D(x; X)$ for all $\lambda \in [0, 1]$.
M4. Vanishing at infinity. $D(x; X) \to 0$ as $\|x\| \to \infty$.
Formal axiomatic definitions of a depth function were later provided in the functional [16] and fuzzy settings [15,17].
The first instance of a depth function was proposed prior to the axiomatic definitions. It is the Tukey depth, proposed in 1975 by Tukey [18] for multivariate data, which is still the best-known depth function. It is also known as halfspace depth, as it computes the infimum of the probabilities of the closed halfspaces that contain the point at which the depth function is evaluated. That is:
$$HD(x; P) := \inf \{ P(H) : H \text{ is a closed halfspace and } x \in H \}. \tag{1}$$
Zuo and Serfling [13] proved that $HD$ satisfies M1–M4, and, therefore, it is a statistical depth function. We emphasize the satisfaction of the axioms because it is customary in the statistical depth community not to treat the axioms as a strict cut-off: a function may be regarded as a depth function even when not all the axioms are satisfied in their entirety.
Since Tukey coined the term in 1975, many other instances of depth functions have been proposed, and their use in statistics has grown considerably. Some commonly used depth functions are the simplicial depth, proposed by Liu [12]; the spatial depth, proposed by Serfling [19]; and the random Tukey depth, proposed by Cuesta-Albertos and Nieto-Reyes [20], which, being based on random projections, is a computationally effective approximation of the Tukey depth. The spatial and random Tukey depth functions can be applied in both multivariate and functional spaces [21,22]. However, the random Tukey depth does not satisfy the axiomatic definition of a functional depth [16], which so far only the metric depth [14] has been proven to satisfy. It is worth noting that the spatial and random Tukey depth functions were introduced before the functional axiomatic definition in [16]. Furthermore, while the Tukey depth has not yet been defined in functional spaces, it has been generalized to the fuzzy setting and proved to satisfy the axiomatic definitions in that setting [15,23].
The aim of this paper is to propose some desirable properties of depth with respect to compact convex random sets, which can be considered to be an axiomatic definition for this setting. Some of these properties are an adaptation for compact convex sets of those proposed in González-De La Fuente et al. [15] for fuzzy data. The properties are also largely inspired by the multivariate definition [13] and, in addition, by the functional one [16], because the set of compact convex sets can be considered to be a metric space by using the Hausdorff distance, for instance. In order to test the viability of those properties, with a generalization of halfspaces suitable for the space of compact convex sets, we present an adaptation of Tukey depth and show that almost all of them are satisfied. These definitions of halfspace and Tukey depth can be regarded as stemming naturally from their corresponding multivariate definitions and, in addition, are a particular case of their fuzzy analogs [15]. Furthermore, we show that the definition of Tukey depth with respect to compact convex random sets coincides with that derived recently in Cascos et al. [24], which does not make an explicit use of halfspaces in its definition. The advantage of using our proposal is that it helps in proving some desirable properties of the Tukey depth, for instance the monotonicity relative to the deepest point (see proof of Proposition 3). In addition, it is clear that our proposal is a natural generalization of the multivariate halfspace depth, because it generalizes the concept of halfspace to the set of subsets of $\mathbb{R}^p$. Moreover, we also show that the Tukey depth, with respect to compact convex random sets, can be rewritten in terms of the multivariate halfspace depth of the support function of compact convex sets.
The paper is organized as follows. The background about compact convex random sets is contained in Section 2. The definition of the Tukey depth with respect to compact convex random sets is in Section 3, together with its relationships with and equivalences to other definitions. Section 4 presents and studies the properties of depth with respect to compact convex random sets and their satisfaction by the Tukey depth with respect to compact convex random sets. Section 5 includes a real-data analysis of compact convex sets in $\mathbb{R}^3$. The paper concludes with some final remarks in Section 6.

2. Preliminaries on Compact Convex Random Sets

Let $\mathcal{K}_c(\mathbb{R}^p)$ denote the set of non-empty compact convex subsets of $\mathbb{R}^p$. In the case $p = 1$, the elements of $\mathcal{K}_c(\mathbb{R})$ are intervals of the form $[a, b]$ with $a \leq b$. For any $K \in \mathcal{K}_c(\mathbb{R}^p)$, its support function $s_K : \mathbb{S}^{p-1} \to \mathbb{R}$ is defined by
$$s_K(u) := \sup_{k \in K} \langle k, u \rangle,$$
where $\langle \cdot, \cdot \rangle$ denotes the usual dot product, $\mathbb{S}^{p-1} := \{ x \in \mathbb{R}^p : \|x\| = 1 \}$ is the unit sphere, and $\|\cdot\|$ is the Euclidean norm.
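As an illustration of the support function, the following Python sketch evaluates $s_K(u)$ for a polytope given as the convex hull of finitely many points, in which case the supremum is attained at one of those points. The helper name `support` and the unit-square data are assumptions made for this example only.

```python
import math

def support(vertices, u):
    """s_K(u) for K = conv(vertices): largest inner product <k, u> over the vertices."""
    return max(sum(ki * ui for ki, ui in zip(k, u)) for k in vertices)

# Illustrative data (hypothetical): the unit square [0, 1]^2 given by its vertices.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

e1 = (1.0, 0.0)
diag = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))  # unit vector on the sphere S^1

s_e1 = support(square, e1)      # attained at the vertices with first coordinate 1
s_diag = support(square, diag)  # attained at the vertex (1, 1)
print(s_e1, s_diag)
```

For a general compact convex set that is not a polytope, the maximum over vertices would have to be replaced by an optimization over the whole set.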
Let $(\Omega, \mathcal{A}, P)$ be a probability space. A map
$$\Gamma : \Omega \to \mathcal{K}_c(\mathbb{R}^p)$$
is called a compact convex random set if
$$\{ \omega \in \Omega : \Gamma(\omega) \cap K \neq \emptyset \} \in \mathcal{A}$$
for all $K \in \mathcal{K}_c(\mathbb{R}^p)$ [25]. Himmelberg [26] proved the Fundamental Measurability Theorem, which is useful to prove that $s_\Gamma(u)$ is a real random variable for all $u \in \mathbb{S}^{p-1}$. As in the Euclidean space, in $\mathcal{K}_c(\mathbb{R}^p)$ there exists a predominant distance, the Hausdorff metric. The Hausdorff distance between $K \in \mathcal{K}_c(\mathbb{R}^p)$ and $L \in \mathcal{K}_c(\mathbb{R}^p)$ is
$$d_H(K, L) := \max \Big\{ \sup_{k \in K} \inf_{l \in L} \|k - l\|, \ \sup_{l \in L} \inf_{k \in K} \|k - l\| \Big\},$$
which can be expressed in terms of their support functions (e.g., [27]) as
$$d_H(K, L) = \sup_{u \in \mathbb{S}^{p-1}} | s_K(u) - s_L(u) |. \tag{2}$$
The Borel measurability with respect to $d_H$ is equivalent to the above-mentioned definition of compact convex random sets.
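The agreement between the two expressions of the Hausdorff distance can be checked by hand in the case $p = 1$, where the unit sphere reduces to the two directions $-1$ and $+1$. The sketch below does so for intervals; the function names and the test intervals are assumptions introduced for this illustration.

```python
def directed_distance(K, L):
    """sup over k in K of the distance from k to L, for intervals K = (a, b), L = (c, d)."""
    c, d = L

    def dist_to_L(x):
        return max(c - x, 0.0, x - d)

    # The distance to an interval is convex in x, so the supremum over K is at an endpoint.
    return max(dist_to_L(K[0]), dist_to_L(K[1]))

def hausdorff_direct(K, L):
    """Hausdorff distance from the point-to-set definition."""
    return max(directed_distance(K, L), directed_distance(L, K))

def support_interval(K, u):
    """Support function of K = (a, b) in direction u in {-1, +1}."""
    a, b = K
    return b if u == 1 else -a

def hausdorff_support(K, L):
    """Hausdorff distance as the largest gap between support functions."""
    return max(abs(support_interval(K, u) - support_interval(L, u)) for u in (-1, 1))

K, L = (1.0, 2.0), (3.0, 5.0)
print(hausdorff_direct(K, L), hausdorff_support(K, L))
```

In one dimension both expressions reduce to the larger of the two endpoint gaps, $\max\{|a - c|, |b - d|\}$.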
Some properties of the support functions of the elements of $\mathcal{K}_c(\mathbb{R}^p)$ can be deduced from the properties of the supremum function. For instance, let $K, L \in \mathcal{K}_c(\mathbb{R}^p)$. Taking into account that
$$K + L = \{ k + l : k \in K, \ l \in L \} \in \mathcal{K}_c(\mathbb{R}^p),$$
we can conclude that the support function of $K + L$ can be expressed as the sum of the support functions of $K$ and $L$, that is,
$$s_{K + L}(u) = s_K(u) + s_L(u)$$
for all $u \in \mathbb{S}^{p-1}$. It is also possible to define the product of $K$ by a scalar $\gamma \in \mathbb{R}^+$, as
$$\gamma \cdot K = \{ \gamma k : k \in K \}.$$
Then, it is clear that
$$s_{\gamma \cdot K}(u) = \gamma \cdot s_K(u)$$
for all $u \in \mathbb{S}^{p-1}$.
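The additivity and positive homogeneity of the support function can be verified numerically for polytopes. The sketch below assumes each set is given by a finite list of generating points; the names `minkowski_sum` and `scale`, the sets, and the directions are hypothetical choices for this illustration.

```python
import math

def support(vertices, u):
    """Support function of conv(vertices) in direction u."""
    return max(sum(ki * ui for ki, ui in zip(k, u)) for k in vertices)

def minkowski_sum(K, L):
    """All pairwise sums of generating points; their convex hull is K + L."""
    return [tuple(ki + li for ki, li in zip(k, l)) for k in K for l in L]

def scale(gamma, K):
    """Generating points of gamma * K for a scalar gamma >= 0."""
    return [tuple(gamma * ki for ki in k) for k in K]

triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
segment = [(-1.0, -1.0), (1.0, 1.0)]
directions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.6, 0.8), (-0.8, 0.6)]  # unit vectors

checks = []
for u in directions:
    additive = math.isclose(
        support(minkowski_sum(triangle, segment), u),
        support(triangle, u) + support(segment, u),
        abs_tol=1e-12,
    )
    homogeneous = math.isclose(
        support(scale(3.0, triangle), u), 3.0 * support(triangle, u), abs_tol=1e-12
    )
    checks.append(additive and homogeneous)
print(all(checks))
```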

3. Halfspaces and Halfspace Depth in $\mathcal{K}_c(\mathbb{R}^p)$

As can be seen from (1), the Tukey depth of a multivariate point $x$ is the infimum of the probabilities of the halfspaces that contain $x$. However, $\mathcal{K}_c(\mathbb{R}^p)$ is not a linear space. In this section, we define generalized halfspaces (simply called halfspaces in the sequel) for $\mathcal{K}_c(\mathbb{R}^p)$ in a natural way from the multivariate case.
Let $S$ be a halfspace of $\mathbb{R}^n$. Then, $v \in \mathbb{R}^n$ and $b \in \mathbb{R}$ exist, such that
$$S = \{ y \in \mathbb{R}^n : v^T y \leq b \}.$$
Taking $u = (1 / \|v\|) \, v \in \mathbb{S}^{p-1}$ and $c = b / \|v\|$, it is clear that
$$S = \{ y \in \mathbb{R}^n : u^T y \leq c \}.$$
Thus, the halfspaces of $\mathbb{R}^n$ can be viewed as subsets $S_{u,c} \subseteq \mathbb{R}^n$, such that
$$S_{u,c} = \{ y \in \mathbb{R}^n : u^T y \leq c \}$$
with $u \in \mathbb{S}^{p-1}$ and $c \in \mathbb{R}$. This generalizes naturally to $\mathcal{K}_c(\mathbb{R}^p)$ by using the support function of a set. Thus, we define halfspaces $S^-_{u,t}, S^+_{u,t} \subseteq \mathcal{K}_c(\mathbb{R}^p)$ as
$$S^-_{u,t} := \{ K \in \mathcal{K}_c(\mathbb{R}^p) : s_K(u) \leq t \}, \tag{3}$$
$$S^+_{u,t} := \{ K \in \mathcal{K}_c(\mathbb{R}^p) : s_K(u) \geq t \}, \tag{4}$$
for all $u \in \mathbb{S}^{p-1}$ and $t \in \mathbb{R}$. We explicitly consider both halfspaces because
$$-s_K(-u) = \inf_{k \in K} \langle u, k \rangle \leq s_K(u),$$
with
$$S^+_{u,t} \neq S^-_{-u,-t}, \qquad S^-_{u,t} \neq S^+_{-u,-t},$$
for all $u \in \mathbb{S}^{p-1}$ and $t \in \mathbb{R}$.
Making use of both directions of the inequality that defines the halfspaces, the Tukey depth with respect to a compact convex random set can be defined. Let $\Gamma$ be a compact convex random set. The Tukey depth of $K \in \mathcal{K}_c(\mathbb{R}^p)$ with respect to $\Gamma$ is defined by the function
$$D_{CT}(\cdot; \Gamma) : \mathcal{K}_c(\mathbb{R}^p) \to [0, 1]$$
given by
$$D_{CT}(K; \Gamma) := \min \Big\{ \inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, K \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}), \ \inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, K \in S^+_{u,t}} P(\Gamma \in S^+_{u,t}) \Big\}. \tag{5}$$
We refer to it interchangeably as the Tukey depth for compact convex random sets or the Tukey depth with respect to compact convex random sets. It is worth noting that (5) is a particularization for compact convex sets of the Tukey depth for fuzzy sets proposed in [15]; likewise, (3) and (4) are particularizations of the fuzzy halfspaces proposed there.
In what follows, we operate on (5) to show it coincides with the definition of halfspace depth with respect to compact convex random sets provided in Cascos et al. [24], which does not explicitly use halfspaces. From (3), $K \in S^-_{u,t}$ means that $(u, t)$ is a pair such that $s_K(u) \leq t$. Thus,
$$S^-_{u, s_K(u)} \subseteq S^-_{u,t}$$
and, consequently,
$$P(\Gamma \in S^-_{u, s_K(u)}) \leq P(\Gamma \in S^-_{u,t}).$$
Analogously, from (4),
$$P(\Gamma \in S^+_{u, s_K(u)}) \leq P(\Gamma \in S^+_{u,t}).$$
Taking the infimum in (5), we can express $D_{CT}$ as
$$D_{CT}(K; \Gamma) = \min \Big\{ \inf_{u \in \mathbb{S}^{p-1}} P(\Gamma \in S^-_{u, s_K(u)}), \ \inf_{u \in \mathbb{S}^{p-1}} P(\Gamma \in S^+_{u, s_K(u)}) \Big\}.$$
Making use of the definition of the halfspaces in (3) and (4), we have
$$D_{CT}(K; \Gamma) = \min \Big\{ \inf_{u \in \mathbb{S}^{p-1}} P(s_\Gamma(u) \leq s_K(u)), \ \inf_{u \in \mathbb{S}^{p-1}} P(s_\Gamma(u) \geq s_K(u)) \Big\}, \tag{6}$$
which coincides with the definition of the halfspace depth proposed by Cascos et al. [24].
Interchanging the minimum and infimum in (6),
$$D_{CT}(K; \Gamma) = \inf_{u \in \mathbb{S}^{p-1}} \min \big\{ P(s_\Gamma(u) \leq s_K(u)), \ P(s_\Gamma(u) \geq s_K(u)) \big\}. \tag{7}$$
Then, taking into account (1), we can express the Tukey depth for compact convex random sets in terms of the multivariate halfspace depth in the following way
$$D_{CT}(K; \Gamma) = \inf_{u \in \mathbb{S}^{p-1}} HD(s_K(u); s_\Gamma(u)). \tag{8}$$

Sample Halfspace Depth

We define the sample version $D_{CT,n}$ of the Tukey depth for compact convex sets. Let
$$\Gamma : \Omega \to \mathcal{K}_c(\mathbb{R}^p)$$
be a compact convex random set associated with the probability space $(\Omega, \mathcal{A}, P)$ and $X_1, \ldots, X_n$ independent random sets distributed as $\Gamma$. We define the sample version of the Tukey depth $D_{CT,n}$ as
$$D_{CT,n}(K; \Gamma) := \min \Big\{ \inf_{u \in \mathbb{S}^{p-1}} P_n^u((-\infty, s_K(u)]), \ \inf_{u \in \mathbb{S}^{p-1}} P_n^u([s_K(u), \infty)) \Big\}, \tag{9}$$
for every $K \in \mathcal{K}_c(\mathbb{R}^p)$, where
$$P_n^u((-\infty, x]) = \frac{1}{n} \sum_{i=1}^{n} I\big(s_{X_i}(u) \in (-\infty, x]\big), \qquad P_n^u([x, \infty)) = \frac{1}{n} \sum_{i=1}^{n} I\big(s_{X_i}(u) \in [x, \infty)\big),$$
for all $u \in \mathbb{S}^{p-1}$ and $x \in \mathbb{R}$. The function $D_{CT,n}$ coincides with the sample version of the halfspace depth proposed by Cascos et al. [24]. Interchanging the minimum and infimum in (9), we also have that
$$D_{CT,n}(K; \Gamma) = \inf_{u \in \mathbb{S}^{p-1}} \min \big\{ P_n^u((-\infty, s_K(u)]), \ P_n^u([s_K(u), \infty)) \big\}. \tag{10}$$
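A minimal sketch of the sample depth follows, assuming every set is a polytope described by finitely many generating points and that the infimum over the sphere is replaced by a minimum over a user-supplied finite list of directions. For $p = 1$ the two directions $\pm 1$ give the exact value; for $p \geq 2$ a finite list only yields an upper bound on $D_{CT,n}$. The function names and the sample of intervals are hypothetical.

```python
def support(vertices, u):
    """Support function of conv(vertices) in direction u."""
    return max(sum(ki * ui for ki, ui in zip(k, u)) for k in vertices)

def sample_tukey_depth(K, sample, directions):
    """Minimum over the given directions of the two empirical tail proportions."""
    n = len(sample)
    depth = 1.0
    for u in directions:
        s_K = support(K, u)
        projections = [support(X, u) for X in sample]
        lower = sum(s <= s_K for s in projections) / n  # P_n^u((-inf, s_K(u)])
        upper = sum(s >= s_K for s in projections) / n  # P_n^u([s_K(u), inf))
        depth = min(depth, lower, upper)
    return depth

# Hypothetical sample of four intervals in R, each given by its two endpoints.
sample = [
    [(0.0,), (1.0,)],
    [(1.0,), (3.0,)],
    [(2.0,), (4.0,)],
    [(0.0,), (5.0,)],
]
directions = [(-1.0,), (1.0,)]  # the whole unit sphere when p = 1

depth_K = sample_tukey_depth([(1.0,), (3.0,)], sample, directions)
depth_far = sample_tukey_depth([(10.0,), (11.0,)], sample, directions)
print(depth_K, depth_far)
```

An interval lying entirely to the right of the sample receives depth zero, as expected of an outlying observation.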

4. Properties of Depth for Compact Convex Sets

In this section, we propose some desirable properties for the depth for compact convex sets. They are mainly based on the properties that constitute the notion of the depth function for multivariate spaces [13], for functional (metric) spaces [16], and for the fuzzy setting [15]. Furthermore, we study whether $D_{CT}$ satisfies them.
Some of these properties parallel the ones considered in [15], and, in certain cases, they follow for a random set $\Gamma$ by applying the corresponding property in [15] to the indicator function $I_\Gamma$. However, this reduction is simplest precisely for the properties whose direct proof is already very simple, so little is gained by it. In the longer proofs, additional arguments are needed, due, for instance, to the subtlety that the deepest point in the (larger) space of fuzzy sets might conceivably be deeper than the deepest non-fuzzy set. Therefore, the properties referring to deepest points are parallel in wording but might potentially have different content. It can be proved that this does not actually happen, but we also found that direct proofs make the paper more self-contained. Thus, we opted for proofs which do not require the reader to be familiar with the specifics of fuzzy sets, by adapting the arguments in [15]. Still, some other properties in this section were not considered in [15].

4.1. Property 1: Affine Invariance

We focus on property M1 of the multivariate case reported in the introduction. In the case of $\mathcal{K}_c(\mathbb{R}^p)$, the product of $M \in \mathcal{M}_{n \times n}(\mathbb{R})$ and $K \in \mathcal{K}_c(\mathbb{R}^p)$ is defined as the compact convex set
$$M \cdot K = \{ M \cdot k : k \in K \}. \tag{11}$$
The affine invariance property that we propose is the following.
(P1.) Let $\Gamma$ be a compact convex random set and $D(\cdot; \Gamma) : \mathcal{K}_c(\mathbb{R}^p) \to [0, \infty)$ a function. Then,
$$D(M \cdot K + L; M \cdot \Gamma + L) = D(K; \Gamma),$$
for every non-singular matrix $M \in \mathcal{M}_{n \times n}(\mathbb{R})$ and any $K, L \in \mathcal{K}_c(\mathbb{R}^p)$.
Thus, this property is analogous to the multivariate case. The property in the fuzzy case is different only in that we need Zadeh's extension principle [28,29,30] to apply a matrix to a fuzzy set. The property for functional data also differs, since [16] demands isometry invariance. However, note that, in this context, affine invariance actually implies isometry invariance, since, as a result of Gruber and Lettl [31], all isometries of $\mathcal{K}_c(\mathbb{R}^p)$ are of the form $K \mapsto M \cdot K + L$ with $M$ orthogonal.
Proposition 1.
The function $D_{CT}$ satisfies P1.
The following lemma (cf. [15], Proposition 8.2) is used to prove Proposition 1.
Lemma 1.
Let $K \in \mathcal{K}_c(\mathbb{R}^p)$ and let $M \in \mathcal{M}_{n \times n}(\mathbb{R})$ be a non-singular matrix. Then,
$$s_{M \cdot K}(u) = \|M^T \cdot u\| \cdot s_K\Big( \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u \Big)$$
for all $u \in \mathbb{S}^{p-1}$.
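Proof.
Taking into account (11), it is clear that
$$s_{M \cdot K}(u) = \sup_{v \in M \cdot K} \langle u, v \rangle = \sup_{k \in K} \langle u, M \cdot k \rangle = \sup_{k \in K} \langle M^T \cdot u, k \rangle,$$
for any $u \in \mathbb{S}^{p-1}$. In general, $M^T \cdot u$ does not belong to $\mathbb{S}^{p-1}$. Thus, normalizing it, we have that
$$s_{M \cdot K}(u) = \sup_{k \in K} \Big\langle \|M^T \cdot u\| \cdot \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u, k \Big\rangle = \|M^T \cdot u\| \cdot \sup_{k \in K} \Big\langle \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u, k \Big\rangle = \|M^T \cdot u\| \cdot s_K\Big( \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u \Big).$$
□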
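The identity in Lemma 1 can be checked numerically for a polytope, since $M \cdot K$ is then generated by the images of the generating points of $K$. The following sketch uses a hypothetical non-singular $2 \times 2$ matrix and a triangle; all names and data are assumptions for illustration.

```python
import math

def support(vertices, u):
    """Support function of conv(vertices) in direction u."""
    return max(sum(ki * ui for ki, ui in zip(k, u)) for k in vertices)

def mat_vec(M, v):
    """Matrix-vector product M v."""
    return tuple(sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M)))

def transpose(M):
    return [list(col) for col in zip(*M)]

M = [[2.0, 1.0], [0.0, 3.0]]              # non-singular: determinant 6
K = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]  # triangle
MK = [mat_vec(M, k) for k in K]           # generating points of M·K

max_gap = 0.0
for j in range(12):
    angle = 2.0 * math.pi * j / 12
    u = (math.cos(angle), math.sin(angle))
    w = mat_vec(transpose(M), u)          # M^T u, in general not a unit vector
    norm = math.hypot(w[0], w[1])
    lhs = support(MK, u)
    rhs = norm * support(K, (w[0] / norm, w[1] / norm))
    max_gap = max(max_gap, abs(lhs - rhs))
print(max_gap)  # discrepancy at floating-point rounding level
```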
It is clear that, if $M \in \mathcal{M}_{n \times n}(\mathbb{R})$ is a non-singular matrix, the map
$$f : \mathbb{S}^{p-1} \to \mathbb{S}^{p-1}$$
defined by
$$f(u) = \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u$$
is bijective. We make use of this to prove Proposition 1.
Proof of Proposition 1.
Using the properties of the support function and Lemma 1, we obtain
$$s_{M \cdot K + L}(u) = \|M^T \cdot u\| \cdot s_K\Big( \frac{1}{\|M^T \cdot u\|} \cdot M^T \cdot u \Big) + s_L(u),$$
for all $u \in \mathbb{S}^{p-1}$. From (6), we have that
$$\inf_{u \in \mathbb{S}^{p-1}} P\big(s_{M \cdot \Gamma + L}(u) \leq s_{M \cdot K + L}(u)\big) = \inf_{u \in \mathbb{S}^{p-1}} P\big(s_{M \cdot \Gamma}(u) \leq s_{M \cdot K}(u)\big) = \inf_{u \in \mathbb{S}^{p-1}} P\big(s_\Gamma(f(u)) \leq s_K(f(u))\big) = \inf_{u \in \mathbb{S}^{p-1}} P\big(s_\Gamma(u) \leq s_K(u)\big),$$
where the last equality follows from the fact that $f$ is bijective. The same argument applies with $\geq$ in place of $\leq$. □
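For random intervals ($p = 1$) the invariance can be verified exactly, because a non-singular matrix is just a non-zero scalar $m$ and the unit sphere has two points. The sketch below assumes a hypothetical three-point distribution on intervals and uses exact rational arithmetic; `depth` and `affine` are illustrative helper names.

```python
from fractions import Fraction

def depth(K, dist):
    """Tukey depth of K = (a, b) for a discrete random interval dist = [((x, y), prob), ...]."""
    a, b = K
    return min(
        sum(p for (x, y), p in dist if y <= b),  # u = +1: P(s_Gamma(u) <= s_K(u))
        sum(p for (x, y), p in dist if y >= b),  # u = +1: P(s_Gamma(u) >= s_K(u))
        sum(p for (x, y), p in dist if x >= a),  # u = -1: P(-x <= -a)
        sum(p for (x, y), p in dist if x <= a),  # u = -1: P(-x >= -a)
    )

def affine(m, K, L):
    """m·K + L for intervals K, L and a non-zero scalar m."""
    lo, hi = min(m * K[0], m * K[1]), max(m * K[0], m * K[1])
    return (lo + L[0], hi + L[1])

# Hypothetical distribution of a random interval.
dist = [((0, 2), Fraction(1, 2)), ((1, 3), Fraction(1, 4)), ((-1, 4), Fraction(1, 4))]

invariant = True
for m in (2, -1, Fraction(-1, 2)):
    for L in ((0, 0), (-1, 3)):
        moved = [(affine(m, X, L), p) for X, p in dist]
        for K in ((0, 2), (1, 3), (-1, 4), (5, 6)):
            invariant = invariant and depth(affine(m, K, L), moved) == depth(K, dist)
print(invariant)
```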

4.2. Property 2: Maximality at the Center of Symmetry

In this case, the property is the same for multivariate, functional, and fuzzy settings, but for the fact that the notion of symmetry applied has to be defined in the corresponding space. In the multivariate case, several notions of symmetry exist, for instance central, angular, and halfspace symmetry [12,13]. In the functional case, there is one notion that has been proved to be topologically valid [10,14], while there have been two proposals in the fuzzy setting [15]. To propose a notion of symmetry in $\mathcal{K}_c(\mathbb{R}^p)$, we make use of the central symmetry notion and of the support function of compact convex random sets. A random variable $X$ on $\mathbb{R}^p$ is centrally symmetric (or C-symmetric) with respect to $x \in \mathbb{R}^p$ if $X - x$ and $x - X$ are equally distributed.
Definition 1.
Let $\Gamma$ be a compact convex random set. We say that $\Gamma$ is compact-symmetric with respect to $K$ if $s_\Gamma(u)$ is C-symmetric with respect to $s_K(u)$ for all $u \in \mathbb{S}^{p-1}$.
We propose the following property.
(P2.) Let $\Gamma$ be a compact convex random set which is symmetric (for a certain notion of symmetry) with respect to $K \in \mathcal{K}_c(\mathbb{R}^p)$. Let $D(\cdot; \Gamma) : \mathcal{K}_c(\mathbb{R}^p) \to [0, \infty)$ be a function. Then
$$D(K; \Gamma) = \sup_{L \in \mathcal{K}_c(\mathbb{R}^p)} D(L; \Gamma).$$
Thus, this property is analogous in the multivariate, functional, and fuzzy cases. The only difference is the notion of symmetry defined for each case. Note that the above defined notion of symmetry for compact convex random sets, which makes use of C-symmetry, is also an adaptation of the F-symmetry [15] of the fuzzy case, based on support functions. It is possible to consider another notion of symmetry for random sets by identifying every set with its support function and considering central symmetry in the function space. However, our notion is more general, which makes it a natural choice.
With the above notion of compact-symmetry, we have the following result.
Proposition 2.
The function $D_{CT}$ satisfies P2.
Proof.
By hypothesis, let us suppose that $\Gamma$ is compact-symmetric with respect to $K$. By definition, we have that the real random variable $s_\Gamma(u)$ is C-symmetric with respect to $s_K(u)$ for all $u \in \mathbb{S}^{p-1}$. This means that
$$s_K(u) \in \mathrm{Med}(s_\Gamma(u))$$
for all $u \in \mathbb{S}^{p-1}$, where $\mathrm{Med}(\cdot)$ denotes the univariate median. It implies that
$$P(s_\Gamma(u) \leq s_K(u)) \geq 1/2 \quad \text{and} \quad P(s_\Gamma(u) \geq s_K(u)) \geq 1/2.$$
Using the expression of $D_{CT}$ in Equation (6), we have that $D_{CT}(\cdot; \Gamma)$ is maximized at $K$. □

4.3. Property 3: Monotonicity with Respect to the Center

In the multivariate case [13], this property is understood in an algebraic way, as the convex combinations between the element of maximal depth and another point are considered. As the operations of sum and product by a scalar are defined in $\mathcal{K}_c(\mathbb{R}^p)$, we can propose the same property.
(P3a.) Let $\Gamma$ be a compact convex random set and let $K \in \mathcal{K}_c(\mathbb{R}^p)$ maximize $D(\cdot; \Gamma)$. Then,
$$D((1 - \lambda) \cdot K + \lambda \cdot L; \Gamma) \geq D(L; \Gamma)$$
for all $\lambda \in [0, 1]$ and $L \in \mathcal{K}_c(\mathbb{R}^p)$.
Additionally, this property is analogous to property P3a. in the definition of semi-linear depth in the fuzzy setting [15].
In the functional (metric) case, a different property was proposed by Nieto-Reyes and Battey ([16], Property P-3.), which explicitly uses the metric in the space. We can see $\mathcal{K}_c(\mathbb{R}^p)$ as a metric space with the Hausdorff metric $d_H$. Thus, another possible property is the following.
(P3b.) Let $\Gamma$ be a compact convex random set, $d$ be a metric in $\mathcal{K}_c(\mathbb{R}^p)$, and $K, L, S \in \mathcal{K}_c(\mathbb{R}^p)$ be three sets such that $K$ maximizes $D(\cdot; \Gamma)$ and $d(K, S) = d(K, L) + d(L, S)$. Then,
$$D(L; \Gamma) \geq D(S; \Gamma).$$
This property is analogous to property P3b. in the definition of geometric depth in the fuzzy setting [15].
For these two possible translations of the multivariate property, we have the following two results.
Proposition 3.
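The function $D_{CT}$ satisfies P3a.
Proof.
Let $\Gamma$ be a compact convex random set, and let $K, L \in \mathcal{K}_c(\mathbb{R}^p)$ be two sets such that $K$ maximizes $D_{CT}(\cdot; \Gamma)$. Using the properties of the support function of a set, we have that
$$s_{(1 - \lambda) \cdot K + \lambda \cdot L}(u) = (1 - \lambda) s_K(u) + \lambda s_L(u)$$
for all $u \in \mathbb{S}^{p-1}$ and $\lambda \in [0, 1]$.
We consider the set
$$\mathbb{K} = \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : (1 - \lambda) \cdot K + \lambda \cdot L \in S^-_{u,t} \}.$$
It can be expressed as $\mathbb{K}_1 \cup \mathbb{K}_2 \cup \mathbb{K}_3$, where
$$\mathbb{K}_1 = \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : K \in S^-_{u,t}, \ L \in S^-_{u,t} \},$$
$$\mathbb{K}_2 = \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : K \in S^-_{u,t}, \ L \notin S^-_{u,t}, \ (1 - \lambda) \cdot K + \lambda \cdot L \in S^-_{u,t} \},$$
$$\mathbb{K}_3 = \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : K \notin S^-_{u,t}, \ L \in S^-_{u,t}, \ (1 - \lambda) \cdot K + \lambda \cdot L \in S^-_{u,t} \}.$$
It is clear that they are disjoint sets. Thus, we have that
$$\inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, (1 - \lambda) \cdot K + \lambda \cdot L \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}) = \inf_{(u,t) \in \mathbb{K}} P(\Gamma \in S^-_{u,t}) = \min \Big\{ \inf_{(u,t) \in \mathbb{K}_1} P(\Gamma \in S^-_{u,t}), \ \inf_{(u,t) \in \mathbb{K}_2} P(\Gamma \in S^-_{u,t}), \ \inf_{(u,t) \in \mathbb{K}_3} P(\Gamma \in S^-_{u,t}) \Big\}. \tag{12}$$
Taking into account that
$$\mathbb{K}_1, \mathbb{K}_2 \subseteq \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : K \in S^-_{u,t} \}$$
and
$$\mathbb{K}_3 \subseteq \{ (u, t) \in \mathbb{S}^{p-1} \times \mathbb{R} : L \in S^-_{u,t} \},$$
it is obtained that
$$\begin{aligned} \inf_{(u,t) \in \mathbb{K}_1} P(\Gamma \in S^-_{u,t}) &\geq \inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, K \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}) \geq D_{CT}(K; \Gamma), \\ \inf_{(u,t) \in \mathbb{K}_2} P(\Gamma \in S^-_{u,t}) &\geq \inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, K \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}) \geq D_{CT}(K; \Gamma), \\ \inf_{(u,t) \in \mathbb{K}_3} P(\Gamma \in S^-_{u,t}) &\geq \inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, L \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}) \geq D_{CT}(L; \Gamma). \end{aligned} \tag{13}$$
Using (12) and (13) and taking into account that $K$ maximizes $D_{CT}$, we have that
$$\inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, (1 - \lambda) \cdot K + \lambda \cdot L \in S^-_{u,t}} P(\Gamma \in S^-_{u,t}) \geq D_{CT}(L; \Gamma).$$
Analogously, we obtain
$$\inf_{u \in \mathbb{S}^{p-1}, \, t \in \mathbb{R} \, : \, (1 - \lambda) \cdot K + \lambda \cdot L \in S^+_{u,t}} P(\Gamma \in S^+_{u,t}) \geq D_{CT}(L; \Gamma).$$
Thus, $D_{CT}((1 - \lambda) \cdot K + \lambda \cdot L; \Gamma) \geq D_{CT}(L; \Gamma)$, and $D_{CT}$ satisfies property P3a. □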
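Property P3a can be illustrated on random intervals, where the depth is computable exactly. The sketch below assumes a hypothetical three-point distribution for which the interval $[0, 2]$ attains the maximal depth $1/2$ (checked by hand for this example), and verifies the inequality along a grid of values of $\lambda$; the helper names are illustrative.

```python
from fractions import Fraction

def depth(K, dist):
    """Tukey depth of K = (a, b) for a discrete random interval dist = [((x, y), prob), ...]."""
    a, b = K
    return min(
        sum(p for (x, y), p in dist if y <= b),  # right endpoints not above b
        sum(p for (x, y), p in dist if y >= b),  # right endpoints not below b
        sum(p for (x, y), p in dist if x >= a),  # left endpoints not below a
        sum(p for (x, y), p in dist if x <= a),  # left endpoints not above a
    )

def convex_combination(lam, K, L):
    """(1 - lam)·K + lam·L for intervals, computed endpoint by endpoint."""
    return ((1 - lam) * K[0] + lam * L[0], (1 - lam) * K[1] + lam * L[1])

# Hypothetical distribution; K attains the maximal depth 1/2 for it.
dist = [((0, 2), Fraction(1, 2)), ((1, 3), Fraction(1, 4)), ((-1, 4), Fraction(1, 4))]
K = (0, 2)

monotone = True
for L in [(1, 3), (-1, 4), (5, 6)]:
    for i in range(11):
        lam = Fraction(i, 10)
        monotone = monotone and depth(convex_combination(lam, K, L), dist) >= depth(L, dist)
print(monotone)
```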
Proposition 4.
The function $D_{CT}$ does not satisfy P3b with respect to the distance $d_H$.
Proof.
The proof is by counterexample. Let $(\{\omega_1, \omega_2\}, \mathcal{P}(\{\omega_1, \omega_2\}), P)$ be a probability space such that
$$P(\omega_1) = 3/4 \quad \text{and} \quad P(\omega_2) = 1/4.$$
We consider the compact convex random set $\Gamma : \Omega \to \mathcal{K}_c(\mathbb{R})$ defined by
$$\Gamma(\omega_1) = [1, 2] \quad \text{and} \quad \Gamma(\omega_2) = [2, 7].$$
It is clear that
$$D_{CT}(\Gamma(\omega_1); \Gamma) = 3/4,$$
and it is the set which maximizes $D_{CT}$. Let us consider $L = [3, 5]$. We have that
$$5 = d_H(\Gamma(\omega_1), \Gamma(\omega_2)) = d_H(\Gamma(\omega_1), L) + d_H(\Gamma(\omega_2), L) = 3 + 2.$$
Moreover,
$$D_{CT}(\Gamma(\omega_2); \Gamma) = 1/4 \quad \text{and} \quad D_{CT}(L; \Gamma) = P(s_\Gamma(-1) \leq s_L(-1)) = 0.$$
Thus, $D_{CT}$ violates property P3b. □
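The numbers in this counterexample can be reproduced with exact arithmetic. The sketch below encodes the two-point random interval of the proof and evaluates the depth through the two directions $u = \pm 1$; the helper names `depth` and `hausdorff` are assumptions for this illustration.

```python
from fractions import Fraction

def depth(K, dist):
    """Tukey depth of K = (a, b) for a discrete random interval dist = [((x, y), prob), ...]."""
    a, b = K
    return min(
        sum(p for (x, y), p in dist if y <= b),  # u = +1: P(s_Gamma(1) <= s_K(1))
        sum(p for (x, y), p in dist if y >= b),  # u = +1: P(s_Gamma(1) >= s_K(1))
        sum(p for (x, y), p in dist if x >= a),  # u = -1: P(s_Gamma(-1) <= s_K(-1))
        sum(p for (x, y), p in dist if x <= a),  # u = -1: P(s_Gamma(-1) >= s_K(-1))
    )

def hausdorff(K, L):
    """Hausdorff distance between intervals: the larger endpoint gap."""
    return max(abs(K[0] - L[0]), abs(K[1] - L[1]))

G1, G2, L = (1, 2), (2, 7), (3, 5)
dist = [(G1, Fraction(3, 4)), (G2, Fraction(1, 4))]

on_segment = hausdorff(G1, G2) == hausdorff(G1, L) + hausdorff(L, G2)  # 5 = 3 + 2
depths = (depth(G1, dist), depth(G2, dist), depth(L, dist))
print(on_segment, depths)
```

Here `L` lies metrically between the deepest set and `G2`, yet its depth is smaller than that of `G2`, which is the violation of P3b.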
Notice that the Tukey depth may satisfy Property P3b if the distances between the sets are not measured with the Hausdorff metric, e.g., in the $L^p$-type metrics introduced by Vitale [32].

4.4. Property 4: Vanishing at Infinity

The property in the multivariate case is understood in a geometrical way, considering a sequence $\{x_n\}_n$ such that $\|x_n\| \to \infty$ [13]. We can also consider a sequence $\{a + n b\}_n$ with $a, b \in \mathbb{R}^p$, such that $b \neq 0$, and suppose that the sequence of distances diverges. Thus, in this setting, we also propose two possible properties, the first one from an algebraic point of view and the second one taking into account that the set $\mathcal{K}_c(\mathbb{R}^p)$ can be viewed as a metric space using the Hausdorff distance.
(P4a.) Let $\Gamma$ be a compact convex random set, and let $K, L \in \mathcal{K}_c(\mathbb{R}^p)$ be two sets such that $K$ maximizes $D(\cdot; \Gamma)$ and $L \neq \{0\}$. Then,
$$\lim_{n \to \infty} D(K + n \cdot L; \Gamma) = 0.$$
(P4b.) Let $\Gamma$ be a compact convex random set, $d$ a metric in $\mathcal{K}_c(\mathbb{R}^p)$, $K \in \mathcal{K}_c(\mathbb{R}^p)$ a set that maximizes $D(\cdot; \Gamma)$ and $\{K_n\}_n$ a sequence of elements of $\mathcal{K}_c(\mathbb{R}^p)$ such that $\lim_{n \to \infty} d(K, K_n) = \infty$. Then,
$$\lim_{n \to \infty} D(K_n; \Gamma) = 0.$$
Property P4a. parallels the fourth property of the semi-linear depth for fuzzy sets, while P4b. parallels the fourth property of geometric depth for fuzzy sets.
Concerning those properties, we have the following results.
Proposition 5.
The function $D_{CT}$ satisfies P4a. and P4b. with respect to the distance $d_H$.
The following proposition is used in the proof of Proposition 5 for property P4b.
Proposition 6.
Let $\{K_n\}_n$ be a sequence of elements of $\mathcal{K}_c(\mathbb{R}^p)$ such that $\lim_{n \to \infty} d_H(K_n, \{0\}) = \infty$. Then, there exists $u \in \mathbb{S}^{p-1}$ such that
$$\lim_{n \to \infty} s_{K_n}(u) = \infty.$$
Proof.
It is a basic property of the Hausdorff distance that
$$d_H(K_n, \{0\}) = \sup \{ \|x\| : x \in K_n \}$$
for all $n \in \mathbb{N}$. The function
$$f_n : K_n \to \mathbb{R}$$
defined by
$$f_n(x) = \|x\|$$
is a continuous function defined over a compact convex set, thus it attains its maximum on $K_n$, for all $n \in \mathbb{N}$. Let us denote by $x_n$ the point of $K_n$ where $f_n$ attains its maximum for every $n \in \mathbb{N}$. By hypothesis we have that
$$\lim_{n \to \infty} \|x_n\| = \infty.$$
It implies that there exists $u \in \mathbb{S}^{p-1}$ such that
$$\lim_{n \to \infty} \langle u, x_n \rangle = \infty.$$
By definition of the support function of a compact convex set, we have that
$$\langle u, x_n \rangle \leq s_{K_n}(u).$$
Thus, $\lim_{n \to \infty} s_{K_n}(u) = \infty$. □
Proof of Proposition 5.
 
Property P4a. Let $L \neq \{0\}$. There exists $u_0 \in \mathbb{S}^{p-1}$ such that
$$s_L(u_0) \neq 0.$$
Without loss of generality, we assume $s_L(u_0) > 0$. Clearly, the sequence
$$\{ s_K(u_0) + n \cdot s_L(u_0) \}_n$$
is such that
$$\lim_{n \to \infty} \big( s_K(u_0) + n \cdot s_L(u_0) \big) = \infty.$$
We have that
$$D_{CT}(K + n \cdot L; \Gamma) \leq P\big(s_\Gamma(u_0) \geq s_K(u_0) + n \cdot s_L(u_0)\big).$$
If we take limits on both sides,
$$\lim_{n \to \infty} D_{CT}(K + n \cdot L; \Gamma) \leq \lim_{n \to \infty} P\big(s_\Gamma(u_0) \geq s_K(u_0) + n \cdot s_L(u_0)\big) = 0.$$
Since the depth is non-negative, the sandwich rule gives $\lim_{n \to \infty} D_{CT}(K + n \cdot L; \Gamma) = 0$.
Property P4b. As the set $K$ is fixed, the condition
$$\lim_{n \to \infty} d_H(K, K_n) = \infty$$
is equivalent to
$$\lim_{n \to \infty} d_H(K_n, \{0\}) = \infty.$$
From Proposition 6, we have that there exists $u_0 \in \mathbb{S}^{p-1}$ such that
$$\lim_{n \to \infty} s_{K_n}(u_0) = \infty.$$
The rest of the proof is analogous to that of Property P4a. □
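The behaviour described by P4a can be observed on a small example with random intervals. The sketch below assumes a hypothetical three-point distribution whose deepest interval is $K = [0, 2]$ and moves along $K + n \cdot L$ with $L = [0, 1]$; since the distribution has bounded support, the depth reaches zero after finitely many steps.

```python
from fractions import Fraction

def depth(K, dist):
    """Tukey depth of K = (a, b) for a discrete random interval dist = [((x, y), prob), ...]."""
    a, b = K
    return min(
        sum(p for (x, y), p in dist if y <= b),  # right endpoints not above b
        sum(p for (x, y), p in dist if y >= b),  # right endpoints not below b
        sum(p for (x, y), p in dist if x >= a),  # left endpoints not below a
        sum(p for (x, y), p in dist if x <= a),  # left endpoints not above a
    )

def shifted(K, n, L):
    """K + n·L for intervals and a non-negative integer n."""
    return (K[0] + n * L[0], K[1] + n * L[1])

# Hypothetical distribution; K = [0, 2] attains the maximal depth 1/2 for it.
dist = [((0, 2), Fraction(1, 2)), ((1, 3), Fraction(1, 4)), ((-1, 4), Fraction(1, 4))]
K, L = (0, 2), (0, 1)  # L differs from {0}

depths = [depth(shifted(K, n, L), dist) for n in range(5)]
print(depths)
```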

4.5. Property 5: Upper Semi-Continuity

This property regards a depth as an upper semi-continuous function at every point of its domain. In the multivariate case it is not considered to be a canonical requirement, but continuity properties are studied in different papers, for instance in [13]. This property is considered in the definition of the depth function for functional (metric) spaces [16]. According to [16], a depth $D$, of a metric space $(E, d)$ with respect to a distribution $P$ in the space, is upper semi-continuous if, for all $x \in E$ and for all $\varepsilon > 0$, there exists $\delta > 0$ such that
$$\sup_{y \, : \, d(x, y) < \delta} D(y; P) \leq D(x; P) + \varepsilon.$$
The property has not yet been considered in the fuzzy setting.
(P5.) Let $\Gamma$ be a compact convex random set, and $d$ be a metric defined over $\mathcal{K}_c(\mathbb{R}^p)$. The function $D(\cdot; \Gamma)$ is upper semi-continuous with respect to the distance $d$ in the sense that
$$\limsup_{n \to \infty} D(K_n; \Gamma) \leq D(K; \Gamma)$$
for every set $K \in \mathcal{K}_c(\mathbb{R}^p)$ and every sequence of sets $\{K_n\}_n$ such that $\lim_{n \to \infty} d(K, K_n) = 0$.
Notice that upper semi-continuity implies that the contours of the depth function are closed sets.
Proposition 7.
The function $D_{CT}$ satisfies P5. with respect to the distance $d_H$.
Proof.
Let $\Gamma$ be a compact convex random set and $K \in \mathcal{K}_c(\mathbb{R}^p)$ be a set, and let $\{K_n\}_n$ be a sequence of compact convex sets such that
$$\lim_{n \to \infty} d_H(K, K_n) = 0.$$
We need to prove
$$\limsup_{n \to \infty} D_{CT}(K_n; \Gamma) \leq D_{CT}(K; \Gamma).$$
From (2),
$$d_H(K, K_n) = \sup_{u} | s_K(u) - s_{K_n}(u) |$$
and then
$$\lim_{n \to \infty} | s_K(u) - s_{K_n}(u) | = 0$$
for each $u \in \mathbb{S}^{p-1}$. Thus
$$\lim_{n \to \infty} s_{K_n}(u) = s_K(u)$$
for every $u \in \mathbb{S}^{p-1}$. Without loss of generality (the other case is analogous), assume
$$D_{CT}(K; \Gamma) = \inf_{u} P(s_K(u) \leq s_\Gamma(u)).$$
Now, we prove that, for all $u \in \mathbb{S}^{p-1}$,
$$U := \{ \omega \in \Omega : \forall k \in \mathbb{N}, \ \exists n \geq k : \omega \in \{ s_{K_n}(u) \leq s_\Gamma(u) \} \} \subseteq \{ \omega \in \Omega : s_K(u) \leq s_{\Gamma(\omega)}(u) \}.$$
Let $\omega \in U$. There exists a sub-sequence $\{K_{n'}\}_{n'}$ of $\{K_n\}_n$ such that
$$s_{K_{n'}}(u) \leq s_{\Gamma(\omega)}(u)$$
for all $n'$. Taking limits,
$$s_K(u) = \lim_{n'} s_{K_{n'}}(u) \leq s_{\Gamma(\omega)}(u),$$
therefore
$$\omega \in \{ \omega \in \Omega : s_K(u) \leq s_\Gamma(u) \}.$$
By definition, $U = \limsup_n \{ s_{K_n}(u) \leq s_\Gamma(u) \}$. Thus
$$P(s_K(u) \leq s_\Gamma(u)) \geq P\big( \limsup_n \{ s_{K_n}(u) \leq s_\Gamma(u) \} \big) \geq \limsup_n P(s_{K_n}(u) \leq s_\Gamma(u)),$$
where the second inequality is due to the reverse Fatou lemma. Taking the infimum on both sides yields
$$\inf_{u} P(s_K(u) \leq s_\Gamma(u)) \geq \inf_{u} \limsup_n P(s_{K_n}(u) \leq s_\Gamma(u)).$$
Since
$$\limsup_n P(s_{K_n}(u) \leq s_\Gamma(u)) = \inf_{n} \sup_{k \geq n} P(s_{K_k}(u) \leq s_\Gamma(u)),$$
it is clear that
$$\inf_{u} \inf_{n} \sup_{k \geq n} P(s_{K_k}(u) \leq s_\Gamma(u)) \geq \inf_{n} \sup_{k \geq n} \inf_{u} P(s_{K_k}(u) \leq s_\Gamma(u)) = \limsup_n \inf_{u} P(s_{K_n}(u) \leq s_\Gamma(u)) \geq \limsup_n D_{CT}(K_n; \Gamma).$$
From (14)–(16), $D_{CT}(\cdot; \Gamma)$ is upper semi-continuous. □

4.6. Property 6: Consistency

Another desirable property for depth functions is that the sample version converges to the population counterpart (consistency). This property is a particular case of the weak continuity (as a function of the distribution P) property of the axiomatic functional (metric) notion of depth [16], but it is not part of the axiomatic notions of multivariate and fuzzy depth. However, it is generally studied when an instance of depth function is introduced. To the best of our knowledge, the first time that appeared in the literature for depth functions was in Liu [12].
We propose the following property.
(P6.)
Let $\Gamma$ be a compact convex random set, $D(\cdot;\Gamma): \mathcal{K}_c(\mathbb{R}^p) \to [0,\infty)$ a function, and $D_n(\cdot;\Gamma): \mathcal{K}_c(\mathbb{R}^p) \to [0,\infty)$ its sample version. Then, $D$ and $D_n$ satisfy
$$\sup_{K \in \mathcal{K}_c(\mathbb{R}^p)} |D(K;\Gamma) - D_n(K;\Gamma)| \to 0, \quad \text{a.s. } [P].$$
This is a uniform consistency requirement which is satisfied by the Tukey depth, but the uniformity may eventually have to be dropped for other depth functions.
Theorem 2.
The function $D_{CT}$, with $D_{CT,n}$ in (9), satisfies P6.
Proof. 
In terms of measurability, we have that $s_{X_1}(u), \ldots, s_{X_n}(u)$ is a random sample of the random variable $s_\Gamma(u)$ for all $u \in S^{p-1}$. Let us fix $K \in \mathcal{K}_c(\mathbb{R}^p)$. To ease the notation, let us denote
$$F(s_K(u)) := \{P(s_\Gamma(u) \le s_K(u)),\ P(s_\Gamma(u) \ge s_K(u))\},$$
$$F_n(s_K(u)) := \{P_n^u((-\infty, s_K(u)]),\ P_n^u([s_K(u), \infty))\}.$$
From (7) and (10) and basic properties of the supremum and infimum functions, we have that
$$|D_{CT}(K;\Gamma) - D_{CT,n}(K;\Gamma)| = \Big|\inf_{u \in S^{p-1}} \min F(s_K(u)) - \inf_{u \in S^{p-1}} \min F_n(s_K(u))\Big| \le \sup_{u \in S^{p-1}} |\min F(s_K(u)) - \min F_n(s_K(u))|.$$
Step 1. Setting
$$F^+(t,u) := P(s_\Gamma(u) \le t), \quad F^-(t,u) := P(s_\Gamma(u) \ge t), \quad F_n^+(t,u) := P_n^u((-\infty, t]), \quad F_n^-(t,u) := P_n^u([t, \infty))$$
and applying these basic properties again, we obtain
$$|D_{CT}(K;\Gamma) - D_{CT,n}(K;\Gamma)| \le \sup_{u \in S^{p-1}} \max\{|F^+(s_K(u),u) - F_n^+(s_K(u),u)|,\ |F^-(s_K(u),u) - F_n^-(s_K(u),u)|\}.$$
Then
$$\sup_{K \in \mathcal{K}_c(\mathbb{R}^p)} |D_{CT}(K;\Gamma) - D_{CT,n}(K;\Gamma)| \le \sup_{K \in \mathcal{K}_c(\mathbb{R}^p)} \sup_{u \in S^{p-1}} \max\{|F^+(s_K(u),u) - F_n^+(s_K(u),u)|,\ |F^-(s_K(u),u) - F_n^-(s_K(u),u)|\} \le \sup_{u \in S^{p-1}} \sup_{t \in \mathbb{R}} \max\{|F^+(t,u) - F_n^+(t,u)|,\ |F^-(t,u) - F_n^-(t,u)|\}.$$
The Dvoretzky–Kiefer–Wolfowitz inequality ([33], Corollary 1) gives, for each $u \in S^{p-1}$ and $\varepsilon > 0$,
$$P\Big(\sup_{t \in \mathbb{R}} |F^+(t,u) - F_n^+(t,u)| > \varepsilon\Big) \le 2\exp\{-2\varepsilon^2 n\}$$
and there easily follows
$$P\Big(\sup_{t \in \mathbb{R}} |F^-(t,u) - F_n^-(t,u)| > \varepsilon\Big) \le 2\exp\{-2\varepsilon^2 n\}.$$
Since the bound is independent of $u$, that implies
$$P\Big(\sup_{u \in S^{p-1}} \sup_{t \in \mathbb{R}} \max\{|F^+(t,u) - F_n^+(t,u)|,\ |F^-(t,u) - F_n^-(t,u)|\} > \varepsilon\Big) \le 4\exp\{-2\varepsilon^2 n\}$$
which, by the arbitrariness of $\varepsilon$, establishes
$$\sup_{u \in S^{p-1}} \sup_{t \in \mathbb{R}} \max\{|F^+(t,u) - F_n^+(t,u)|,\ |F^-(t,u) - F_n^-(t,u)|\} \to 0$$
in probability.
Step 2. To prove almost sure convergence, we rewrite the supremum in terms of an empirical process. Taking
$$\mathcal{F} = \{\phi^+_{t,u}, \phi^-_{t,u} \mid (t,u) \in \mathbb{R} \times S^{p-1}\},$$
where $\phi^+_{t,u}, \phi^-_{t,u} : \Omega \to \mathbb{R}$ are given by
$$\phi^+_{t,u}(\omega) = I_{(-\infty,t]}(s_{\Gamma(\omega)}(u)), \quad \phi^-_{t,u}(\omega) = I_{[t,\infty)}(s_{\Gamma(\omega)}(u)),$$
we have
$$\sup_{u \in S^{p-1}} \sup_{t \in \mathbb{R}} \max\{|F^+(t,u) - F_n^+(t,u)|,\ |F^-(t,u) - F_n^-(t,u)|\} = \sup_{\phi \in \mathcal{F}} |E_{P_n}(\phi) - E_P(\phi)|,$$
where $P_n$ is the empirical distribution. From ([34], Corollary 3.7.9), the above supremum converges to 0 almost surely because it does so in probability (which was proved in Step 1), and the family $\mathcal{F}$ has a $P$-integrable measurable envelope, which is obvious since all functions in $\mathcal{F}$ take on values in $[0,1]$. Accordingly, also
$$\sup_{K \in \mathcal{K}_c(\mathbb{R}^p)} |D_{CT}(K;\Gamma) - D_{CT,n}(K;\Gamma)| \to 0, \quad \text{a.s. } [P].$$
 □
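The role of the Dvoretzky–Kiefer–Wolfowitz bound in Step 1 can be illustrated numerically for a single fixed direction $u$. The sketch below is only an illustration under a loud assumption: the real random variable $s_\Gamma(u)$ is taken to be uniform on $[0,1]$, which is a hypothetical choice and not a distribution from the paper. It computes $\sup_t |F_n^+(t,u) - F^+(t,u)|$ for growing $n$ and the corresponding bound $2\exp\{-2\varepsilon^2 n\}$.

```python
import math
import random

def sup_ecdf_distance(sample, cdf):
    """sup_t |F_n(t) - F(t)| for a continuous distribution function F."""
    xs = sorted(sample)
    n = len(xs)
    dist = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps from i/n to (i+1)/n at the i-th order statistic.
        dist = max(dist, (i + 1) / n - f, f - i / n)
    return dist

def uniform_cdf(t):
    """Distribution function of the assumed U(0, 1) law of s_Gamma(u)."""
    return min(max(t, 0.0), 1.0)

def dkw_bound(eps, n):
    """Right-hand side 2 exp(-2 eps^2 n) of the DKW inequality."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * n)

rng = random.Random(0)
eps = 0.05
distances = {}
for n in (100, 1_000, 10_000):
    sample = [rng.random() for _ in range(n)]
    distances[n] = sup_ecdf_distance(sample, uniform_cdf)
    print(n, distances[n], dkw_bound(eps, n))
```

The bound does not depend on the law of $s_\Gamma(u)$, which is the feature of the inequality that the proof relies on.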

4.7. Property 7: Convexity of the Contours

This property is not part of any of the existing axiomatic notions of statistical depth. However, it has been commonly studied in the literature since it first appeared in Donoho and Gasko [35]. In addition, Serfling [36], which focuses on multivariate properties, lists it as a desirable property.
The set $\mathcal{K}_c(\mathbb{R}^p)$ is endowed with the operations of sum and product by a scalar. Thus, given $U \subseteq \mathcal{K}_c(\mathbb{R}^p)$, we can say that $U$ is a convex set if
$$(1-\lambda) \cdot K + \lambda \cdot L \in U$$
for every pair of sets $K, L \in U$ and for all $\lambda \in [0,1]$. We propose the following property.
(P7.)
Let $\Gamma$ be a compact convex random set and $D(\cdot;\Gamma): \mathcal{K}_c(\mathbb{R}^p) \to [0,\infty)$ a function. Then, the set
$$D_\alpha := \{K \in \mathcal{K}_c(\mathbb{R}^p) : D(K;\Gamma) \ge \alpha\} \subseteq \mathcal{K}_c(\mathbb{R}^p)$$
is convex for every $\alpha \in [0,1]$.
The next result states that the function $D_{CT}$ satisfies the above property, that is, the $\alpha$-contours of $D_{CT}$ are convex subsets of $\mathcal{K}_c(\mathbb{R}^p)$.
Theorem 3.
The function $D_{CT}$ satisfies P7.
Proof. 
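Let us fix $\alpha \in [0,1]$, $K, L \in D_\alpha$, and $\lambda \in [0,1]$. The aim is to prove
$$(1-\lambda) \cdot K + \lambda \cdot L \in D_\alpha.$$
For that, we follow the same idea as in the proof of Proposition 3. By the definition of Tukey depth,
$$D_{CT}((1-\lambda) \cdot K + \lambda \cdot L;\Gamma) = \min\Big\{ \inf_{u \in S^{p-1},\, t \in \mathbb{R} \,:\, (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^-_{u,t}} P(\Gamma \subseteq S^-_{u,t}),\ \inf_{u \in S^{p-1},\, t \in \mathbb{R} \,:\, (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^+_{u,t}} P(\Gamma \subseteq S^+_{u,t}) \Big\}.$$
We now prove that
$$\inf_{u \in S^{p-1},\, t \in \mathbb{R} \,:\, (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^-_{u,t}} P(\Gamma \subseteq S^-_{u,t}) \ge \alpha.$$
As in the proof of Proposition 3, we define the following sets:
$$\mathbb{K} := \{(u,t) \in S^{p-1} \times \mathbb{R} : (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^-_{u,t}\},$$
$$\mathbb{K}_1 := \{(u,t) \in S^{p-1} \times \mathbb{R} : K \subseteq S^-_{u,t},\ L \subseteq S^-_{u,t}\},$$
$$\mathbb{K}_2 := \{(u,t) \in S^{p-1} \times \mathbb{R} : K \subseteq S^-_{u,t},\ L \not\subseteq S^-_{u,t},\ (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^-_{u,t}\},$$
$$\mathbb{K}_3 := \{(u,t) \in S^{p-1} \times \mathbb{R} : K \not\subseteq S^-_{u,t},\ L \subseteq S^-_{u,t},\ (1-\lambda) \cdot K + \lambda \cdot L \subseteq S^-_{u,t}\}.$$
It is clear that
$$\inf_{(u,t) \in \mathbb{K}} P(\Gamma \subseteq S^-_{u,t}) = \min\Big\{ \inf_{(u,t) \in \mathbb{K}_1} P(\Gamma \subseteq S^-_{u,t}),\ \inf_{(u,t) \in \mathbb{K}_2} P(\Gamma \subseteq S^-_{u,t}),\ \inf_{(u,t) \in \mathbb{K}_3} P(\Gamma \subseteq S^-_{u,t}) \Big\}.$$
Taking into account (13) and the fact that $D_{CT}(K;\Gamma), D_{CT}(L;\Gamma) \ge \alpha$, we have that
$$\inf_{(u,t) \in \mathbb{K}_i} P(\Gamma \subseteq S^-_{u,t}) \ge \alpha$$
for every $i \in \{1,2,3\}$. The case with $S^+_{u,t}$ is conducted analogously. Thus,
$$D_{CT}((1-\lambda) \cdot K + \lambda \cdot L;\Gamma) \ge \alpha$$
and $D_{CT}(\cdot;\Gamma)_\alpha$ is a convex set. □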
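Property P7 can also be checked numerically in a small case. The sketch below is an illustration only, under stated assumptions: it uses randomly generated rectangles in $\mathbb{R}^2$ with integer endpoints (hypothetical data), and it evaluates the sample depth on a finite set of integer directions instead of the whole unit sphere. Direction vectors are not normalized, which is harmless because the comparisons between support function values are invariant under positive rescaling of $u$. Exact rational arithmetic avoids rounding issues.

```python
import random
from fractions import Fraction

def support_rectangle(rect, u):
    """Support function of [a1, b1] x [a2, b2] in direction u."""
    return sum(max(ui * lo, ui * hi) for ui, (lo, hi) in zip(u, rect))

def convex_combination(k, l, lam):
    """Rectangle (1 - lam) * K + lam * L, computed coordinate-wise."""
    return tuple(
        ((1 - lam) * a + lam * c, (1 - lam) * b + lam * d)
        for (a, b), (c, d) in zip(k, l)
    )

rng = random.Random(0)

def random_rectangle():
    """Hypothetical rectangle with integer endpoints."""
    sides = []
    for _ in range(2):
        lo = rng.randint(-10, 10)
        sides.append((lo, lo + rng.randint(0, 10)))
    return tuple(sides)

sample = [random_rectangle() for _ in range(30)]
directions = [(i, j) for i in range(-3, 4) for j in range(-3, 4)
              if (i, j) != (0, 0)]
# Support function values of the sample, computed once per direction.
projections = {u: [support_rectangle(x, u) for x in sample] for u in directions}

def depth(rect):
    """Sample Tukey depth restricted to the chosen finite set of directions."""
    best = len(sample)
    for u, values in projections.items():
        s = support_rectangle(rect, u)
        below = sum(1 for v in values if v <= s)
        above = sum(1 for v in values if v >= s)
        best = min(best, below, above)
    return Fraction(best, len(sample))

violations = 0
for _ in range(200):
    k, l = rng.sample(sample, 2)
    lam = Fraction(rng.randint(0, 10), 10)
    if depth(convex_combination(k, l, lam)) < min(depth(k), depth(l)):
        violations += 1
print("violations:", violations)
```

No violation is expected: for each fixed direction the support function of the combination lies between those of $K$ and $L$, so the direction-restricted depth inherits the convexity of the contours.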

5. Real-Data Application

There are many examples of real interval-valued data. We comment here on some examples that are present in different fields of science where the elements of the dataset are in $\mathcal{K}_c(\mathbb{R}^p)$ with $p > 1$. One of these examples is the Greek wines dataset [37], a real dataset with elements in the space $\mathcal{K}_c(\mathbb{R}^{24}) \times \mathbb{R}^7$. There, measures of some properties of Greek wines are studied. They include interval-valued variables, such as the mineral ion concentration, the phenol concentrations, or the anthocyanin concentration, and numerical values, such as astringency, sweetness, or acidity.
Another example of compact and convex random sets is about measures related to some tree species [38]. In particular, the maximum and minimum values of the volume of the trunk and of the height of the tree species are measured. Thus, the resulting data are rectangles in $\mathbb{R}^2$. A third dataset is of compact convex square data related to unemployment in Portugal [39]. It contains measurements of the unemployment period and the period of activity before unemployment for some individuals.
The rest of this section is dedicated to computing the Tukey depth of a real dataset made of compact convex sets in $\mathbb{R}^3$, studying the elements of minimum and maximum depth, and comparing this last one with the Aumann mean and the trimmed Aumann mean.

5.1. Dataset

The dataset studied in what follows is a cardiology dataset comprised of three-dimensional cuboids with the ranges over a day of pulse rate, systolic blood pressure, and diastolic blood pressure of 59 patients. It was collected in 1997 by the Nephrology Unit of the Hospital Valle del Nalón in Langreo, Spain, and it has been applied before in the literature, see, for instance, [40]. For the sake of illustration, the dataset is graphically represented in Figure 1, and part of it is included in Table 1.
From Table 1 we can observe that the dataset consists of 59 rectangular cuboids in $\mathbb{R}^3$, one per patient. We denote each cuboid by
$$C_i := [m_{P_i}, M_{P_i}] \times [m_{S_i}, M_{S_i}] \times [m_{D_i}, M_{D_i}]$$
for $i = 1, \ldots, 59$. There,
  • $[m_{P_i}, M_{P_i}]$ denotes the range of blood pulse over a day of patient $i$, with $m_{P_i}$ being the minimal value and $M_{P_i}$ the largest,
  • $[m_{S_i}, M_{S_i}]$ the range of systolic blood pressure over the same day of patient $i$ and
  • $[m_{D_i}, M_{D_i}]$ the same but for diastolic blood pressure.
As observable from Table 1,
$$C_1 = [58, 90] \times [118, 173] \times [63, 102] \tag{17}$$
for instance. Each cuboid $C_i$ is also represented by its eight vertices, which are points in $\mathbb{R}^3$. With the above notation, these vertices are
$$(m_{P_i}, m_{S_i}, m_{D_i}),\ (m_{P_i}, m_{S_i}, M_{D_i}),\ (m_{P_i}, M_{S_i}, m_{D_i}),\ (m_{P_i}, M_{S_i}, M_{D_i}),\ (M_{P_i}, m_{S_i}, m_{D_i}),\ (M_{P_i}, m_{S_i}, M_{D_i}),\ (M_{P_i}, M_{S_i}, m_{D_i}) \text{ and } (M_{P_i}, M_{S_i}, M_{D_i}).$$
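The vertex representation can be generated mechanically from the three ranges. The following minimal sketch, in which the function name `cuboid_vertices` is hypothetical, enumerates the eight vertices of the cuboid of patient 1 as listed in Table 1, in the same order as in the enumeration above.

```python
from itertools import product

def cuboid_vertices(cuboid):
    """Vertices of a product of intervals given as (min, max) pairs."""
    return list(product(*cuboid))

# Cuboid of patient 1 as listed in Table 1: pulse, systolic, diastolic.
C1 = ((58, 90), (118, 173), (63, 102))
vertices = cuboid_vertices(C1)

for v in vertices:
    print(v)
```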

5.2. Tukey Depth Computation

Let us denote by $\mathcal{T}$ the compact convex random set corresponding to the empirical distribution of $\{C_i\}_{i=1}^{59}$; that is, each cuboid has the probability given by its relative frequency in the dataset, in our case $1/59$. Additionally, let us denote by
$$V_1, \ldots, V_8$$
the multivariate random variables corresponding to the empirical distribution associated with
$$\{(m_{P_i}, m_{S_i}, m_{D_i})\}_{i=1}^{59},\ \{(m_{P_i}, m_{S_i}, M_{D_i})\}_{i=1}^{59},\ \{(m_{P_i}, M_{S_i}, m_{D_i})\}_{i=1}^{59},\ \{(m_{P_i}, M_{S_i}, M_{D_i})\}_{i=1}^{59},\ \{(M_{P_i}, m_{S_i}, m_{D_i})\}_{i=1}^{59},\ \{(M_{P_i}, m_{S_i}, M_{D_i})\}_{i=1}^{59},\ \{(M_{P_i}, M_{S_i}, m_{D_i})\}_{i=1}^{59} \text{ and } \{(M_{P_i}, M_{S_i}, M_{D_i})\}_{i=1}^{59}, \text{ respectively}.$$
To compute the Tukey depth of each cuboid in the dataset, it suffices to calculate the minimum of the multivariate Tukey depth in $\mathbb{R}^3$ of each vertex of the cuboid. Thus, given a cuboid $C_i$, its Tukey depth with respect to $\mathcal{T}$ is
$$D_{CT}(C_i;\mathcal{T}) = \min\{ HD((m_{P_i}, m_{S_i}, m_{D_i}); V_1),\ HD((m_{P_i}, m_{S_i}, M_{D_i}); V_2),\ HD((m_{P_i}, M_{S_i}, m_{D_i}); V_3),\ HD((m_{P_i}, M_{S_i}, M_{D_i}); V_4),\ HD((M_{P_i}, m_{S_i}, m_{D_i}); V_5),\ HD((M_{P_i}, m_{S_i}, M_{D_i}); V_6),\ HD((M_{P_i}, M_{S_i}, m_{D_i}); V_7),\ HD((M_{P_i}, M_{S_i}, M_{D_i}); V_8) \},$$
where $HD(x;V)$ denotes the multivariate halfspace depth of $x \in \mathbb{R}^3$ with respect to $V$.
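The vertex formula above can be sketched in code as follows. Two loud assumptions apply. First, the data are only the nine cuboids printed in Table 1, not the full 59-patient dataset, so the resulting values are not those of Table 2. Second, the halfspace depth is not computed exactly: it is approximated from above by minimizing over the coordinate axes plus a finite number of random directions, which is a swapped-in approximation and not necessarily the computation used by the authors. All function names are hypothetical.

```python
import math
import random
from itertools import product

# Toy data: only the nine cuboids printed in Table 1, keyed by patient ID.
cuboids = {
    1: ((58, 90), (118, 173), (63, 102)),
    2: ((47, 68), (104, 161), (71, 118)),
    28: ((71, 121), (113, 176), (57, 95)),
    29: ((68, 91), (114, 186), (46, 103)),
    30: ((62, 100), (145, 210), (100, 136)),
    31: ((52, 78), (119, 212), (47, 93)),
    32: ((55, 84), (122, 178), (73, 105)),
    58: ((56, 97), (92, 173), (45, 107)),
    59: ((37, 86), (83, 140), (45, 91)),
}

def make_directions(count, rng):
    """Coordinate axes plus `count` random unit vectors in R^3."""
    dirs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    for _ in range(count):
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        dirs.append(tuple(c / norm for c in g))
    return dirs

def dot(u, x):
    return sum(a * b for a, b in zip(u, x))

def approx_halfspace_depth(x, cloud, directions):
    """Upper bound of the sample halfspace depth of x (finitely many directions)."""
    best = len(cloud)
    for u in directions:
        px = dot(u, x)
        proj = [dot(u, y) for y in cloud]
        below = sum(1 for v in proj if v <= px)  # closed halfspace towards -u
        above = sum(1 for v in proj if v >= px)  # closed halfspace towards u
        best = min(best, below, above)
    return best / len(cloud)

def vertex_formula_depth(key, data, directions):
    """Minimum over the eight vertex types of the approximate halfspace depth."""
    depths = []
    for corner in product((0, 1), repeat=3):  # 0 picks the minimum, 1 the maximum
        cloud = [tuple(c[j][corner[j]] for j in range(3)) for c in data.values()]
        x = tuple(data[key][j][corner[j]] for j in range(3))
        depths.append(approx_halfspace_depth(x, cloud, directions))
    return min(depths)

directions = make_directions(500, random.Random(0))
depths = {key: vertex_formula_depth(key, cuboids, directions) for key in cuboids}
for key, value in sorted(depths.items()):
    print(key, round(value, 5))
```

Increasing the number of directions can only lower the approximate values, so they should be read as upper bounds of the exact sample depths for this toy subset.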
Table 2 provides the obtained depth values for each element in the dataset, that is, the values $\{D_{CT}(C_i;\mathcal{T})\}_{i=1}^{59}$. Taking into account these values, the element $C_1$ in (17) has the maximum depth, that is, it is the deepest one, and the elements in the following set have minimum depth:
$$\{C_2, C_3, C_4, C_6, C_9, C_{10}, C_{12}, C_{13}, C_{15}, C_{17}, C_{19}, C_{20}, C_{23}, C_{24}, C_{25}, C_{27}, C_{28}, C_{29}, C_{30}, C_{31}, C_{34}, C_{35}, C_{38}, C_{39}, C_{40}, C_{41}, C_{42}, C_{44}, C_{49}, C_{50}, C_{51}, C_{53}, C_{55}, C_{56}, C_{58}, C_{59}\}. \tag{18}$$
To display this information, Figure 2 represents the sets of maximum and minimum depth. In particular, the left panel of the figure represents the five deepest cuboids, with the set of maximum depth colored in red. Meanwhile, the right panel of the figure represents the sets with minimum depth, that is, those in (18), in blue. In addition, the right panel also displays $C_1$, the cuboid with maximum depth, in red. This is done in order to visualize that the ordering given by the Tukey depth is natural, and that the element $C_1$ is the deepest set with respect to the cloud of cuboids.
One may think that it is possible to compute the Tukey depth of each cuboid by considering the variables Pulse, Systolic, and Diastolic separately. Let $\mathcal{P}$, $\mathcal{S}$, and $\mathcal{D}$ denote the compact convex random sets corresponding to the empirical distribution associated with
$$\{[m_{P_i}, M_{P_i}]\}_{i=1}^{59},\ \{[m_{S_i}, M_{S_i}]\}_{i=1}^{59} \text{ and } \{[m_{D_i}, M_{D_i}]\}_{i=1}^{59},$$
respectively. Additionally, let us denote by
$$m_P,\ M_P,\ m_S,\ M_S,\ m_D \text{ and } M_D$$
the real random variables corresponding to the empirical distribution associated with
$$\{m_{P_i}\}_{i=1}^{59},\ \{M_{P_i}\}_{i=1}^{59},\ \{m_{S_i}\}_{i=1}^{59},\ \{M_{S_i}\}_{i=1}^{59},\ \{m_{D_i}\}_{i=1}^{59}, \text{ and } \{M_{D_i}\}_{i=1}^{59},$$
respectively. Given an index $i \in \{1, \ldots, 59\}$, the Tukey depths of the $i$-th interval element with respect to $\mathcal{P}$, $\mathcal{S}$, and $\mathcal{D}$ are
$$D_{CT}([m_{P_i}, M_{P_i}];\mathcal{P}) = \min\{HD(m_{P_i}; m_P),\ HD(M_{P_i}; M_P)\},$$
$$D_{CT}([m_{S_i}, M_{S_i}];\mathcal{S}) = \min\{HD(m_{S_i}; m_S),\ HD(M_{S_i}; M_S)\}, \text{ and}$$
$$D_{CT}([m_{D_i}, M_{D_i}];\mathcal{D}) = \min\{HD(m_{D_i}; m_D),\ HD(M_{D_i}; M_D)\}.$$
The element with the maximum depth with respect to $\mathcal{P}$ is the 48-th element, which has a depth value of 0.08474 with respect to $\mathcal{T}$. The elements with maximum depths with respect to $\mathcal{S}$ and $\mathcal{D}$ are the 28-th and 19-th elements, respectively, which have minimum depth values with respect to $\mathcal{T}$. Thus, it is clear that we must consider all three variables simultaneously.
The calculation of the Tukey depth breaks the dataset into an outer layer of 36 patients with depth $1/59$, which envelops an inner core of 23 patients with higher depth. The depth value $1/59$ means that, taking the support function in a certain direction in $\mathbb{R}^3$, the point is separated from the remainder of the data. Since each direction represents a linear combination of all three variables, there is some combination of weights for the variables which distinguishes that patient from all others. That suggests that many different patterns of behavior between the three variables are within the ordinary.
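The split into an outer layer and an inner core can be read off directly from Table 2. The short sketch below transcribes the depth values of Table 2 and recovers the two groups and the deepest element; the variable names are hypothetical and nothing beyond the tabulated values is used.

```python
from collections import Counter

# Depth values transcribed from Table 2; position k holds the depth of patient k + 1.
table2 = [
    0.15254, 0.01694, 0.01694, 0.01694, 0.03389, 0.01694, 0.08474, 0.03389, 0.01694, 0.01694,  # IDs 1-10
    0.05084, 0.01694, 0.01694, 0.13559, 0.01694, 0.03389, 0.01694, 0.03389, 0.01694, 0.01694,  # IDs 11-20
    0.03389, 0.03389, 0.01694, 0.01694, 0.01694, 0.10169, 0.01694, 0.01694, 0.01694, 0.01694,  # IDs 21-30
    0.01694, 0.06779, 0.05084, 0.01694, 0.01694, 0.03389, 0.03389, 0.01694, 0.01694, 0.01694,  # IDs 31-40
    0.01694, 0.01694, 0.03389, 0.01694, 0.08474, 0.10169, 0.10169, 0.08474, 0.01694, 0.01694,  # IDs 41-50
    0.01694, 0.10169, 0.01694, 0.03389, 0.01694, 0.01694, 0.03389, 0.01694, 0.01694,           # IDs 51-59
]

depth_by_id = {i + 1: d for i, d in enumerate(table2)}
min_depth = min(table2)

outer_layer = sorted(i for i, d in depth_by_id.items() if d == min_depth)
inner_core = sorted(i for i, d in depth_by_id.items() if d > min_depth)
deepest = max(depth_by_id, key=depth_by_id.get)

print("outer layer size:", len(outer_layer))
print("inner core size:", len(inner_core))
print("deepest patient:", deepest)
print("patients per depth value:", sorted(Counter(table2).items()))
```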

5.3. Aumann Mean

We first compute the Aumann mean, $\hat{\mu}_A$, for the complete dataset. The Aumann mean is a generalization of the real-valued mean that works for compact convex sets. We then compare it with the Aumann mean of the dataset after removing the cuboids with minimum depth, $\hat{\mu}_{tA}$. The Aumann mean of the complete dataset is
$$\hat{\mu}_A = \left[\frac{1}{59}\sum_{i=1}^{59} m_{P_i},\ \frac{1}{59}\sum_{i=1}^{59} M_{P_i}\right] \times \left[\frac{1}{59}\sum_{i=1}^{59} m_{S_i},\ \frac{1}{59}\sum_{i=1}^{59} M_{S_i}\right] \times \left[\frac{1}{59}\sum_{i=1}^{59} m_{D_i},\ \frac{1}{59}\sum_{i=1}^{59} M_{D_i}\right] = [53.97, 95.07] \times [111.83, 181.58] \times [58.64, 108.25].$$
When we consider the inner core of the dataset by removing the set of cuboids with minimal depth (set in Equation (18)), the Aumann mean becomes
$$\hat{\mu}_{tA} = [54.34783, 91.47826] \times [112.2174, 178.4348] \times [59.82609, 107.4783].$$
This is conceptually similar to a trimmed mean (but trims more than half of the sample). The mean values are very similar, meaning that data in the outer layer have a similar average behavior to those in the inner core, and their outlier nature exerts little influence. In that situation, one would expect the deepest point to be close to those means, and indeed the maximal depth in the sample is reached at $C_1 = [58, 90] \times [118, 173] \times [63, 102]$, which is also very similar, although its intervals are a bit narrower.
We have that both means, $\hat{\mu}_A$ and $\hat{\mu}_{tA}$, have similar values in every variable. This can be explained by the fact that some linear combination between the elements with minimal depth exists that distinguishes them from the rest of the dataset, but this does not affect the mean. Note that the set with maximal depth, $C_1 = [58, 90] \times [118, 173] \times [63, 102]$, is also very similar to the above means.
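For cuboids, the Aumann mean reduces to averaging the lower and the upper endpoints of each variable, and the trimmed version averages only the cuboids above the minimum depth. The sketch below illustrates both on a toy input made of the first two cuboids of Table 1 with their depths from Table 2; it does not reproduce the reported means, which require the full dataset, and the function names are hypothetical.

```python
def aumann_mean(cuboids):
    """Endpoint-wise average of a list of cuboids given as (min, max) pairs."""
    n = len(cuboids)
    dim = len(cuboids[0])
    return tuple(
        (sum(c[j][0] for c in cuboids) / n, sum(c[j][1] for c in cuboids) / n)
        for j in range(dim)
    )

def trimmed_aumann_mean(cuboids, depths):
    """Aumann mean after removing the cuboids of minimum depth."""
    threshold = min(depths)
    kept = [c for c, d in zip(cuboids, depths) if d > threshold]
    return aumann_mean(kept)

# Patients 1 and 2 from Table 1, with their depth values from Table 2.
toy_cuboids = [
    ((58, 90), (118, 173), (63, 102)),
    ((47, 68), (104, 161), (71, 118)),
]
toy_depths = [0.15254, 0.01694]

full_mean = aumann_mean(toy_cuboids)
trimmed_mean = trimmed_aumann_mean(toy_cuboids, toy_depths)
print(full_mean)
print(trimmed_mean)
```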

6. Discussion

Considering the properties studied in the literature for depth functions, we propose nine different properties for depth functions with respect to compact convex random sets. They are:
  • P1. Affine invariance,
  • P2. Maximality at the center of symmetry,
  • P3a. Monotonicity with respect to the center in an algebraic way,
  • P3b. Monotonicity with respect to the center in relation to the associated distance (in a geometric way),
  • P4a. Vanishing at infinity in an algebraic way,
  • P4b. Vanishing at infinity in a geometric way,
  • P5. Upper semi-continuity,
  • P6. Consistency, and
  • P7. Convexity of the contours.
It is clear that all of them are desirable properties for a depth function of compact convex sets. However, not all of them have to be part of an axiomatic definition. For instance, it seems appropriate to have either P3a. and P4a. or P3b. and P4b. At the same time, P7., although important, does not belong to any of the existing axiomatic definitions, and P5. and a general case of P6. only belong to the functional (metric) axiomatic definition of statistical depth.
Taking all of this into account, we propose to consider:
  • the algebraic depth of compact convex sets, when properties P1., P2., P3a., and P4a. are satisfied;
  • the restricted algebraic depth of compact convex sets, when properties P1., P2., P3a., P4a., P5., P6., and P7. are satisfied;
  • the geometric depth of compact convex sets, when properties P1., P2., P3b., and P4b. are satisfied; and
  • the restricted geometric depth of compact convex sets, when properties P1., P2., P3b., P4b., P5., P6., and P7. are satisfied.
Note that the algebraic depth can be considered to be an adaptation of the notions of multivariate depth and of semi-linear fuzzy depth. Meanwhile, the geometric depth can be seen as a conversion of the geometric fuzzy depth and the restricted geometric depth as a modification of the functional (metric) depth.
We have studied the satisfaction of the above properties for the Tukey depth of compact convex sets, which is an adaptation to this setting of the multivariate Tukey depth and a simplification of the Tukey depth for fuzzy sets. It happens that this depth function satisfies all of these properties except P3b., for which we have provided a counterexample. Thus, the Tukey depth of compact convex sets is a restricted algebraic depth and, in particular, an algebraic depth. However, it is not a geometric depth, and, consequently, neither is it a restricted geometric depth.
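The four proposed notions are defined purely by which properties hold, so the classification above can be written as a small lookup. This is only a bookkeeping sketch with hypothetical names; it encodes the lists of properties stated in this section and the fact that the Tukey depth satisfies every property except P3b.

```python
# Properties required by each proposed notion, as listed in this section.
NOTIONS = {
    "algebraic depth": {"P1", "P2", "P3a", "P4a"},
    "restricted algebraic depth": {"P1", "P2", "P3a", "P4a", "P5", "P6", "P7"},
    "geometric depth": {"P1", "P2", "P3b", "P4b"},
    "restricted geometric depth": {"P1", "P2", "P3b", "P4b", "P5", "P6", "P7"},
}

def notions_satisfied(properties):
    """Names of the notions whose required properties are all satisfied."""
    return [name for name, required in NOTIONS.items() if required <= properties]

# Tukey depth of compact convex sets: every property except P3b.
tukey_properties = {"P1", "P2", "P3a", "P4a", "P4b", "P5", "P6", "P7"}
tukey_notions = notions_satisfied(tukey_properties)
print(tukey_notions)
```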
Cascos et al. [24] proposed a notion of depth for random closed sets. They require properties P1, P5 (for the Fell topology instead of the Hausdorff metric), and the property that a degenerate random set should assign depth 1 to its only value and 0 to any other random set. Admitting unbounded sets as values leads to some defining properties of depth being hard to adapt; a situation they solve by opting for a minimal list of properties. It is worth mentioning that, in the case of compact convex values, convergence in the Fell topology and in the Hausdorff metric are equivalent ([41], Corollary 3A). Hence, both upper semi-continuity requirements are equivalent for the Tukey depth, and Proposition 7 provides a proof of upper semi-continuity with respect to the Fell topology. Such a proof is missing in [24] on the grounds of it being ‘easy’ (a direct proof without invoking extra facts does not seem to be that easy).

Author Contributions

Writing—original draft preparation, L.G.-D.L.F., A.N.-R., and P.T.; supervision, A.N.-R.; funding acquisition, L.G.-D.L.F. and A.N.-R. All authors have read and agreed to the published version of the manuscript.

Funding

For L.G.-D.L.F. and A.N.-R., this research was supported by grant MTM2017-86061-C2-2-P funded by MCIN/AEI/10.13039/501100011033 and “ERDF A way of making Europe”. P.T. was supported by the Ministerio de Economía y Competitividad grant MTM2015-63971-P, the Ministerio de Ciencia, Innovación y Universidades grant PID2019-104486GB-I00, and the Consejería de Empleo, Industria y Turismo del Principado de Asturias grant GRUPIN-IDI2018-000132.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The studied dataset is available at http://bellman.ciencias.uniovi.es/SMIRE/Hospital.html.

Acknowledgments

We are grateful to the SMIRE–CODIRE group for making their cardiology dataset publicly available on their website.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gil, M.Á.; Lubiano, M.A.; Montenegro, M.; López, M.T. Least squares fitting of an affine function and strength of association for interval-valued data. Metrika 2002, 56, 97–111.
  2. de Lima Neto, E.A.; de Carvalho, F.A.T. Nonlinear regression applied to interval-valued data. Patt. Anal. Appl. 2017, 20, 809–824.
  3. Molchanov, I. Theory of Random Sets, 2nd ed.; Springer: London, UK, 2017.
  4. Artstein, Z.; Vitale, R.A. A strong law of large numbers for random compact convex sets. Ann. Probab. 1975, 3, 879–882.
  5. González-Rodríguez, G.; Blanco, A.; Corral, N.; Colubi, A. Least squares estimation of linear regression models for convex compact random sets. Adv. Data Anal. Classif. 2007, 1, 67–81.
  6. Sinova, B.; Casals, M.R.; Colubi, A.; Gil, M.Á. The median of a random interval. In Combining Soft Computing and Statistical Methods in Data Analysis; Springer: Berlin/Heidelberg, Germany, 2010; pp. 575–583.
  7. Richey, J.; Sarkar, A. Intersections of random sets. J. Appl. Probab. 2022, 59, 131–151.
  8. Shi, P.; Lu, L.; Fan, X.; Xin, Y.; Ni, J. A novel underwater sonar image enhancement algorithm based on approximation spaces of random sets. Multimed. Tools. Appl. 2022, 81, 4569–4584.
  9. Jörnsten, R. Clustering and classification based on the L1 data depth. J. Multivar. Anal. 2004, 90, 67–89.
  10. Nieto-Reyes, A.; Battey, H.; Francisci, G. Functional Symmetry and Statistical Depth for the Analysis of Movement Patterns in Alzheimer’s Patients. Mathematics 2021, 9, 820.
  11. Nieto-Reyes, A.; Duque, R.; Francisci, G. A Method to Automate the Prediction of Student Academic Performance from Early Stages of the Course. Mathematics 2021, 9, 2677.
  12. Liu, R.Y. On a notion of data depth based on random simplices. Ann. Stat. 1990, 18, 405–414.
  13. Zuo, Y.; Serfling, R. General notions of statistical depth function. Ann. Stat. 2000, 28, 461–482.
  14. Nieto-Reyes, A.; Battey, H. A topologically valid construction of depth for functional data. J. Multivar. Anal. 2021, 184, 104738.
  15. González-de la Fuente, L.; Nieto-Reyes, A.; Terán, P. Statistical depth for fuzzy sets. Fuzzy Sets Syst. 2022, 443 Pt A, 58–86.
  16. Nieto-Reyes, A.; Battey, H. A topologically valid definition of depth for functional data. Stat. Sci. 2016, 31, 61–79.
  17. González-de la Fuente, L.; Nieto-Reyes, A.; Terán, P. Two notions of depth in the fuzzy setting. In Building Bridges between Soft and Statistical Methodologies for Data Science; García-Escudero, L., Gordaliza, A., Mayo, A., Gomez, M.A.L., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer Cham: Berlin/Heidelberg, Germany, 2023; to appear.
  18. Tukey, J.W. Mathematics and Picturing Data. In Proceedings of the International Congress of Mathematicians, Vancouver, BC, Canada, 21–29 August 1974; Canadian Mathematical Congress: Montreal, QC, Canada, 1975; pp. 523–531.
  19. Serfling, R. A depth function and a scale curve based on spatial quantiles. In Statistical Data Analysis Based on L1-norm and Related Methods; Dodge, Y., Ed.; Birkhäuser: Basel, Switzerland, 2002; pp. 25–38.
  20. Cuesta-Albertos, J.A.; Nieto-Reyes, A. The random Tukey depth. Comput. Stat. Data Anal. 2008, 52, 4979–4988.
  21. Chakraborty, A.; Chaudhuri, P. The spatial distribution in infinite dimensional spaces and related quantiles and depths. Ann. Stat. 2014, 42, 1203–1231.
  22. Cuesta-Albertos, J.A.; Nieto-Reyes, A. Functional classification and the random Tukey depth. Practical issues. In Combining Soft Computing and Statistical Methods in Data Analysis; Borgelt, C., González-Rodríguez, G., Trutschnig, W., Lubiano, M.A., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 77, pp. 123–130.
  23. González-de la Fuente, L.; Nieto-Reyes, A.; Terán, P. Tukey depth for fuzzy sets. In Building Bridges between Soft and Statistical Methodologies for Data Science; García-Escudero, L., Gordaliza, A., Mayo, A., Gomez, M.A.L., Gil, M.A., Grzegorzewski, P., Hryniewicz, O., Eds.; Springer Cham: Berlin/Heidelberg, Germany, 2023; to appear.
  24. Cascos, I.; Li, Q.; Molchanov, I. Depth and outliers for samples of sets and random sets distributions. Aust. N. Z. Stat. 2021, 63, 55–82.
  25. Matheron, G. Random Sets and Integral Geometry; Wiley: New York, NY, USA, 1975.
  26. Himmelberg, C. Measurable relations. Fund. Math. 1974, 87, 53–72.
  27. Bonnesen, T.; Fenchel, W. Theorie der konvexen Körper; Chelsea: New York, NY, USA, 1948.
  28. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 1. Inform. Sci. 1975, 8, 199–249.
  29. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 2. Inform. Sci. 1975, 8, 301–353.
  30. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning, Part 3. Inform. Sci. 1975, 9, 43–80.
  31. Gruber, P.M.; Lettl, G. Isometries of the Space of Convex Bodies in Euclidean Space. Bull. Lond. Math. Soc. 1980, 12, 455–462.
  32. Vitale, R.A. Lp metrics for compact, convex sets. J. Approx. Theory 1985, 45, 280–287.
  33. Massart, P. The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab. 1990, 18, 1269–1283.
  34. Giné, E.; Nickl, R. Mathematical Foundations of Infinite-Dimensional Statistical Models; Cambridge University Press: Cambridge, UK, 2016.
  35. Donoho, D.L.; Gasko, M. Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Stat. 1992, 20, 1803–1827.
  36. Serfling, R. Depth Functions in Nonparametric Multivariate Inference. In Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications; DIMACS Series in Discrete Mathematics and Theoretical Computer Science; AMS: New Brunswick, NJ, USA, 2006.
  37. Kallithraka, S.; Arvanitoyannis, I.; Kefalas, P.; El-Zajouli, A.; Soufleros, E.; Psarra, E. Instrumental and sensory analysis of Greek wines; implementation of principal component analysis (PCA) for classification according to geographical origin. Food Chem. 2001, 73, 501–514.
  38. da Silva, J.A.A.; Cordeiro, G.M.; Ferreira, R.L.C. Modeling the growth of eucalyptus clones using the chapman-richards model with different symmetrical error distributions. Ciência Florest. 2012, 22, 777–785.
  39. Dias, S.; Brito, P. Off the beaten track: A new linear model for interval data. Eur. J. Oper. Res. 2017, 258, 1118–1130.
  40. Lubiano, M.A. Medidas de Variación de Elementos Aleatorios Imprecisos. Ph.D. Thesis, University of Oviedo, Oviedo, Spain, 1999.
  41. Salinetti, G.; Wets, R.J.B. On the convergence of sequences of convex sets in finite dimensions. SIAM Rev. 1979, 21, 18–33.
Figure 1. Representation of the cardiology three-dimensional cuboid dataset. The x-axes represent, for each patient, the range of the blood pulse over a same day, the y-axes the range of the systolic blood pressure over the same day, and the z-axes the range of the diastolic blood pressure over the same day. There are a total of 59 patients, with one cuboid per patient.
Figure 2. Representation of the sets with maximum and minimum depths. The left panel represents the five sets of maximum depth with the deepest one, C 1 , in red. The right panel represents the sets with minimum depth, in (18), and again the set C 1 in red.
Table 1. Cardiology three-dimensional cuboid dataset for some patients. Columns 2 and 6, named Pulse, contain the range of blood pulse over a day for each patient, labelled by an identification number (ID) in columns 1 and 5. Columns 3 and 7, named Systolic, provide the range of systolic blood pressure over the same day per patient. Columns 4 and 8, named Diastolic, display the range of diastolic blood pressure over the same day per patient.
| ID | Pulse | Systolic | Diastolic | ID | Pulse | Systolic | Diastolic |
|---|---|---|---|---|---|---|---|
| 1 | [58, 90] | [118, 173] | [63, 102] | 31 | [52, 78] | [119, 212] | [47, 93] |
| 2 | [47, 68] | [104, 161] | [71, 118] | 32 | [55, 84] | [122, 178] | [73, 105] |
| 28 | [71, 121] | [113, 176] | [57, 95] | 58 | [56, 97] | [92, 173] | [45, 107] |
| 29 | [68, 91] | [114, 186] | [46, 103] | 59 | [37, 86] | [83, 140] | [45, 91] |
| 30 | [62, 100] | [145, 210] | [100, 136] | | | | |
Table 2. Tukey depth value of each element in the cardiology three-dimensional cuboid dataset.
| ID | Depth Value | ID | Depth Value |
|---|---|---|---|
| 1 | 0.15254 | 31 | 0.01694 |
| 2 | 0.01694 | 32 | 0.06779 |
| 3 | 0.01694 | 33 | 0.05084 |
| 4 | 0.01694 | 34 | 0.01694 |
| 5 | 0.03389 | 35 | 0.01694 |
| 6 | 0.01694 | 36 | 0.03389 |
| 7 | 0.08474 | 37 | 0.03389 |
| 8 | 0.03389 | 38 | 0.01694 |
| 9 | 0.01694 | 39 | 0.01694 |
| 10 | 0.01694 | 40 | 0.01694 |
| 11 | 0.05084 | 41 | 0.01694 |
| 12 | 0.01694 | 42 | 0.01694 |
| 13 | 0.01694 | 43 | 0.03389 |
| 14 | 0.13559 | 44 | 0.01694 |
| 15 | 0.01694 | 45 | 0.08474 |
| 16 | 0.03389 | 46 | 0.10169 |
| 17 | 0.01694 | 47 | 0.10169 |
| 18 | 0.03389 | 48 | 0.08474 |
| 19 | 0.01694 | 49 | 0.01694 |
| 20 | 0.01694 | 50 | 0.01694 |
| 21 | 0.03389 | 51 | 0.01694 |
| 22 | 0.03389 | 52 | 0.10169 |
| 23 | 0.01694 | 53 | 0.01694 |
| 24 | 0.01694 | 54 | 0.03389 |
| 25 | 0.01694 | 55 | 0.01694 |
| 26 | 0.10169 | 56 | 0.01694 |
| 27 | 0.01694 | 57 | 0.03389 |
| 28 | 0.01694 | 58 | 0.01694 |
| 29 | 0.01694 | 59 | 0.01694 |
| 30 | 0.01694 | | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

González-De La Fuente, L.; Nieto-Reyes, A.; Terán, P. Properties of Statistical Depth with Respect to Compact Convex Random Sets: The Tukey Depth. Mathematics 2022, 10, 2758. https://doi.org/10.3390/math10152758
