A Comparison between Wasserstein Distance and a Distance Induced by Fisher-Rao Metric in Complex Shapes Clustering

Sanctis, Angela De; Gattone, Stefano A.

doi:10.3390/ecea-4-05016

Open Access Proceeding Paper

by Angela De Sanctis ^{1,*,‡} and Stefano A. Gattone ^{2}

^1 Department of Management and Business Administration, University “G. d’Annunzio” of Chieti-Pescara, 66100 Chieti, Italy

^2 Department of Philosophical, Pedagogical and Quantitative Economic Sciences, University “G. d’Annunzio” of Chieti-Pescara, 66100 Chieti, Italy

^* Author to whom correspondence should be addressed.

^‡ These authors contributed equally to this work.

Presented at the 4th International Electronic Conference on Entropy and Its Applications, 21 November–1 December 2017; available online: http://sciforum.net/conference/ecea-4

Proceedings 2018, 2(4), 163; https://doi.org/10.3390/ecea-4-05016

Published: 20 November 2017

(This article belongs to the Proceedings of The 4th International Electronic Conference on Entropy and Its Applications)


Abstract

Shape Analysis studies geometrical objects, such as a flat fish in the plane or a human head in space. Its applications range from structural biology, computer vision and medical imaging to archaeology. We focus on selecting an appropriate measure of distance among observations, with the aim of obtaining an unsupervised classification of shapes. Data from a shape are often realized as a set of representative points, called landmarks. For planar shapes, we assume that each landmark is modeled via a bivariate Gaussian, where the means capture uncertainties that arise in landmark placement and the variances capture the natural variability across the population of shapes. First, we consider the Fisher-Rao metric as a Riemannian metric on the Statistical Manifold of the Gaussian distributions. The induced geodesic distance is related to the minimization of information in the Fisher sense, and we can use it to discriminate shapes. Another suitable distance is the Wasserstein distance, which is also induced by a Riemannian metric and is related to the minimal transportation cost. In this work, a simulation study is conducted to compare the Wasserstein and Fisher-Rao metrics when used in shape clustering.

Keywords: information geometry; Fisher-Rao metric; Wasserstein metric; landmarks; Shape Analysis; Cluster Analysis

1. Introduction

Shape clustering is of interest in various fields such as geometric morphometrics, computer vision and medical imaging. In the clustering of shapes it is crucial to find an appropriate measure of distance among observations. In particular, we are interested in classifying shapes that arise from complex systems as an expression of self-organization phenomena. We consider objects whose shapes are based on landmarks [1,2,3]. These objects can be obtained from medical imaging procedures, from curves defined by manually or automatically assigned feature points, or from a discrete sampling of the object contours.

Since the shape space is invariant under similarity transformations, that is, translations, rotations and scaling, a Euclidean distance function on such a space is not really meaningful. In Shape Analysis [4], in order to apply standard clustering algorithms to planar shapes, the Euclidean metric has to be replaced by the metric of the shape space. Examples are provided in [5,6], where the Procrustes distance was integrated into standard clustering algorithms such as k-means. Similarly, Lele et al. [7] applied standard hierarchical or k-means clustering using dissimilarity measures based on the inter-landmark distances. In a model-based clustering framework, [8,9] developed a mixture model of offset-normal shape distributions.

Considering for simplicity a planar shape from a population, we assume that each landmark is modeled via a bivariate Gaussian, where the means are the landmark geometric coordinates and capture uncertainties that arise in landmark placement, while the variances derive from the natural variability across the population of shapes. Following Information Geometry, we regard the space of bivariate Gaussian densities as a Statistical Manifold [10,11] with local coordinates defined by the model parameters. Next, we define distances between objects related to different Riemannian metrics. These distances are induced by the geodesics of the two metrics (geodesic distances). Applications of geodesics to shape clustering techniques are provided, in a landmark-free context, by [12,13].

First, we consider the Fisher-Rao metric as a Riemannian metric on the Statistical Manifold of the Gaussian densities. The induced geodesic distance is related to the minimization of information in the Fisher sense and we can use it to define a shape distance.

Another suitable distance is the Wasserstein distance, which is also induced by a Riemannian metric and is related to the minimal transportation cost.

The geodesic distances induced by the Wasserstein and Fisher-Rao metrics can be used to discriminate shapes. The discriminative power of these shape distances is evaluated in the setting of shape clustering on simulated data.

2. The Method

Suppose we are given a planar shape configuration, $C$, consisting of a fixed number $K$ of labeled landmarks $C = \{μ_{1}, μ_{2}, \dots, μ_{K}\}$ with generic element $μ_{k} = \{μ_{k1}, μ_{k2}\}$ for $k = 1, \dots, K$.

Following [14], the k-th landmark may be represented by a bivariate Gaussian density as follows:

$$f(x; μ_{k}, Σ_{k}) = (2π)^{-1} |Σ_{k}|^{-\frac{1}{2}} \exp\left\{-\frac{1}{2} (x - μ_{k})^{'} Σ_{k}^{-1} (x - μ_{k})\right\} \qquad (1)$$

with $x$ being a generic 2-dimensional vector and $Σ_{k}$ given by

$$Σ_{k} = σ_{k}^{2} I_{2} = diag(σ_{k1}^{2}, σ_{k2}^{2}) \qquad (2)$$

where $\{σ_{k1}^{2}, σ_{k2}^{2}\}$ is the vector of the variances in the horizontal and vertical directions of the k-th landmark coordinates, for $k = 1, \dots, K$. We remark that the means capture uncertainties that arise in landmark placement and the variances capture the natural variability across a population of shapes.

From Equation (1) we can assign to the k-th landmark a new set of coordinates given by $θ_{k} = (μ_{k}, σ_{k})$ on the 4-dimensional manifold which is the product of two upper half planes. So two planar shapes $S$ and $S^{'}$ can be parametrized as follows: $S = (θ_{1}, \dots, θ_{K})$ and $S^{'} = (θ_{1}^{'}, \dots, θ_{K}^{'})$. For every $k$, let $γ_{k}(t)$ with $t \in [0, 1]$ be a path on the manifold such that $γ_{k}(0) = θ_{k}$ and $γ_{k}(1) = θ_{k}^{'}$. From differential geometry we know that a given Riemannian metric $g$ induces an inner product $\langle \cdot, \cdot \rangle_{g}$ on the tangent space of the manifold such that the length of $γ_{k}(t)$ is defined as follows

$$l(γ_{k}) = \int_{0}^{1} ∥\dot{γ}_{k}(t)∥_{g}^{2} \, dt \qquad (3)$$

The distance between the k-th landmarks of the two shapes is given by the minimum length of the trajectories $γ_{k}(t)$ (geodesic distance)

$$d_{g}(θ_{k}, θ_{k}^{'}) = \inf_{γ_{k}} \left\{\sqrt{l(γ_{k})} : γ_{k}(0) = θ_{k}, \; γ_{k}(1) = θ_{k}^{'}\right\}. \qquad (4)$$

Finally, the sum of the distances between each pair of landmarks is used to define a distance between two shapes $S$ and $S^{'}$ [15].
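A brief remark may help connect Equations (3) and (4). The functional in Equation (3) integrates the squared speed, so it is what is usually called the energy of the path. The sketch below, which relies only on the Cauchy-Schwarz inequality, indicates why taking the square root and then the infimum still gives the geodesic distance. The symbols $L(γ_{k})$ for the arc length and $D(S, S^{'})$ for the shape distance are notation introduced here for illustration and are not taken from the source.

```latex
L(\gamma_k) = \int_0^1 \|\dot{\gamma}_k(t)\|_g \, dt
\;\le\; \left( \int_0^1 \|\dot{\gamma}_k(t)\|_g^2 \, dt \right)^{1/2} = \sqrt{l(\gamma_k)},
\qquad \text{with equality iff } \|\dot{\gamma}_k(t)\|_g \text{ is constant},

d_g(\theta_k, \theta_k') = \inf_{\gamma_k} L(\gamma_k) = \inf_{\gamma_k} \sqrt{l(\gamma_k)},
\qquad
D(S, S') = \sum_{k=1}^{K} d_g(\theta_k, \theta_k') .
```

Since every path can be reparametrized to constant speed without changing its arc length, the two infima coincide.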

In the statistical manifold of bivariate Gaussian densities, we will consider two different Riemannian metrics which in turn induce two types of geodesic distances.

One is the Fisher-Rao metric $g_{f}$. For this metric, the closed form of the geodesic distance between two densities with diagonal covariance matrices is available and given by [16]:

$$d_{g_{f}}\big((μ_{11}, σ_{11}, μ_{12}, σ_{12}), (μ_{21}, σ_{21}, μ_{22}, σ_{22})\big) = \sqrt{2 \sum_{i=1}^{2} \left( \ln \frac{\left| \left(\frac{μ_{1i}}{\sqrt{2}}, σ_{1i}\right) - \left(\frac{μ_{2i}}{\sqrt{2}}, -σ_{2i}\right) \right| + \left| \left(\frac{μ_{1i}}{\sqrt{2}}, σ_{1i}\right) - \left(\frac{μ_{2i}}{\sqrt{2}}, σ_{2i}\right) \right|}{\left| \left(\frac{μ_{1i}}{\sqrt{2}}, σ_{1i}\right) - \left(\frac{μ_{2i}}{\sqrt{2}}, -σ_{2i}\right) \right| - \left| \left(\frac{μ_{1i}}{\sqrt{2}}, σ_{1i}\right) - \left(\frac{μ_{2i}}{\sqrt{2}}, σ_{2i}\right) \right|} \right)^{2}}. \qquad (5)$$

For Gaussian densities with $Σ$ being any symmetric positive definite covariance matrix, a closed form for the associated distance is not available.
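Equation (5) can be evaluated directly. The following Python sketch is one possible transcription, assuming each landmark is passed as a tuple (μ_1, σ_1, μ_2, σ_2) in the order used in Equation (5), with σ denoting standard deviations. The function names are illustrative and do not come from the source. The shape-level function sums the landmark distances, as described earlier in this section.

```python
import math


def fisher_rao_univariate(mu1, s1, mu2, s2):
    """Fisher-Rao distance between N(mu1, s1^2) and N(mu2, s2^2), with s1, s2 > 0."""
    dx = (mu1 - mu2) / math.sqrt(2.0)
    a = math.hypot(dx, s1 + s2)  # |(mu1/sqrt2, s1) - (mu2/sqrt2, -s2)|
    b = math.hypot(dx, s1 - s2)  # |(mu1/sqrt2, s1) - (mu2/sqrt2,  s2)|
    return math.sqrt(2.0) * math.log((a + b) / (a - b))


def fisher_rao_landmark(theta1, theta2):
    """Equation (5) for one landmark; theta = (mu_1, sigma_1, mu_2, sigma_2)."""
    squared = 0.0
    for i in (0, 2):  # horizontal coordinate, then vertical coordinate
        d = fisher_rao_univariate(theta1[i], theta1[i + 1], theta2[i], theta2[i + 1])
        squared += d * d
    return math.sqrt(squared)


def fisher_rao_shape_distance(shape1, shape2):
    """Sum of landmark-wise distances between two shapes with matched landmarks."""
    return sum(fisher_rao_landmark(t1, t2) for t1, t2 in zip(shape1, shape2))
```

Because a² − b² = 4 σ_1 σ_2 is strictly positive for positive standard deviations, the ratio inside the logarithm is always well defined.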

The other Riemannian metric we consider is $g_{w}$, which induces the Wasserstein distance [17]. For Gaussian densities the explicit expression of the squared Wasserstein distance is the following:

$$d_{g_{w}}^{2}(θ, θ^{'}) = ∥μ - μ^{'}∥^{2} + tr(Σ) + tr(Σ^{'}) - 2 \, tr\left(\sqrt{Σ^{\frac{1}{2}} Σ^{'} Σ^{\frac{1}{2}}}\right) \qquad (6)$$

where $∥ \cdot ∥$ is the Euclidean norm and $Σ^{\frac{1}{2}}$ is defined for a symmetric positive definite matrix $Σ$ so that $Σ^{\frac{1}{2}} \cdot Σ^{\frac{1}{2}} = Σ$.
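For the bivariate case considered here, the Wasserstein distance between two Gaussian densities can be computed without a general matrix square root. The Python sketch below assumes the standard 2-Wasserstein form, in which the mean term is the squared Euclidean norm and the whole expression is the squared distance. It also uses two identities that hold for 2 × 2 positive semi-definite matrices: tr(Σ^{1/2} Σ' Σ^{1/2}) = tr(Σ Σ'), and tr(√M) = √(tr M + 2 √(det M)). The function names and the nested-tuple layout of the covariance matrices are illustrative choices, not taken from the source.

```python
import math


def trace2(a):
    """Trace of a 2x2 matrix given as nested sequences."""
    return a[0][0] + a[1][1]


def det2(a):
    """Determinant of a 2x2 matrix given as nested sequences."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def wasserstein_bivariate(mu1, cov1, mu2, cov2):
    """2-Wasserstein distance between N(mu1, cov1) and N(mu2, cov2) in the plane."""
    # tr(cov1 @ cov2) equals tr(cov1^{1/2} cov2 cov1^{1/2}) by cyclicity of the trace
    tr_prod = sum(cov1[i][j] * cov2[j][i] for i in range(2) for j in range(2))
    # det(cov1^{1/2} cov2 cov1^{1/2}) = det(cov1) * det(cov2)
    det_prod = max(det2(cov1) * det2(cov2), 0.0)
    # for a 2x2 PSD matrix M: tr(sqrt(M)) = sqrt(tr(M) + 2 * sqrt(det(M)))
    tr_root = math.sqrt(max(tr_prod + 2.0 * math.sqrt(det_prod), 0.0))
    mean_term = (mu1[0] - mu2[0]) ** 2 + (mu1[1] - mu2[1]) ** 2
    squared = mean_term + trace2(cov1) + trace2(cov2) - 2.0 * tr_root
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off
```

For diagonal covariance matrices the covariance part reduces to the sum of squared differences of the standard deviations, which gives a quick sanity check of the implementation.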

Otto [18] proved that, with respect to the Riemannian metric which induces the Wasserstein distance, the manifold of Gaussian densities has non-negative sectional curvature. The Wasserstein metric is therefore different from the Fisher-Rao metric: in the univariate case, for example, it is well known that the statistical manifold of Gaussian densities with the Fisher-Rao metric can be regarded as the upper half plane with the hyperbolic metric, which has negative curvature.
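The link with hyperbolic geometry in the univariate case can be made explicit. The following is a sketch based on the standard Fisher information of a univariate Gaussian N(μ, σ²); the rescaled coordinate $u$ and the symbol $\mathcal{K}$ for the Gaussian curvature are notation introduced here for illustration.

```latex
ds^2 = \frac{d\mu^2 + 2\, d\sigma^2}{\sigma^2}
     = 2\, \frac{du^2 + d\sigma^2}{\sigma^2},
\qquad u = \frac{\mu}{\sqrt{2}},
\qquad \mathcal{K} = -\frac{1}{2} .
```

In the coordinates $(u, σ)$ the Fisher-Rao metric is twice the Poincaré half-plane metric, whose curvature is −1, so its curvature is −1/2 and its distances are √2 times the hyperbolic ones. This accounts for the rescaled means μ/√2 and for the factor 2 under the square root in Equation (5).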

3. A Simulation Experiment

In this section we report the results of a simple simulation experiment. In order to test the discriminative power of the proposed shape distances, we first simulate shapes from two different mean shapes. The Fisher-Rao distance and the Wasserstein distance are evaluated between each pair of shapes and stored in two different pairwise distance matrices. Then we run a hierarchical clustering algorithm which takes as input the pairwise distance matrices computed with the two shape distances. The quality of the clusters identified with the two shape distances is measured by means of the Adjusted Rand index [19].
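The evaluation pipeline described above can be sketched in plain Python. The source does not state which linkage criterion was used, so average linkage is an assumption made here, and the naive cubic-time implementation is meant only to illustrate the idea on small distance matrices. The function names are illustrative. The Adjusted Rand index follows the usual pair-counting definition of [19].

```python
from collections import Counter
from math import comb


def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering on a full pairwise distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average distance over all cross-cluster pairs
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from the contingency table of two partitions."""
    n = len(labels_true)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

A pairwise matrix built with either shape distance can be passed to `average_linkage` with `n_clusters=2`, and the resulting labels compared with the simulated cluster memberships through `adjusted_rand_index`.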

The shapes are simulated from the following Gaussian perturbation model

$$X_{ih} = (μ_{h} + E_{i}) Γ_{i} + 1_{K} γ_{i}^{T} \qquad (7)$$

where

$E_{i}$ are zero mean $K \times 2$ random error matrices simulated from the multivariate Normal distribution with covariance structure $Σ_{E}$;
$μ_{h}$ is the mean shape for cluster $h$;
$Γ_{i}$ is an orthogonal rotation matrix with an angle $θ$ drawn uniformly from the range $[0, 2π]$;
$γ_{i}^{T}$ is a $1 \times 2$ translation vector drawn uniformly from the range $[-2, 2]$.

Two different covariance structures are considered:

Isotropic, with $Σ_{E} = σ I_{K} \otimes σ I_{2}$, that is, independent spherical variation around each mean landmark;
Heteroscedastic, with $Σ_{E} = diag [σ_{1}, σ_{2}, \dots, σ_{K}] \otimes σ I_{2}$, that is, a variation around each mean landmark that differs from landmark to landmark.

The number of clusters was set to $h = 2$ and the number of configurations to 30. The mean shapes in the two clusters were taken from the mean skull of 21 rats collected at ages of 7 and 14 days (the rat calvarial data set [1]). In the isotropic case, two values of $σ$ were used: $σ = 10$ (small error) and $σ = 13$ (high error). The heteroscedastic case was simulated by multiplying the value $σ = 10$ for 3 landmarks (out of $K = 8$ landmarks) by a factor of 1.69 (small error) and 3 (high error). Results from 500 random simulations are reported in Table 1.
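The perturbation model of Equation (7) can be sketched in a few lines of Python. Several points below are assumptions rather than details given in the source: the Kronecker structure is read as giving landmark k a per-coordinate variance σ_k σ (so that σ = 10 corresponds to a standard deviation of 10, and an inflated σ_k = 16.9 to a standard deviation of 13); the choice of which 3 landmarks are inflated is arbitrary; the mean shapes must be supplied by the user, since the rat calvarial means are not reproduced here; and the argument `n_per_cluster` is hypothetical, because the source does not say how the 30 configurations are split between clusters.

```python
import math
import random


def landmark_sds(sigma, k, inflated=(), factor=1.0):
    """Per-landmark standard deviations under diag[sigma_1..sigma_K] (x) sigma I_2.

    Landmarks whose index is in `inflated` use sigma_k = factor * sigma (assumption).
    """
    return [math.sqrt((sigma * factor if i in inflated else sigma) * sigma)
            for i in range(k)]


def simulate_shape(mean_shape, sds, rng):
    """Draw one configuration: perturb, rotate, then translate the mean shape."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(angle), math.sin(angle)
    tx, ty = rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)
    shape = []
    for (mx, my), sd in zip(mean_shape, sds):
        x = mx + rng.gauss(0.0, sd)
        y = my + rng.gauss(0.0, sd)
        # row vector (x, y) times the rotation matrix [[c, s], [-s, c]], plus translation
        shape.append((x * c - y * s + tx, x * s + y * c + ty))
    return shape


def simulate_dataset(mean_shapes, sds, n_per_cluster, seed=0):
    """Simulate n_per_cluster configurations around each mean shape."""
    rng = random.Random(seed)
    shapes, labels = [], []
    for h, mean_shape in enumerate(mean_shapes):
        for _ in range(n_per_cluster):
            shapes.append(simulate_shape(mean_shape, sds, rng))
            labels.append(h)
    return shapes, labels
```

Under this reading, `landmark_sds(10.0, 8)` gives the isotropic small-error setting and `landmark_sds(10.0, 8, inflated=(0, 1, 2), factor=1.69)` a heteroscedastic small-error setting.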

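The results in Table 1 show that the Wasserstein distance performs well, especially in the isotropic case, where it attains a higher Adjusted Rand index than the Fisher-Rao distance under both small error (0.9553 versus 0.8848) and high error (0.7457 versus 0.6265). It also performs better in the heteroscedastic-small error case (0.9606 versus 0.8854). When the heteroscedasticity around the landmarks becomes more pronounced, the Fisher-Rao distance performs better (0.8367 versus 0.6616 in the heteroscedastic-high error case).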

References

1. Bookstein, F.L. Morphometric Tools for Landmark Data: Geometry and Biology; Cambridge University Press: Cambridge, UK, 1991.
2. Cootes, T.; Taylor, C.; Cooper, D.; Graham, J. Active shape models-their training and application. Comput. Vis. Image Underst. 1995, 61, 38–59.
3. Kendall, D.G. Shape manifolds, Procrustean metrics and complex projective spaces. Bull. Lond. Math. Soc. 1984, 16, 81–121.
4. Dryden, I.L.; Mardia, K.V. Statistical Shape Analysis; John Wiley & Sons: London, UK, 1998.
5. Amaral, G.; Dore, L.; Lessa, R.; Stosic, B. k-Means Algorithm in Statistical Shape Analysis. Commun. Stat. Simul. Comput. 2010, 39, 1016–1026.
6. Stoyan, D.; Stoyan, H. A further application of D.G. Kendall’s Procrustes Analysis. Biom. J. 1990, 32, 293–301.
7. Lele, S.; Richtsmeier, J. An Invariant Approach to Statistical Analysis of Shapes; Chapman & Hall/CRC: New York, NY, USA, 2001.
8. Huang, C.; Styner, M.; Zhu, H. Clustering High-Dimensional Landmark-based Two-dimensional Shape Data. J. Am. Stat. Assoc. 2016, 19, 702–723.
9. Kume, A.; Welling, M. Maximum likelihood estimation for the offset-normal shape distributions using EM. J. Comput. Graph. Stat. 2010, 19, 702–723.
10. Amari, S.; Nagaoka, H. Methods of Information Geometry; Translations of Mathematical Monographs; AMS & Oxford University Press: Providence, RI, USA, 2000; Volume 191.
11. Murray, M.K.; Rice, J.W. Differential Geometry and Statistics; Chapman & Hall: Boca Raton, FL, USA, 1984.
12. Srivastava, A.; Joshi, S.H.; Mio, W.; Liu, X. Statistical Shape analysis: Clustering, learning, and testing. IEEE Trans. PAMI 2005, 27, 590–602.
13. Mio, W.; Srivastava, A.; Joshi, S.H. On Shape of Plane Elastic Curves. Int. J. Comput. Vis. 2007, 73, 307–324.
14. De Sanctis, A.; Gattone, S.A. Methods of Information Geometry to model complex shapes. Eur. Phys. J. Spec. Top. 2016, 225, 1271–1279.
15. Gattone, S.A.; De Sanctis, A. A shape distance based on the Fisher-Rao metric and its application for shapes clustering. Physica A 2017, 487, 93–102.
16. Costa, S.; Santos, S.; Strapasson, J. Fisher information distance: A geometrical reading. Discret. Appl. Math. 2015, 197, 59–69.
17. Takatsu, A. Wasserstein geometry of Gaussian measures. Osaka J. Math. 2011, 48, 1005–1026.
18. Otto, F. The geometry of dissipative evolution equations: the porous medium equation. Commun. Part. Differ. Equ. 2001, 26, 101–174.
19. Hubert, L.; Arabie, P. Comparing Partitions. J. Classif. 1985, 2, 193–218.

Table 1. Adjusted-Rand index.

Model	Fisher-Rao	Wasserstein
Isotropic-small error	0.8848	0.9553
Isotropic-high error	0.6265	0.7457
Heteroscedastic-small error	0.8854	0.9606
Heteroscedastic-high error	0.8367	0.6616

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
