A Comparison between Wasserstein Distance and a Distance Induced by Fisher-Rao Metric in Complex Shapes Clustering

Shape Analysis studies geometrical objects, such as a flat fish in the plane or a human head in space. Applications range from structural biology, computer vision and medical imaging to archaeology. We focus on the selection of an appropriate measure of distance among observations, with the aim of obtaining an unsupervised classification of shapes. Data from a shape are often realized as a set of representative points, called landmarks. For planar shapes, we assume that each landmark is modeled via a bivariate Gaussian, where the means capture uncertainties that arise in landmark placement and the variances capture the natural variability across the population of shapes. First, we consider the Fisher-Rao metric as a Riemannian metric on the Statistical Manifold of the Gaussian distributions. The induced geodesic distance is related to the minimization of information in the Fisher sense, and we can use it to discriminate shapes. Another suitable distance is the Wasserstein distance, which is induced by a Riemannian metric and is related to the minimal transportation cost. In this work, a simulation study is conducted in order to compare the Wasserstein and Fisher-Rao metrics when used in shape clustering.


Introduction
Shape clustering is of interest in various fields such as geometric morphometrics, computer vision and medical imaging. In the clustering of shapes it is crucial to find an appropriate measure of distance among observations. In particular, we are interested in classifying shapes that derive from complex systems as an expression of self-organization phenomena. We consider objects whose shapes are based on landmarks [1,2,3]. These objects can be obtained by medical imaging procedures, as curves defined by manually or automatically assigned feature points, or by a discrete sampling of the object contours.
Since the shape space is invariant under similarity transformations, that is translations, rotations and scaling, a Euclidean distance function on such a space is not really meaningful. In Shape Analysis [4], in order to apply standard clustering algorithms to planar shapes, the Euclidean metric has to be replaced by the metric of the shape space. Examples are provided in [5,6], where the Procrustes distance was integrated into standard clustering algorithms such as k-means.
Similarly, Lele et al. [7] applied standard hierarchical or k-means clustering using dissimilarity measures based on the inter-landmark distances. In a model-based clustering framework, the authors of [8,9] developed a mixture model of offset-normal shape distributions.
Considering, for simplicity, a planar shape from a population, we assume that each landmark is modeled via a bivariate Gaussian, where the means are the landmark geometric coordinates and capture uncertainties that arise in landmark placement, while the variances derive from the natural variability across the population of shapes. According to Information Geometry, we consider the space of bivariate Gaussian densities as a Statistical Manifold [10,11] with local coordinates defined by the model parameters. Next, we define distances between objects related to different Riemannian metrics. These distances are induced by the geodesics of the two metrics (geodesic distances). Applications of geodesics to shape clustering techniques are provided, in a landmark-free context, by [12,13].
First, we consider the Fisher-Rao metric as a Riemannian metric on the Statistical Manifold of the Gaussian densities. The induced geodesic distance is related to the minimization of information in the Fisher sense, and we can use it to define a shape distance.
Another suitable distance is the Wasserstein distance, which is induced by a Riemannian metric and is related to the minimal transportation cost.
The geodesic distances induced by the Wasserstein and Fisher-Rao metrics can be used to discriminate shapes. The discriminative power of these shape distances will be evaluated in the setting of shape clustering on simulated data.
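Following [14], the k-th landmark may be represented by a bivariate Gaussian density with mean vector $\mu_k = (\mu_{k1}, \mu_{k2})$ as follows:

$$f(x \mid \mu_k, \Sigma_k) = \frac{1}{2\pi \sqrt{\det \Sigma_k}} \exp\left\{ -\frac{1}{2} (x - \mu_k)^{T} \Sigma_k^{-1} (x - \mu_k) \right\} \qquad (1)$$

with $x$ being a generic 2-dimensional vector and $\Sigma_k$ given by

$$\Sigma_k = \mathrm{diag}(\sigma_{k1}^2, \sigma_{k2}^2),$$

where $\{\sigma_{k1}^2, \sigma_{k2}^2\}$ is the vector of the variances in the horizontal and vertical directions of the k-th landmark coordinates, for $k = 1, \ldots, K$. We remark that the means capture uncertainties that arise in landmark placement and the variances capture the natural variability across a population of shapes.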
From Equation (1) we can assign to the k-th landmark a new set of coordinates given by $\theta_k = (\mu_k, \sigma_k)$, with $\mu_k = (\mu_{k1}, \mu_{k2})$ and $\sigma_k = (\sigma_{k1}, \sigma_{k2})$, on the 4-dimensional manifold which is the product of two upper half planes. So two planar shapes $S$ and $S'$ can be parametrized as follows: $S = (\theta_1, \ldots, \theta_K)$ and $S' = (\theta'_1, \ldots, \theta'_K)$.
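From differential geometry we know that a given Riemannian metric $g$ induces an inner product $\langle \cdot, \cdot \rangle_g$ on the tangent space of the manifold, such that the length of a curve $\gamma_k(t)$, $t \in [0, 1]$, joining the k-th landmarks $\theta_k$ and $\theta'_k$ of two shapes $S$ and $S'$, is defined as follows:

$$L(\gamma_k) = \int_0^1 \sqrt{\langle \dot{\gamma}_k(t), \dot{\gamma}_k(t) \rangle_g} \, dt.$$

The distance between the k-th landmarks of the two shapes is given by the minimum length of the trajectories $\gamma_k(t)$ (geodesic distance):

$$d(\theta_k, \theta'_k) = \inf_{\gamma_k} L(\gamma_k).$$

Finally, the sum of the distances between each pair of landmarks is used to define a distance between two shapes $S$ and $S'$ [15]:

$$D(S, S') = \sum_{k=1}^{K} d(\theta_k, \theta'_k).$$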
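The construction above (one geodesic distance per landmark, summed over landmarks) can be sketched in a few lines. This is only an illustration: the function names, the tuple layout `(mu1, mu2, sigma1, sigma2)` for each landmark, and the use of `math.dist` as a stand-in landmark distance are assumptions made here, not part of the original work. Any landmark-level geodesic distance can be passed in its place.

```python
import math

def shape_distance(shape_a, shape_b, landmark_distance):
    """Sum of landmark-wise distances between two shapes.

    Each shape is a list of K landmark parameter tuples; the tuple layout
    (mu1, mu2, sigma1, sigma2) is an assumption of this sketch.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes must have the same number of landmarks")
    return sum(landmark_distance(p, q) for p, q in zip(shape_a, shape_b))

def pairwise_distances(shapes, landmark_distance):
    """Symmetric matrix of shape distances, as a list of lists."""
    n = len(shapes)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = shape_distance(shapes[i], shapes[j], landmark_distance)
            dist[i][j] = dist[j][i] = d
    return dist

# Toy example with K = 2 landmarks; math.dist is only a placeholder
# for a geodesic distance between landmark parameters.
shape_s = [(0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)]
shape_t = [(3.0, 4.0, 1.0, 1.0), (1.0, 1.0, 2.0, 1.0)]
toy_distance = shape_distance(shape_s, shape_t, math.dist)
```

Passing the landmark distance as an argument keeps the summation step independent of the choice of Riemannian metric, which is the only part that changes between the two distances compared in this work.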
In the statistical manifold of bivariate Gaussian densities, we will consider two different Riemannian metrics which in turn induce two types of geodesic distances.
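One is the Fisher-Rao metric $g_f$. For this metric, the closed form of the geodesic distance between two densities with diagonal covariance matrices, with parameters $\theta_k$ and $\theta'_k$, is available and given by [16]:

$$d_f(\theta_k, \theta'_k) = \sqrt{ \sum_{j=1}^{2} \left( \sqrt{2} \, \ln \frac{A_j + B_j}{A_j - B_j} \right)^{2} },$$

$$A_j = \left\| \left( \tfrac{\mu_{kj}}{\sqrt{2}}, \sigma_{kj} \right) - \left( \tfrac{\mu'_{kj}}{\sqrt{2}}, -\sigma'_{kj} \right) \right\|, \qquad B_j = \left\| \left( \tfrac{\mu_{kj}}{\sqrt{2}}, \sigma_{kj} \right) - \left( \tfrac{\mu'_{kj}}{\sqrt{2}}, \sigma'_{kj} \right) \right\|.$$

For Gaussian densities with $\Sigma$ being any symmetric positive definite covariance matrix, a closed form for the associated distance is not available.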
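As a minimal sketch of the diagonal-covariance case, the code below uses the standard closed form of the Fisher-Rao distance between univariate Gaussians and combines the two coordinate directions as the square root of the sum of squared univariate distances. The function names and the tuple layout `(mu1, mu2, sigma1, sigma2)` are assumptions of this example, and standard deviations are assumed strictly positive.

```python
import math

def fisher_rao_univariate(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    dm = (mu1 - mu2) / math.sqrt(2.0)
    a = math.hypot(dm, sigma1 + sigma2)  # distance to the reflected point (mu2, -sigma2)
    b = math.hypot(dm, sigma1 - sigma2)  # distance to the point (mu2, sigma2)
    # a*a - b*b = 4*sigma1*sigma2 > 0, so the logarithm is well defined
    return math.sqrt(2.0) * math.log((a + b) / (a - b))

def fisher_rao_landmark(theta_a, theta_b):
    """Distance between bivariate Gaussians with diagonal covariance.

    theta = (mu1, mu2, sigma1, sigma2); the two directions are treated as
    independent univariate models and their distances are combined in quadrature.
    """
    total = 0.0
    for j in range(2):
        d = fisher_rao_univariate(theta_a[j], theta_a[2 + j],
                                  theta_b[j], theta_b[2 + j])
        total += d * d
    return math.sqrt(total)
```

When the two means coincide, the univariate distance reduces to $\sqrt{2}\,|\ln(\sigma_1/\sigma_2)|$, which gives a quick sanity check of the implementation.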
The other Riemannian metric we consider is $g_w$, which induces the Wasserstein distance [17]. For Gaussian densities with means $\mu_1$, $\mu_2$ and covariance matrices $\Sigma_1$, $\Sigma_2$, the explicit expression of the Wasserstein distance $d_w$ is the following:

$$d_w^2 = \| \mu_1 - \mu_2 \|^2 + \mathrm{tr}\left( \Sigma_1 + \Sigma_2 - 2 \left( \Sigma_1^{1/2} \Sigma_2 \Sigma_1^{1/2} \right)^{1/2} \right),$$

where $\| \cdot \|$ is the Euclidean norm and $\Sigma^{1/2}$ is defined for a symmetric positive definite matrix $\Sigma$ so that $\Sigma^{1/2} \Sigma^{1/2} = \Sigma$. Otto et al. [18] proved that, with respect to the Riemannian metric which induces the Wasserstein distance, the manifold of Gaussian densities has non-negative sectional curvature. We deduce that the Wasserstein metric is different from the Fisher-Rao metric. For example, in the univariate case it is well known that the statistical manifold of Gaussian densities with the Fisher-Rao metric can be regarded as the upper half plane with the hyperbolic metric, which has negative curvature.
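The Wasserstein distance between two bivariate Gaussians can be sketched without an explicit matrix square root. For 2 × 2 symmetric positive definite matrices, the trace of $(\Sigma_1^{1/2} \Sigma_2 \Sigma_1^{1/2})^{1/2}$ equals $\sqrt{\mathrm{tr}(\Sigma_1 \Sigma_2) + 2\sqrt{\det \Sigma_1 \det \Sigma_2}}$; this identity, the function names and the argument layout are additions made for this example. With diagonal covariances the distance reduces to the Euclidean distance between the vectors of means and standard deviations, which the second function computes directly.

```python
import math

def wasserstein_gaussian_2d(mu_a, cov_a, mu_b, cov_b):
    """Wasserstein distance between two bivariate Gaussians.

    cov_a and cov_b are symmetric positive definite 2x2 matrices
    given as nested sequences [[s11, s12], [s12, s22]].
    """
    (a11, a12), (_, a22) = cov_a
    (b11, b12), (_, b22) = cov_b
    mean_term = (mu_a[0] - mu_b[0]) ** 2 + (mu_a[1] - mu_b[1]) ** 2
    trace_ab = a11 * b11 + 2.0 * a12 * b12 + a22 * b22  # tr(cov_a @ cov_b)
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    cross = math.sqrt(trace_ab + 2.0 * math.sqrt(det_a * det_b))
    squared = mean_term + a11 + a22 + b11 + b22 - 2.0 * cross
    return math.sqrt(max(squared, 0.0))  # guard against rounding below zero

def wasserstein_diagonal(mu_a, sigma_a, mu_b, sigma_b):
    """Diagonal-covariance case, with sigma_* the standard deviations."""
    squared = sum((p - q) ** 2 for p, q in zip(mu_a, mu_b))
    squared += sum((s - t) ** 2 for s, t in zip(sigma_a, sigma_b))
    return math.sqrt(squared)

# Both routes agree when the covariance matrices are diagonal.
d_full = wasserstein_gaussian_2d((0.0, 0.0), [[1.0, 0.0], [0.0, 4.0]],
                                 (3.0, 4.0), [[4.0, 0.0], [0.0, 16.0]])
d_diag = wasserstein_diagonal((0.0, 0.0), (1.0, 2.0), (3.0, 4.0), (2.0, 4.0))
```

The diagonal shortcut makes the contrast with the Fisher-Rao case concrete: in the coordinates $(\mu, \sigma)$ the Wasserstein geometry of diagonal Gaussians is flat, whereas the Fisher-Rao geometry is hyperbolic.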

A Simulation Experiment
In this section we report the results of a simple simulation experiment. In order to test the discriminative power of the proposed shape distances, we first simulate shapes from two different mean shapes. The Fisher-Rao distance and the Wasserstein distance are evaluated between each pair of shapes and stored in two different pairwise distance matrices. Then we run a hierarchical clustering algorithm which takes as input the pairwise distance matrices computed with the two shape distances. The quality of the clusters identified with the two shape distances is measured by means of the Adjusted Rand index [19].
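The clustering and evaluation steps can be sketched as follows. The text does not state which linkage criterion was used, so average linkage is an assumption here, as are the function names and the toy distance matrix; the Adjusted Rand index follows its standard pair-counting definition.

```python
from math import comb

def average_linkage(dist, n_clusters):
    """Agglomerative clustering on a symmetric distance matrix (list of lists)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the two closest clusters
    labels = [0] * len(dist)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    cells, rows, cols = {}, {}, {}
    for t, p in zip(labels_true, labels_pred):
        cells[(t, p)] = cells.get((t, p), 0) + 1
        rows[t] = rows.get(t, 0) + 1
        cols[p] = cols.get(p, 0) + 1
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(len(labels_true), 2)
    max_index = (sum_rows + sum_cols) / 2.0
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Toy distance matrix: six points on a line forming two well separated groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
toy_labels = average_linkage(toy_dist, n_clusters=2)
toy_ari = adjusted_rand_index([0, 0, 0, 1, 1, 1], toy_labels)
```

In the experiment, `toy_dist` would be replaced by the pairwise matrix of Fisher-Rao or Wasserstein shape distances, and the true labels by the cluster from which each shape was simulated.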
The shapes are simulated from the following Gaussian perturbation model:

$$X_i = (\mu_h + E_i) \, \Gamma_i,$$

where
• $E_i$ are zero mean $K \times 2$ random error matrices simulated from the multivariate Normal distribution with covariance structure $\Sigma_E$
• $\mu_h$ is the mean shape for cluster $h$
• $\Gamma_i$ is an orthogonal rotation matrix with an angle $\theta$ drawn uniformly from the range $[0, 2\pi]$
The number of clusters was set to 2 and the number of configurations to 30. The mean shapes in the two clusters were taken from the mean skull of 21 rats collected at ages of 7 and 14 days (the rat calvarial data set, [1]). In the isotropic case, two values of $\sigma$ were used: $\sigma = 10$ (small error) and $\sigma = 13$ (high error). The heteroscedastic case was simulated by multiplying the value $\sigma = 10$ by a factor of 1.69 (small error) or 3 (high error) for 3 of the $K = 8$ landmarks.
Results from 500 random simulations are reported in Table 1. They show that the Wasserstein distance performs well, especially in the isotropic case. When the variability around the landmarks becomes heteroscedastic, the Fisher-Rao distance performs better (heteroscedastic, high error case).
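The perturbation model described above can be sketched with the standard library alone. Several choices here are assumptions of the example rather than details confirmed by the text: the errors are added before the rotation, errors are independent across landmarks and coordinates, `sigmas` holds one standard deviation per landmark, and the mean shapes passed in are hypothetical placeholders (the rat calvarial coordinates are not reproduced here).

```python
import math
import random

def simulate_shape(mean_shape, sigmas, rng):
    """Draw one configuration: perturb each landmark, then rotate the whole shape.

    mean_shape: list of K (x, y) landmark coordinates
    sigmas:     list of K standard deviations, one per landmark
    rng:        a random.Random instance
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(angle), math.sin(angle)
    shape = []
    for (mx, my), sd in zip(mean_shape, sigmas):
        x = mx + rng.gauss(0.0, sd)
        y = my + rng.gauss(0.0, sd)
        # row vector (x, y) times the rotation matrix [[c, s], [-s, c]]
        shape.append((x * c - y * s, x * s + y * c))
    return shape

def simulate_sample(mean_shapes, sigmas, n_per_cluster, seed=0):
    """Simulate n_per_cluster shapes from each mean shape, with cluster labels."""
    rng = random.Random(seed)
    shapes, labels = [], []
    for h, mean_shape in enumerate(mean_shapes):
        for _ in range(n_per_cluster):
            shapes.append(simulate_shape(mean_shape, sigmas, rng))
            labels.append(h)
    return shapes, labels

def heteroscedastic_sigmas(base_sigma, n_landmarks, inflated, factor):
    """Per-landmark standard deviations with a few landmarks inflated by a factor."""
    return [base_sigma * factor if k in inflated else base_sigma
            for k in range(n_landmarks)]
```

For instance, `heteroscedastic_sigmas(10.0, 8, {0, 1, 2}, 3.0)` would mimic the high error heteroscedastic setting under the assumption that the factor applies to the standard deviation; which three landmarks were inflated is not stated in the text, so the index set is arbitrary.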
A more thorough study, however, is needed to analyze advantages and limits of these metrics as a tool for evaluating the differences between shapes.This is the aim of our future work.