Article

The Geometry of the Generalized Gamma Manifold and an Application to Medical Imaging

Sana Rebbah, Florence Nicol and Stéphane Puechmorel
1 INSERM ToNIC, Université de Toulouse, CEDEX 3, 31024 Toulouse, France
2 ENAC, Université de Toulouse, CEDEX 4, 31055 Toulouse, France
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(8), 674; https://doi.org/10.3390/math7080674
Submission received: 18 June 2019 / Revised: 25 July 2019 / Accepted: 26 July 2019 / Published: 29 July 2019

Abstract

The Fisher information metric endows a smooth family of probability measures with a Riemannian manifold structure; such manifolds are the central objects of information geometry. The information geometry of the gamma manifold, associated with the family of gamma distributions, has been well studied. However, only a few results are known for the generalized gamma family, which adds an extra shape parameter. The present article gives some new results about the generalized gamma manifold. It also introduces an application in medical imaging: the classification of an Alzheimer’s disease population. In the medical field, a growing number of quantitative image analysis techniques have been developed over the past two decades, including histogram analysis, which is widely used to quantify the diffuse pathological changes of some neurological diseases. This method presents several drawbacks: not all the information contained in the histogram is used, and the histogram is an overly simplistic estimate of a probability distribution. Thus, in this study, we show how using information geometry and the generalized gamma manifold improves the classification performance on an Alzheimer’s disease population.

1. Introduction

In the medical field, the use of medical imaging techniques, such as Magnetic Resonance Imaging (MRI), is particularly important for measuring brain activity in different parts of the central nervous system. This technique is of major interest in the context of neurodegenerative diseases such as Alzheimer’s disease [1,2,3]. By measuring the cortical thickness of the brain, it is possible to estimate brain atrophy, which is considered a crucial marker of neurodegeneration in Alzheimer’s disease [4,5]. Previous works have studied measures of central tendency of biomarkers of brain activity, which are then fed into statistical data analyses, such as classification algorithms for detecting pathological changes. The commonly used summaries are the mean, the median or the mode, but these are not informative enough about the distribution of the data.
In the most recent studies [6,7], histograms have been used as approximations of the probability density functions, but only a few of their characteristics, such as the mean, percentiles, peak location, peak height, skewness and kurtosis, were used in the statistical data analysis. In the context of multiple sclerosis, the entire histogram information was fed into a k-nearest neighbors classifier, with higher classification performance than in previous studies [8]. However, the quality of the histogram estimate depends on the choice of the bin size. The histogram may thus be too rough an estimate of the probability density function, and provide poor estimates of the characteristics cited above. To overcome this drawback and use all the information contained in the data, the underlying probability density functions themselves should be used as a biomarker of the whole brain. The general framework of information geometry, in which probability distributions are considered as points on a manifold, is particularly relevant to reach this goal.
The generalized gamma distribution was introduced in [9] and can be viewed as a special case of the Amoroso distribution [10] in which the location parameter is dropped [11]. Apart from the gamma distribution, it also generalizes the Weibull distribution and is commonly used in survival models. Moreover, the generalized gamma distribution is particularly relevant in the medical context described above.
The purpose of the present work is to investigate some information geometric properties of the generalized gamma family, especially when restricted to the gamma submanifold. First, in Section 2, the Fisher information as a Riemannian metric and known results on the gamma manifold will be briefly introduced. Next, in Section 3, the case of the generalized gamma manifold will be detailed, using an approach based on diffeomorphism groups. In Section 4, the extrinsic curvature of the gamma submanifold will be computed. Finally, an application to medical imaging will be given in the last section, where a clustering technique is extended by using a geodesic distance, an approximation of which is computed in two steps for numerical reasons.

2. Information Geometry and the Gamma Manifold

Information geometry deals with parameterized families of distributions whose parameters are understood as coordinates and which are endowed with a Riemannian structure by the Fisher metric [12]. Let Θ be a smooth manifold and P a family of probability distributions defined on a common event space, parameterized by θ ∈ Θ, and absolutely continuous with respect to a fixed measure μ. It is further assumed that the corresponding density functions are smooth with respect to the parameter θ. In the sequel, p_θ will denote the density function for a given θ. Throughout the paper, the Einstein summation convention on repeated indices will be used.
Definition 1.
The Fisher information metric on Θ is defined at a point θ ∈ Θ by the symmetric order-2 tensor:
$$g = g_{ij} \, d\theta^i \otimes d\theta^j,$$
where:
$$g_{ij} = E_{p_\theta}\!\left[\partial_{\theta^i} l \,\, \partial_{\theta^j} l\right], \qquad l(\theta) = \log p_\theta.$$
When the support of the density functions p_θ does not depend on θ, the information metric can be rewritten as:
$$g_{ij} = -E_{p_\theta}\!\left[\partial_{\theta^i} \partial_{\theta^j} l\right].$$
It gives rise to a Riemannian metric on Θ.
The Fisher information metric is invariant under change of variables by sufficient statistics [13,14]. When the parameterized family p_θ is of natural exponential type, the Fisher information metric can be expressed as:
$$g_{ij}(\theta) = \frac{\partial^2 \phi}{\partial \theta^i \partial \theta^j}(\theta), \qquad (2)$$
where ϕ is the log-partition function.
A manifold with such a Riemannian metric is referred to as a Hessian structure [15]. Many important tools from Riemannian geometry, like the Levi–Civita connection, are greatly simplified within this framework. In the sequel, the partial derivative ∂_{θ^i} will be abbreviated as ∂_i.
Proposition 1.
For a parameterized density family p_θ, θ ∈ Θ, pertaining to the natural exponential class with log-partition function ϕ, the Christoffel symbols of the first kind for the Levi–Civita connection of the associated Hessian structure are given by [16]:
$$\Gamma_{ijk} = \frac{1}{2}\, \partial_i \partial_j \partial_k \phi.$$
The gamma distribution can be written as a natural exponential family with two parameters (α, λ), defined as follows:
Definition 2.
The gamma distribution is the probability law on ℝ₊ ∖ {0} with density relative to the Lebesgue measure given by:
$$p(x; \alpha, \lambda) = \frac{1}{\Gamma(\lambda)\,\alpha^\lambda}\, x^{\lambda-1} e^{-x/\alpha}, \quad x > 0,$$
with parameters α > 0, λ > 0.
The next proposition comes directly from the definition:
Proposition 2.
The gamma distribution defines a natural exponential family with natural parameters λ and η = α⁻¹ and potential function ϕ(η, λ) = log Γ(λ) − λ log η.
Using (2), the Fisher metric is obtained by a straightforward computation:
$$G(\eta, \lambda) = \begin{pmatrix} \dfrac{\lambda}{\eta^2} & -\dfrac{1}{\eta} \\[6pt] -\dfrac{1}{\eta} & \psi'(\lambda) \end{pmatrix},$$
where ψ is the digamma function and ψ' its derivative, the trigamma function.
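Since the gamma family is of natural exponential type, both the metric and the Christoffel symbols above can be obtained mechanically from the log-partition function. The following snippet (ours, not part of the original paper) checks Equation (2) and Proposition 1 symbolically; the variable names eta and lam are just our choices.

```python
# Sketch: gamma Fisher metric as the Hessian of the log-partition function
# phi(eta, lambda) = log Gamma(lambda) - lambda * log(eta), cf. Proposition 2.
import sympy as sp

eta, lam = sp.symbols("eta lambda", positive=True)
phi = sp.loggamma(lam) - lam * sp.log(eta)
coords = (eta, lam)

# Equation (2): g_ij = d^2 phi / d theta^i d theta^j
G = sp.Matrix(2, 2, lambda i, j: sp.diff(phi, coords[i], coords[j]))
print(sp.simplify(G))  # [[lam/eta**2, -1/eta], [-1/eta, polygamma(1, lam)]]

# Proposition 1: Christoffel symbols of the first kind, Gamma_ijk = d_i d_j d_k phi / 2
Gamma_111 = sp.Rational(1, 2) * sp.diff(phi, eta, eta, eta)
print(sp.simplify(Gamma_111))  # -lam/eta**3
```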
It is sometimes convenient to perform a change of parameterization in order to have a diagonal form for the metric. The next proposition is of common use and allows the computation of a pullback metric in local coordinates:
Proposition 3.
Let M be a smooth manifold and (N, g) be a smooth Riemannian manifold. For a smooth diffeomorphism f : M → N, the pullback metric f*g has matrix expressed in local coordinates at the point m ∈ M by:
$$J_f^t(m)\, G(f(m))\, J_f(m),$$
with J_f(m) the Jacobian matrix of f at m and G(n) the matrix of the metric g at n ∈ N.
Performing the change of parameterization f : (μ, β) ↦ (η = β/μ, λ = β) yields:
$$J_f(\mu, \beta) = \begin{pmatrix} -\dfrac{\beta}{\mu^2} & \dfrac{1}{\mu} \\[6pt] 0 & 1 \end{pmatrix}.$$
Using Proposition 3 then gives for the pullback metric matrix:
$$G(\mu, \beta) = \begin{pmatrix} \dfrac{\beta}{\mu^2} & 0 \\[6pt] 0 & \psi'(\beta) - \dfrac{1}{\beta} \end{pmatrix}.$$
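As a sanity check of Proposition 3, the pullback computation above can be reproduced symbolically. The short script below is our own illustration and recovers the diagonal metric G(μ, β).

```python
# Sketch: pullback of the gamma Fisher metric under f: (mu, beta) -> (beta/mu, beta).
import sympy as sp

mu, beta = sp.symbols("mu beta", positive=True)
eta, lam = beta / mu, beta  # f(mu, beta)

G = sp.Matrix([[lam / eta**2, -1 / eta],
               [-1 / eta, sp.polygamma(1, lam)]])
J = sp.Matrix([[sp.diff(eta, mu), sp.diff(eta, beta)],
               [sp.diff(lam, mu), sp.diff(lam, beta)]])

print(sp.simplify(J.T * G * J))
# Matrix([[beta/mu**2, 0], [0, polygamma(1, beta) - 1/beta]])
```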
The information geometry of the gamma distribution is studied in detail in [17], with explicit calculations of the Christoffel symbols and the geodesic equation.

3. The Geometry of the Generalized Gamma Manifold

While the gamma distribution is well suited to studying departures from full randomness, as pointed out in [17], it is not general enough in many applications. In particular, the Weibull distribution, which also generalizes the exponential distribution, is not a gamma distribution. A more general family was thus introduced, by adding a power term.
Definition 3.
The generalized gamma distribution is the probability measure on ℝ₊ ∖ {0} with density with respect to the Lebesgue measure given by:
$$p(x; \alpha, \lambda, \beta) = \frac{\beta\, x^{\beta\lambda - 1}}{\alpha^{\beta\lambda}\, \Gamma(\lambda)}\, e^{-(x/\alpha)^\beta}, \quad x > 0,$$
where α > 0, λ > 0, β > 0.
Due to the exponent β, the generalized gamma distribution does not define a natural exponential family. However, keeping β fixed, the mapping Φ_β : x ↦ x^β is a diffeomorphism of ℝ₊ onto itself, and the image density of p(·; α, λ, β) under Φ_β is a gamma density with parameters (α^β, λ). For any κ > 0, the submanifold β = κ of the generalized gamma manifold is thus diffeomorphic to the gamma manifold. Using the invariance of the Fisher metric under diffeomorphisms, the induced metric on this submanifold can be obtained.
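The statement that Φ_β maps a generalized gamma to a gamma distribution is easy to check by simulation. The following sketch (ours) relies on scipy's gengamma parameterization, in which a plays the role of λ, c that of β and scale that of α; these correspondences are our reading of the scipy documentation, not notation from the paper.

```python
# Sketch: pushing generalized gamma samples through Phi_beta(x) = x**beta
# yields gamma samples with scale alpha**beta.
import numpy as np
from scipy import stats

alpha, lam, beta = 2.0, 3.0, 1.7
rng = np.random.default_rng(0)
x = stats.gengamma.rvs(a=lam, c=beta, scale=alpha, size=20000, random_state=rng)

y = x**beta  # image samples under Phi_beta
print(stats.kstest(y, "gamma", args=(lam, 0, alpha**beta)))  # large p-value expected
```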
Proposition 4.
Let κ > 0 be a fixed real number. The induced Fisher metric matrix G_κ on the submanifold (α, λ, κ) of the generalized gamma manifold is given in local coordinates by:
$$G_\kappa(\alpha, \lambda) = \begin{pmatrix} \dfrac{\lambda \kappa^2}{\alpha^2} & \dfrac{\kappa}{\alpha} \\[6pt] \dfrac{\kappa}{\alpha} & \psi'(\lambda) \end{pmatrix}.$$
Proof. 
In local coordinates (α^κ, λ), the Fisher metric matrix of the gamma manifold is
$$G_\kappa(\alpha^\kappa, \lambda) = \begin{pmatrix} \dfrac{\lambda}{\alpha^{2\kappa}} & \dfrac{1}{\alpha^\kappa} \\[6pt] \dfrac{1}{\alpha^\kappa} & \psi'(\lambda) \end{pmatrix}.$$
The Jacobian matrix of the transformation (α, λ) ↦ (α^κ, λ) is J = diag(κα^{κ−1}, 1) and the change of parameterization yields:
$$G_\kappa(\alpha, \lambda) = J^t\, G_\kappa(\alpha^\kappa, \lambda)\, J.$$
The Fisher metric matrix on the submanifold (α, λ, κ) is directly obtained from the invariance by using the diffeomorphism Φ_κ : x ↦ x^κ.  □
Proposition 5.
In local coordinates, the Fisher information metric matrix of the generalized gamma manifold is given by:
$$G(\alpha, \lambda, \beta) = \begin{pmatrix} \dfrac{\beta^2 \lambda}{\alpha^2} & \dfrac{\beta}{\alpha} & -\dfrac{\lambda\psi(\lambda) + 1}{\alpha} \\[6pt] \dfrac{\beta}{\alpha} & \psi'(\lambda) & -\dfrac{\psi(\lambda)}{\beta} \\[6pt] -\dfrac{\lambda\psi(\lambda) + 1}{\alpha} & -\dfrac{\psi(\lambda)}{\beta} & \dfrac{\lambda\psi(\lambda)^2 + 2\psi(\lambda) + \lambda\psi'(\lambda) + 1}{\beta^2} \end{pmatrix}.$$
Proof. 
The 2 × 2 submatrix corresponding to the local coordinates α , λ has already been obtained in Proposition 4. The remaining terms can be computed by differentiating the log-likelihood function twice, but an alternative will be given below in a more general setting.  □
The usual definition of the generalized gamma distribution, Definition 3, stems from the gamma one by a simple change of variable, which makes some computations less natural. Starting from the above diffeomorphism Φ_β and applying it to a gamma distribution yields an equivalent, but more intuitive, form. Furthermore, it is advisable to express the gamma density as a natural exponential family distribution:
$$p(x; \eta, \lambda) = \frac{\eta^\lambda x^{\lambda-1} e^{-\eta x}}{\Gamma(\lambda)}, \quad x > 0,$$
where λ > 0, η > 0 are the natural parameters of the distribution.
Definition 4.
The generalized gamma distribution on ℝ₊ ∖ {0} is the probability measure with density:
$$p(x; \eta, \lambda, \beta) = \frac{\beta\, \eta^\lambda x^{\beta\lambda - 1} e^{-\eta x^\beta}}{\Gamma(\lambda)}, \quad x > 0, \qquad (6)$$
with η > 0, λ > 0 and β > 0.
Due to the invariance by diffeomorphism property of the Fisher information metric, the induced metric on the submanifolds β = constant is independent of β, and is exactly the one of the gamma manifold, here given by the matrix:
$$G(\eta, \lambda) = \begin{pmatrix} \dfrac{\lambda}{\eta^2} & -\dfrac{1}{\eta} \\[6pt] -\dfrac{1}{\eta} & \psi'(\lambda) \end{pmatrix}.$$
An important fact about the family of diffeomorphisms Φ_β is the group property Φ_{β₁} ∘ Φ_{β₂} = Φ_{β₁β₂}. It turns out that all the computations can be conducted in a general Lie group setting, as detailed below. Let p_θ, θ ∈ Θ, be a parameterized family of probability densities defined on an open subset U of ℝⁿ and let W be a Lie group acting on U by orientation-preserving diffeomorphisms. For any w in W and θ in Θ, the image density p̃_{w,θ} under the diffeomorphism x ∈ U ↦ ξ(w, x) = w.x is given by:
$$\tilde{p}_{w,\theta}(x) = p_\theta(\xi(w, x)) \left|\det \partial_2 \xi(w, x)\right|, \quad x \in U.$$
Note that since the diffeomorphisms preserve orientation, the absolute value may be dropped in the above expression, simplifying the calculus. Denoting by l̃(x, θ, w) the log-likelihood of p̃_{w,θ}(x) and by l(x, θ) that of p_θ(x), a straightforward computation gives:
$$\tilde{l}(x, \theta, w) = l(\xi(w, x), \theta) + \log \det \partial_2 \xi(w, x), \quad x \in U.$$
In this section, the symbol ∂_i stands for the partial derivative with respect to the i-th variable. Higher-order derivatives are written similarly, as ∂_{ii}, ∂_{ij}, and so on, a variable being repeated k times to indicate a partial derivative of order k with respect to it.
Proposition 6.
For any x ∈ U, w ∈ W:
$$\partial_1 \xi(w, x) = \partial_1 \xi(e, \xi(w, x))\, T_w R_{w^{-1}},$$
where e is the identity of W and R_w is the right translation mapping h ∈ W ↦ R_w h = h.w.
Proof. 
Since ξ comes from a group action:
$$\xi(h, \xi(w, x)) = \xi(h.w, x).$$
Then, taking the derivative with respect to h at the identity:
$$\partial_1 \xi(e, \xi(w, x)) = \partial_1 \xi(w, x)\, T_e R_w.$$
Since T_e R_w T_w R_{w^{-1}} = Id by the chain rule, the claimed result is proved.  □
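For the diffeomorphisms Φ_β themselves, W is the multiplicative group (ℝ₊ ∖ {0}, ×) acting by ξ(w, x) = x^w, and the tangent map T_w R_{w⁻¹} reduces to multiplication by 1/w. The following check is our own worked example, not from the paper, and verifies Proposition 6 in this case.

```python
# Sketch: Proposition 6 for the action xi(w, x) = x**w of the multiplicative group.
import sympy as sp

w, h, x = sp.symbols("w h x", positive=True)
xi = lambda g, p: p**g  # group action: xi(h, xi(w, x)) = xi(h*w, x)

lhs = sp.diff(xi(w, x), w)                        # d1 xi(w, x)
d1_at_e = sp.diff(xi(h, xi(w, x)), h).subs(h, 1)  # d1 xi(e, xi(w, x))
rhs = d1_at_e / w                                 # composed with T_w R_{w^{-1}} = 1/w
print(sp.simplify(lhs - rhs))                     # 0
```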
This property allows for computing the Fisher information metric in a convenient way.
Proposition 7.
The element g_{w,θ} of the Fisher metric matrix of p̃_{w,θ} is given by:
$$g_{w,\theta} = -\left(\int_U \partial_{12} l(x, \theta)\, \partial_1 \xi(e, x)\, p_\theta(x)\, dx\right) T_w R_{w^{-1}}.$$
Proof. 
Since:
$$\tilde{l}(x, \theta, w) = l(\xi(w, x), \theta) + \log \det \partial_2 \xi(w, x),$$
it comes:
$$\partial_2 \tilde{l}(x, \theta, w) = \partial_2 l(\xi(w, x), \theta),$$
and thus:
$$\partial_{23} \tilde{l}(x, \theta, w) = \partial_{12} l(\xi(w, x), \theta)\, \partial_1 \xi(w, x).$$
Now, using Proposition 6:
$$\partial_{23} \tilde{l}(x, \theta, w) = \partial_{12} l(\xi(w, x), \theta)\, \partial_1 \xi(e, \xi(w, x))\, T_w R_{w^{-1}}.$$
Taking the expectation with respect to p̃_{w,θ} yields:
$$E\!\left[\partial_{23} \tilde{l}\right] = \left(\int_U \partial_{12} l(\xi(w, x), \theta)\, \partial_1 \xi(e, \xi(w, x))\, \tilde{p}_{w,\theta}(x)\, dx\right) T_w R_{w^{-1}},$$
and the result follows, since g_{w,θ} = −E[∂_{23} l̃], by the change of variable y = ξ(w, x).  □
The case of the elements g_{w,w} is a little more complex, due to the non-vanishing extra term in the log-likelihood l̃(x, θ, w). Taking the first derivative with respect to w yields:
$$\partial_3 \tilde{l}(x, \theta, w) = \partial_1 l(\xi(w, x), \theta)\, \partial_1 \xi(w, x) + \mathrm{tr}\!\left(\partial_{12} \xi(w, x)\, \partial_2 \xi(w, x)^{-1}\right), \quad x \in U,$$
where tr denotes the trace of the linear map with respect to the x components. The second term on the right-hand side can be further simplified using the next proposition, which is a direct consequence of Proposition 6.
Proposition 8.
For any θ ∈ Θ, w ∈ W, x ∈ U:
$$\partial_{12} \xi(e, \xi(w, x))\, \partial_2 \xi(w, x) = \partial_{12} \xi(w, x)\, T_e R_w.$$
Applying it to the log-likelihood derivative and using again Proposition 6 yields:
$$\partial_3 \tilde{l}(x, \theta, w) = \left[\partial_1 l(\xi(w, x), \theta)\, \partial_1 \xi(e, \xi(w, x)) + \mathrm{tr}\!\left(\partial_{12} \xi(e, \xi(w, x))\right)\right] T_w R_{w^{-1}}, \quad x \in U.$$
Proposition 9.
The element g_{w,w} of the Fisher metric matrix of p̃_{w,θ} is given in matrix form by:
$$g_{w,w} = \left(T_w R_{w^{-1}}\right)^T \left(\int_U h_\theta(x)^T h_\theta(x)\, p_\theta(x)\, dx\right) T_w R_{w^{-1}},$$
with:
$$h_\theta(x) = \partial_1 l(x, \theta)\, \partial_1 \xi(e, x) + \mathrm{tr}\!\left(\partial_{12} \xi(e, x)\right).$$
Proof. 
Starting with the definition:
$$g_{w,w} = E\!\left[(\partial_3 \tilde{l})^T (\partial_3 \tilde{l})\right],$$
the result follows after the change of variable y = ξ(w, x) in the expectation. □
An important corollary of Propositions 7 and 9 is that the Fisher metric can be expressed as a right invariant metric on the Lie group W .
Propositions 7 and 9 allow for computing the coefficients g_{ηβ}, g_{λβ}, g_{ββ} of the Fisher metric matrix, thus yielding the next proposition.
Proposition 10.
The Fisher information matrix in natural coordinates has coefficients:
$$\begin{aligned}
g_{\eta\eta} &= \frac{\lambda}{\eta^2}, \qquad g_{\eta\lambda} = -\frac{1}{\eta}, \qquad g_{\lambda\lambda} = \psi'(\lambda),\\
g_{\eta\beta} &= \frac{\lambda}{\eta\beta}\left(\psi(\lambda+1) - \log\eta\right), \qquad g_{\lambda\beta} = \frac{1}{\beta}\left(\log\eta - \psi(\lambda)\right),\\
g_{\beta\beta} &= \frac{1}{\beta^2}\left(1 + \lambda\log^2\eta - 2\lambda\psi(\lambda+1)\log\eta + \lambda\psi^2(\lambda+1) + \lambda\psi'(\lambda+1)\right).
\end{aligned}$$
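These closed forms can be validated numerically: the Fisher matrix is the covariance of the score, which is straightforward to estimate by Monte Carlo. The sketch below is ours; the sampling trick uses the fact that ηX^β is standard gamma distributed.

```python
# Sketch: Monte Carlo check of Proposition 10 for the density of Definition 4.
import numpy as np
from scipy.special import polygamma, psi

def score(x, eta, lam, beta):
    # gradient of log p = log(beta) + lam*log(eta) + (beta*lam - 1)*log(x)
    #                     - eta*x**beta - log(Gamma(lam))
    return np.stack([lam / eta - x**beta,
                     np.log(eta) + beta * np.log(x) - psi(lam),
                     1.0 / beta + (lam - eta * x**beta) * np.log(x)])

eta, lam, beta = 1.5, 2.0, 0.8
rng = np.random.default_rng(1)
x = (rng.gamma(lam, 1.0, size=400_000) / eta) ** (1.0 / beta)  # eta*X**beta ~ Gamma(lam, 1)

print(np.cov(score(x, eta, lam, beta)))  # empirical Fisher matrix, 3 x 3

lg = np.log(eta)
g_bb = (1 + lam * lg**2 - 2 * lam * psi(lam + 1) * lg
        + lam * psi(lam + 1)**2 + lam * polygamma(1, lam + 1)) / beta**2
print(g_bb)  # closed-form g_{beta beta}
```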
Recalling that the Christoffel symbols of the first kind for the Levi–Civita connection are obtained using the formula:
$$\Gamma_{kij} = \frac{1}{2}\left(\partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij}\right),$$
one can obtain them as:
$$\begin{aligned}
\Gamma_{111} &= -\frac{\lambda}{\eta^3}, \qquad \Gamma_{211} = \frac{1}{2\eta^2}, \qquad \Gamma_{311} = \frac{\lambda\left(\log\eta - \psi(\lambda+1) - 1\right)}{\eta^2\beta},\\
\Gamma_{121} &= \Gamma_{112} = \frac{1}{2\eta^2}, \qquad \Gamma_{221} = \Gamma_{212} = 0, \qquad \Gamma_{321} = \Gamma_{312} = \frac{1 - \log\eta + \psi(\lambda+1) + \lambda\psi'(\lambda+1)}{2\eta\beta},\\
\Gamma_{122} &= 0, \qquad \Gamma_{222} = \frac{1}{2}\psi''(\lambda), \qquad \Gamma_{322} = -\frac{\psi'(\lambda)}{\beta},\\
\Gamma_{131} &= \Gamma_{113} = 0, \qquad \Gamma_{231} = \Gamma_{213} = \frac{1 + \log\eta - \psi(\lambda+1) - \lambda\psi'(\lambda+1)}{2\eta\beta}, \qquad \Gamma_{331} = \Gamma_{313} = \frac{\lambda\left(\log\eta - \psi(\lambda+1)\right)}{\eta\beta^2},\\
\Gamma_{132} &= \Gamma_{123} = \frac{\psi(\lambda+1) + \lambda\psi'(\lambda+1) - \log\eta - 1}{2\eta\beta}, \qquad \Gamma_{232} = \Gamma_{223} = 0,\\
\Gamma_{332} &= \Gamma_{323} = \frac{\psi'(\lambda+1)(1 - 2\lambda\log\eta) - 2\psi(\lambda+1)\left(\log\eta - \lambda\psi'(\lambda+1)\right) + \log^2\eta + \psi(\lambda+1)^2 + \lambda\psi''(\lambda+1)}{2\beta^2},\\
\Gamma_{133} &= 0,\\
\Gamma_{233} &= \frac{2\psi(\lambda+1)\left(\log\eta - \lambda\psi'(\lambda+1)\right) - \psi'(\lambda+1)(1 - 2\lambda\log\eta) - \log\eta(\log\eta + 2) - \psi(\lambda+1)^2 + 2\psi(\lambda) - \lambda\psi''(\lambda+1)}{2\beta^2},\\
\Gamma_{333} &= -\frac{\lambda\left(\log^2\eta - 2\log\eta\,\psi(\lambda+1) + \psi(\lambda+1)^2 + \psi'(\lambda+1)\right) + 1}{\beta^3}.
\end{aligned}$$
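These expressions are tedious to derive by hand but easy to confirm with a computer algebra system; the fragment below (ours) spot-checks two of them from the metric of Proposition 10.

```python
# Sketch: Christoffel symbols of the first kind from the generalized gamma metric.
import sympy as sp

eta, lam, beta = sp.symbols("eta lambda beta", positive=True)
c = (eta, lam, beta)
L, psi1, tri1 = sp.log(eta), sp.polygamma(0, lam + 1), sp.polygamma(1, lam + 1)

g = sp.Matrix([
    [lam / eta**2, -1 / eta, lam * (psi1 - L) / (eta * beta)],
    [-1 / eta, sp.polygamma(1, lam), (L - sp.polygamma(0, lam)) / beta],
    [lam * (psi1 - L) / (eta * beta), (L - sp.polygamma(0, lam)) / beta,
     (1 + lam * L**2 - 2 * lam * psi1 * L + lam * psi1**2 + lam * tri1) / beta**2],
])

def christoffel(k, i, j):  # Gamma_kij = (d_i g_jk + d_j g_ik - d_k g_ij) / 2
    return sp.simplify((sp.diff(g[j, k], c[i]) + sp.diff(g[i, k], c[j])
                        - sp.diff(g[i, j], c[k])) / 2)

print(christoffel(2, 1, 1))  # Gamma_322 = -polygamma(1, lambda)/beta
print(christoffel(0, 2, 2))  # Gamma_133 = 0
```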

4. The Gamma Submanifold

The submanifolds β = constant of the generalized gamma manifold are all isometric to the gamma manifold. This section is dedicated to the study of their properties using the Gauss–Codazzi equations. In the sequel, the generalized gamma manifold will be denoted by M while N κ , κ > 0 will stand for the embedded submanifold β = κ .
Proposition 11.
The normal bundle to N_κ is generated at the point (η, λ) of the gamma submanifold by the vector:
$$n(\eta, \lambda) = \left(\eta\left[\lambda\psi'(\lambda)\left(\psi(\lambda+1) - \log\eta\right) + \log\eta - \psi(\lambda)\right],\; 1,\; \kappa\left(1 - \lambda\psi'(\lambda)\right)\right).$$
Proof. 
The matrix of the Fisher metric at (η, λ, β) can be written in block form as:
$$G(\eta, \lambda, \beta) = \begin{pmatrix} g(\eta, \lambda) & v \\ v^t & g_{\beta\beta} \end{pmatrix},$$
with:
$$g(\eta, \lambda) = \begin{pmatrix} \dfrac{\lambda}{\eta^2} & -\dfrac{1}{\eta} \\[6pt] -\dfrac{1}{\eta} & \psi'(\lambda) \end{pmatrix}$$
and
$$v = \begin{pmatrix} \dfrac{\lambda}{\eta\beta}\left(\psi(\lambda+1) - \log\eta\right) \\[6pt] \dfrac{1}{\beta}\left(\log\eta - \psi(\lambda)\right) \end{pmatrix}.$$
Any multiple of the vector:
$$\left(-g(\eta, \lambda)^{-1} v,\; 1\right)$$
is normal to the tangent space to the submanifold N_κ. The result follows by simple computation. □
Let ∇ be the Levi–Civita connection of the gamma manifold and ∇̄ that of the generalized gamma one. It is well known [18] (pp. 60–63) that these two connections are related by the Gauss formula:
$$\bar\nabla_X Y = \nabla_X Y + B(X, Y), \qquad X, Y \in TN_\kappa,$$
where B is a symmetric bilinear form with values in the normal bundle. Letting n = n^i e_i with e_1 = ∂_η, e_2 = ∂_λ, e_3 = ∂_β, it comes, for i, j = 1, 2:
$$\bar{g}\left(\bar\nabla_{e_i} e_j, n\right) = n^k \bar\Gamma_{kij} = g\left(\nabla_{e_i} e_j, n\right) + g\left(B(e_i, e_j), n\right), \qquad (12)$$
where the first term on the right-hand side vanishes since ∇_{e_i} e_j is tangent to N_κ. Since B takes its values in the normal bundle, there exist smooth real-valued mappings a_{ij}, i, j = 1, 2, such that B(e_i, e_j) = a_{ij} n. Equation (12) yields:
$$a_{ij} = \frac{n^k \bar\Gamma_{kij}}{g(n, n)}.$$
From [18] (p. 63), the sectional curvature K̄(e_1, e_2) of M can be obtained from the one K(e_1, e_2) of N_κ as:
$$\bar{K}(e_1, e_2) = K(e_1, e_2) + \frac{g\left(B(e_1, e_2), B(e_1, e_2)\right) - g\left(B(e_1, e_1), B(e_2, e_2)\right)}{g(e_1, e_1)\, g(e_2, e_2) - g(e_1, e_2)^2},$$
or:
$$\bar{K}(e_1, e_2) = K(e_1, e_2) + g(n, n)\, \frac{a_{12}^2 - a_{11} a_{22}}{g_{11} g_{22} - g_{12}^2}.$$
Using the expressions of the Christoffel symbols and the metric, the coefficients a_{11}, a_{12}, a_{22} can be computed as:
$$a_{11} = -\frac{2\lambda\left(1 - \lambda\psi'(\lambda)\right) + 1}{2\eta^2 D},$$
$$a_{12} = -\frac{\lambda^2\psi'(\lambda)^2 - \psi'(\lambda) - 1}{2\eta D},$$
$$a_{22} = \frac{\psi''(\lambda)/2 - \psi'(\lambda)\left(1 - \lambda\psi'(\lambda)\right)}{D},$$
with:
$$D = g(n, n) = \left(\lambda\psi'(\lambda) - 1\right)\left(\psi'(\lambda)\left(\lambda^2\psi'(\lambda) - 1\right) - 1\right).$$
Finally:
$$g(n, n)\, \frac{a_{12}^2 - a_{11} a_{22}}{g_{11} g_{22} - g_{12}^2} = F(\lambda)/G(\lambda), \qquad (19)$$
with:
$$F(\lambda) = \lambda^4\psi'(\lambda)^4 - 2\lambda^2(2\lambda+1)\psi'(\lambda)^3 + \left(6\lambda^2 + 2\lambda + 1\right)\psi'(\lambda)^2 - 2\lambda\left(\lambda\psi''(\lambda) + 2\right)\psi'(\lambda) + (2\lambda+1)\psi''(\lambda) + 1,$$
and:
$$G(\lambda) = 4\left(\lambda\psi'(\lambda) - 1\right)^2\left(\psi'(\lambda)\left(\lambda^2\psi'(\lambda) - 1\right) - 1\right).$$
Proposition 12.
The term a_{12}² − a_{11} a_{22} is strictly positive.
Proof. 
Using the expressions of the coefficients:
$$a_{12}^2 - a_{11} a_{22} = \frac{1}{4\eta^2 D^2}\left(A(\lambda) + B(\lambda)\, C(\lambda)\right),$$
with:
$$A(\lambda) = \left(\lambda^2\psi'(\lambda)^2 - \psi'(\lambda) - 1\right)^2, \quad B(\lambda) = 2\lambda\left(1 - \lambda\psi'(\lambda)\right) + 1, \quad C(\lambda) = 2\psi'(\lambda)\left(\lambda\psi'(\lambda) - 1\right) + \psi''(\lambda).$$
The trigamma function ψ' satisfies the following inequality [19]:
$$\frac{1}{\lambda} + \frac{1}{2\lambda^2} < \psi'(\lambda) < \frac{1}{\lambda} + \frac{1}{\lambda^2},$$
from which it comes:
$$-\frac{1}{2\lambda} > 1 - \lambda\psi'(\lambda) > -\frac{1}{\lambda},$$
and in turn:
$$0 > B(\lambda) > -1.$$
To obtain the sign of C(λ), a different bound is needed for the polygamma functions. Again from [19]:
$$\frac{(k-1)!}{(x+1)^k} + \frac{k!}{x^{k+1}} < (-1)^{k+1}\psi^{(k)}(x) < \frac{(k-1)!}{(x+1/2)^k} + \frac{k!}{x^{k+1}}, \quad k \geq 1. \qquad (20)$$
Using the inequality (20), it comes:
$$\lambda\psi'(\lambda) - 1 < \frac{\lambda+1}{\lambda(2\lambda+1)},$$
so that:
$$\psi'(\lambda)\left(\lambda\psi'(\lambda) - 1\right) < \left(\frac{1}{\lambda + 1/2} + \frac{1}{\lambda^2}\right)\frac{\lambda+1}{\lambda(2\lambda+1)}.$$
Using again (20) with k = 2 yields finally:
$$C(\lambda) < -\frac{2}{\lambda^2(1+2\lambda)^2}.$$
Since both B ( λ ) and C ( λ ) are strictly negative, A ( λ ) + B ( λ ) C ( λ ) is strictly positive as claimed. □
Proposition 13.
The sectional curvature of the generalized gamma manifold in the (e_1, e_2) directions satisfies:
$$\bar{K}(e_1, e_2) \xrightarrow[\lambda \to 0^+]{} \frac{12 - \pi^2}{2(\pi^2 - 6)}.$$
Proof. 
The sectional curvature of the gamma manifold satisfies [17]:
$$K(e_1, e_2) \xrightarrow[\lambda \to 0^+]{} -\frac{1}{2}.$$
It is thus only needed to estimate the limit of (19) when λ → 0⁺. The asymptotics of the polygamma functions at 0 are given by:
$$\psi'(\lambda) = \frac{1}{\lambda^2} + \psi'(1) + o(1), \qquad \psi''(\lambda) = -\frac{2}{\lambda^3} + \psi''(1) + o(1).$$
The term:
$$F(\lambda) = \lambda^4\psi'(\lambda)^4 - 2\lambda^2(2\lambda+1)\psi'(\lambda)^3 + \left(6\lambda^2 + 2\lambda + 1\right)\psi'(\lambda)^2 - 2\lambda\left(\lambda\psi''(\lambda) + 2\right)\psi'(\lambda) + (2\lambda+1)\psi''(\lambda) + 1$$
can thus be approximated, writing x for λ, by:
$$\frac{\pi^8 x^6 - 24\pi^6 x^5 + \left(12\pi^6 + 216\pi^4 - 432\pi^2\psi''(1)\right)x^4 - \left(360\pi^4 + 864\pi^2 - 2592\psi''(1)\right)x^3 + \left(36\pi^4 + 2592\pi^2 + 1296 - 1296\psi''(1)\right)x^2 - \left(864\pi^2 + 5184\right)x + 2592}{1296\, x^2},$$
and the term:
$$G(\lambda) = 4\left(\lambda\psi'(\lambda) - 1\right)^2\left(\psi'(\lambda)\left(\lambda^2\psi'(\lambda) - 1\right) - 1\right)$$
is approximated by:
$$\frac{\left(\pi^2 x^2 - 6x + 6\right)^2\left(\pi^4 x^2 + 6\pi^2 - 36\right)}{324\, x^2}.$$
Finally, the quotient F(λ)/G(λ) tends, as λ → 0⁺, to
$$\frac{3}{\pi^2 - 6},$$
and the result follows by summation with −1/2. □
It is conjectured that the sectional curvature of the generalized gamma manifold in the directions (e_1, e_2) is strictly positive and bounded from above by 1/2, as numerical evidence suggests.
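The conjecture is easy to probe numerically, since F and G only involve polygamma functions. The sketch below (ours) evaluates F(λ)/G(λ) on a grid and recovers the small-λ limit 3/(π² − 6) ≈ 0.775, hence the curvature limit −1/2 + 3/(π² − 6) ≈ 0.275 of Proposition 13.

```python
# Sketch: numerical evaluation of F(lambda)/G(lambda) from Section 4.
import numpy as np
from scipy.special import polygamma

def F(l):
    t, s = polygamma(1, l), polygamma(2, l)  # trigamma, tetragamma
    return (l**4 * t**4 - 2 * l**2 * (2 * l + 1) * t**3
            + (6 * l**2 + 2 * l + 1) * t**2
            - 2 * l * (l * s + 2) * t + (2 * l + 1) * s + 1)

def G(l):
    t = polygamma(1, l)
    return 4 * (l * t - 1)**2 * (t * (l**2 * t - 1) - 1)

lam = np.array([1e-4, 1e-2, 0.5, 1.0, 5.0, 50.0])
print(F(lam) / G(lam))     # strictly positive on the grid
print(3 / (np.pi**2 - 6))  # limit as lambda -> 0+
```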

5. Medical Imaging Application

Magnetic Resonance Imaging (MRI) seeks to identify, localize and measure different parts of the anatomy of the central nervous system. It is of common use for the diagnosis of neurodegenerative diseases such as Alzheimer’s disease [1,2,3]. The brain atrophy can be estimated from the measure of the cortical thickness [4,5].
Many of these studies relied on aggregated measures such as the mean or the median, while the most recent ones used histogram analysis [20,21]. In the present work, a generalized gamma density function is used in place of the histogram to model the distribution of the cortical thickness.

5.1. Study Set-Up and Design

Data used in this paper were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database http://adni.loni.usc.edu/about/, which aims at providing researchers with a curated database of several biomarkers. Access is granted upon an online approval process. The details of the experimental setup can be found on the ADNI website under the “MRI Acquisition” tab.
Our study is based on a selected subset of the ADNI population comprising 143 subjects: 71 healthy control (HC) subjects and 72 Alzheimer’s disease (AD) patients, whose characteristics are summarized in Table 1.
Raw images were preprocessed by gradwarping, intensity correction and scaling. Only high quality images were kept in the final dataset.

5.2. Cortical Thickness Measurement and Distribution

Cortical thickness was chosen as the MRI biomarker because of its ability to quantify morphological alterations of the cortical mantle in the early stages of AD. Cortical Thickness (CTh) was measured using the Matlab toolbox CorThiZon [22] and delivered as a vector of thickness values sampled evenly along the medial axis of the cortex.
On each vector of cortical thicknesses, a generalized gamma density of the form (6) is fitted using the method of moments described in [23], yielding estimates of the three generalized gamma parameters (α, λ, β), which are then converted to natural parameters. At the end of the data processing phase, a dataset of 143 estimated generalized gamma densities is obtained.
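For readers wishing to reproduce this step, the fit can be sketched as follows. The paper uses the method of moments of [23]; as a simple stand-in, the snippet below (ours) uses maximum likelihood through scipy.stats.gengamma (a = λ, c = β, scale = α) and converts to natural parameters via η = α^(−β).

```python
# Sketch: fitting a generalized gamma to a cortical thickness vector.
import numpy as np
from scipy import stats

def fit_natural(cth_values):
    """Return natural parameters (eta, lambda, beta) fitted on a CTh vector."""
    lam, beta, _, alpha = stats.gengamma.fit(cth_values, floc=0)
    return alpha**(-beta), lam, beta

# toy data standing in for one subject's CTh vector
rng = np.random.default_rng(2)
cth = stats.gengamma.rvs(a=2.5, c=1.2, scale=2.0, size=5000, random_state=rng)
print(fit_natural(cth))
```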

5.3. Clustering

Clustering, also called unsupervised classification, has been extensively studied for years in many fields, such as data mining, pattern recognition, image segmentation and bioinformatics. This technique aims at grouping samples into subsets of the original dataset, such that all elements in a given subset are as similar as possible, while differing as much as possible from the elements of the other subsets. Depending on the exact quality criterion, several formulations can be made. Three principal categories of clustering exist in the literature: partitioning clustering, hierarchical clustering and density-based clustering. In the first category, the popular k-means problem applies to vector samples in ℝᵖ and finds a partition of the original dataset D into k subsets D_1, …, D_k that minimizes the total intra-class variance:
$$\sum_{i=1}^{k} \#D_i\, \mathrm{Var}\, D_i,$$
where #D_i denotes the number of elements in D_i. It can be formulated as a vector quantization problem, that is, finding an optimal sequence c_1, …, c_k of vectors from ℝᵖ that minimizes:
$$\sum_{i=1}^{k} \sum_{x \in D_i} \|x - c_i\|^2,$$
where D_i is the subset of points from the dataset located in the Voronoï cell of center c_i. This problem was proved to be NP-hard [24], even with only two classes. Existing algorithms will thus only seek locally optimal solutions. A popular choice is Lloyd’s algorithm [25], which is a gradient-based local minimization procedure. It can be extended to the Riemannian case [26], but requires at each iteration the computation of the geodesics between a sample and the cell centers. In our study, the experiments were conducted using a slightly different partitioning algorithm, the k-medoids [27]. Compared to Lloyd’s algorithm, it is generally considered to be slower, but requires only a dissimilarity measure between pairs of points of the dataset instead of a true distance. It is more suited to small samples, since iterations are made without having to recompute distances. In the context of clustering on Riemannian manifolds, this is a distinct advantage, as geodesic computations are expensive. It is also robust to outliers [28]. The different steps of our k-medoids algorithm are summarized in Algorithm 1, and a minimal implementation sketch is given after it.
Algorithm 1k-medoids algorithm.
  Initialization: Select randomly k samples as the initial medoids.
  repeat
      Calculate the dissimilarity between each medoid m and the remaining data objects.
      Assign each non-medoid object o i to its closest medoid m.
      Compute the total cost variation δ S of swapping the medoid m with o i .
      if δ S < 0 then
          swap m with o i to form the new set of medoids.
      end if
  until No improvement on total cost.
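A minimal implementation of Algorithm 1 (our sketch, assuming a precomputed n × n dissimilarity matrix D built from one of the measures listed below) may read:

```python
# Sketch: k-medoids on a precomputed dissimilarity matrix D.
import numpy as np

def k_medoids(D, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)  # initial medoids

    def total_cost(meds):
        return D[np.arange(n), meds[np.argmin(D[:, meds], axis=1)]].sum()

    cost = total_cost(medoids)
    for _ in range(n_iter):
        improved = False
        for i in range(k):
            for o in range(n):  # candidate swap: medoid i <-> object o
                if o in medoids:
                    continue
                cand = medoids.copy(); cand[i] = o
                new_cost = total_cost(cand)
                if new_cost < cost:  # delta_S < 0: accept the swap
                    medoids, cost, improved = cand, new_cost, True
        if not improved:  # no improvement on total cost
            break
    return medoids, np.argmin(D[:, medoids], axis=1)
```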
The dissimilarity matrix used in the k-medoids algorithm was computed using the following similarity measures:
  • Geodesic distance on the generalized gamma manifold (DGG1),
  • Approximate geodesic distance on the generalized gamma manifold (DGG2),
  • Geodesic distance on the gamma manifold (DG),
  • Absolute value distance between empirical means (DM),
  • Kullback–Leibler divergence for generalized gamma distributions (KL).
The (DGG1) and (DG) distances are computed between two points x, y on the respective manifold by solving the geodesic equation:
$$\frac{d^2}{dt^2}\gamma^k(t) + \Gamma^k_{ij}\, \frac{d}{dt}\gamma^i(t)\, \frac{d}{dt}\gamma^j(t) = 0,$$
$$\gamma(0) = x,$$
$$\gamma(1) = y.$$
A shooting method was selected for this boundary value problem. It converged in all cases for (DG) but failed to converge on ten pairs for (DGG1). The approximate geodesic distance (DGG2) was defined to circumvent this issue. It is based on the observation that the gamma manifold is isometrically embedded in the generalized gamma manifold when β is constant. There is thus a Riemannian submersion π defined in coordinates by (η, λ, β) ↦ (η, λ). An approximate distance can then be obtained by considering separately the vertical and the horizontal parts. Let p(η₁, λ₁, β₁), p(η₂, λ₂, β₂) be two generalized gamma densities. The energy E₁ of the vertical path t ∈ [0, 1] ↦ γ_β(t) = (η₁, λ₁, (1−t)β₁ + tβ₂) is computed using the formula:
$$E_1 = \int_0^1 g\!\left(\gamma_\beta(t);\, \frac{d}{dt}\gamma_\beta(t), \frac{d}{dt}\gamma_\beta(t)\right) dt = (\beta_2 - \beta_1)^2 \int_0^1 g_{\beta\beta}(\gamma_\beta(t))\, dt.$$
Then, the energy E₂ of the geodesic joining p(η₁, λ₁, β₂) and p(η₂, λ₂, β₂) is computed on the gamma submanifold as the infimum of the integrals:
$$\int_0^1 g\!\left(\gamma(t);\, \frac{d}{dt}\gamma(t), \frac{d}{dt}\gamma(t)\right) dt,$$
where γ is a smooth path joining the two previous points with β constant equal to β₂. The overall approximate distance is then taken to be E₁ + E₂. Please note that it is only a similarity measure, like the Kullback–Leibler divergence, as it fails to be symmetric. A sketch of this two-step computation is given below.
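The vertical term E₁ has a closed-form integrand (the g_ββ coefficient of Proposition 10), so it reduces to a one-dimensional quadrature. The sketch below (ours) computes E₁ in this way; for E₂ it substitutes, purely for illustration, the energy of the straight coordinate path on the gamma submanifold, which upper-bounds the geodesic energy that the paper obtains by shooting.

```python
# Sketch: two-step approximate distance (DGG2) between generalized gamma densities.
import numpy as np
from scipy.special import polygamma, psi
from scipy.integrate import quad

def g_bb(eta, lam, beta):
    lg = np.log(eta)
    return (1 + lam * lg**2 - 2 * lam * psi(lam + 1) * lg
            + lam * psi(lam + 1)**2 + lam * polygamma(1, lam + 1)) / beta**2

def dgg2(p1, p2):
    (e1, l1, b1), (e2, l2, b2) = p1, p2
    # E1: vertical path t -> (e1, l1, (1 - t) * b1 + t * b2)
    E1 = (b2 - b1)**2 * quad(lambda t: g_bb(e1, l1, (1 - t) * b1 + t * b2), 0, 1)[0]
    # E2: energy of the straight path joining (e1, l1) and (e2, l2) on the
    # gamma submanifold (beta = b2), a crude stand-in for the geodesic infimum
    de, dl = e2 - e1, l2 - l1
    def speed2(t):
        e, l = e1 + t * de, l1 + t * dl
        return (l / e**2) * de**2 - 2 * de * dl / e + polygamma(1, l) * dl**2
    E2 = quad(speed2, 0, 1)[0]
    return E1 + E2

print(dgg2((1.0, 2.0, 0.8), (1.5, 3.0, 1.2)))
```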

5.4. Results

The overall process is summarized in Figure 1 below.
The quality of the clustering results was assessed using purity, that is, the proportion of well classified samples. Since the k-medoids algorithm has a random initialization, the values given in Table 2 were computed as the mean of the purity over 100 runs.
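Purity itself is a one-liner once cluster labels and ground truth are available; the helper below (ours) shows what such a computation typically looks like.

```python
# Sketch: purity of a clustering against the HC/AD ground truth.
import numpy as np

def purity(labels, truth):
    return sum(np.bincount(truth[labels == c]).max()
               for c in np.unique(labels)) / len(truth)

truth = np.array([0, 0, 0, 1, 1, 1])   # toy ground truth (0 = HC, 1 = AD)
labels = np.array([0, 0, 1, 1, 1, 1])  # toy cluster assignment
print(purity(labels, truth))           # 5/6 of samples well classified
```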
The best results were obtained using (DGG1) and (DGG2), but, in the first case, one must keep in mind that some distances were impossible to compute: the corresponding samples were thus removed from the dataset. The reference method in the medical imaging community is (DM), which performs slightly better than (DG) and (KL). Due to the small size of the dataset, more testing is needed. A new study with different biomarkers is ongoing.

Author Contributions

F.N. and S.P. have contributed equally to the geometry of the generalized gamma manifold. S.R. has written the part on medical applications.

Funding

This research was funded by the Fondation pour la Recherche Medicale (FRM Grant No. ECO20160736068 to S.R.) and by the ANR grant ANR-14-CE27-0006.

Acknowledgments

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.ucla.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Vemuri, P.; Jack, C.R. Role of structural MRI in Alzheimer’s disease. Alzheimers Res. Ther. 2010, 2, 23.
  2. Cuingnet, R.; Gerardin, E.; Tessieras, J.; Auzias, G.; Lehéricy, S.; Habert, M.O.; Chupin, M.; Benali, H.; Colliot, O.; Alzheimer’s Disease Neuroimaging Initiative. Automatic classification of patients with Alzheimer’s disease from structural MRI: A comparison of ten methods using the ADNI database. NeuroImage 2011, 56, 766–781.
  3. Lama, R.K.; Gwak, J.; Park, J.S.; Lee, S.W. Diagnosis of Alzheimer’s Disease Based on Structural MRI Images Using a Regularized Extreme Learning Machine and PCA Features. J. Healthc. Eng. 2017, 2017, 5485080.
  4. Pini, L.; Pievani, M.; Bocchetta, M.; Altomare, D.; Bosco, P.; Cavedo, E.; Galluzzi, S.; Marizzoni, M.; Frisoni, G.B. Brain atrophy in Alzheimer’s Disease and aging. Ageing Res. Rev. 2016, 30, 25–48.
  5. Busovaca, E.; Zimmerman, M.E.; Meier, I.B.; Griffith, E.Y.; Grieve, S.M.; Korgaonkar, M.S.; Williams, L.M.; Brickman, A.M. Is the Alzheimer’s disease cortical thickness signature a biological marker for memory? Brain Imaging Behav. 2016, 10, 517–523.
  6. Cercignani, M.; Inglese, M.; Pagani, E.; Comi, G.; Filippi, M. Mean Diffusivity and Fractional Anisotropy Histograms of Patients with Multiple Sclerosis. Am. J. Neuroradiol. 2001, 22, 952–958.
  7. Dehmeshki, J.; Silver, N.; Leary, S.; Tofts, P.; Thompson, A.; Miller, D. Magnetisation transfer ratio histogram analysis of primary progressive and other multiple sclerosis subgroups. J. Neurol. Sci. 2001, 185, 11–17.
  8. Rebbah, S.; Delahaye, D.; Puechmorel, S.; Maréchal, P.; Nicol, F.; Berry, I. Classification of Multiple Sclerosis patients using a histogram-based K-Nearest Neighbors algorithm. In Proceedings of the OHBM 2019, 25th Annual Meeting of the Organization for Human Brain Mapping, Rome, Italy, 14 June 2019.
  9. Stacy, E.W. A Generalization of the Gamma Distribution. Ann. Math. Stat. 1962, 33, 1187–1192.
  10. Amoroso, L. Ricerche intorno alla curva dei redditi. Annali di Matematica Pura ed Applicata 1925, 2, 123–159.
  11. Crooks, G.E. The Amoroso Distribution. arXiv 2010, arXiv:1005.3274.
  12. Amari, S. Information Geometry and Its Applications; Applied Mathematical Sciences; Springer: Tokyo, Japan, 2016.
  13. Calin, O.; Udrişte, C. Geometric Modeling in Probability and Statistics; Mathematics and Statistics; Springer International Publishing: Cham, Switzerland, 2014.
  14. Amari, S.; Nagaoka, H. Methods of Information Geometry; Translations of Mathematical Monographs; American Mathematical Society: Providence, RI, USA, 2007.
  15. Shima, H. Geometry of Hessian Structures. In Geometric Science of Information; Nielsen, F., Barbaresco, F., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 37–55.
  16. Duistermaat, J. On Hessian Riemannian structures. Asian J. Math. 2001, 5, 79–91.
  17. Arwini, K.; Dodson, C.; Doig, A.; Sampson, W.; Scharcanski, J.; Felipussi, S. Information Geometry: Near Randomness and Near Independence; Springer: Berlin/Heidelberg, Germany, 2008.
  18. Chavel, I. Riemannian Geometry: A Modern Introduction; Cambridge Studies in Advanced Mathematics; Cambridge University Press: Cambridge, UK, 2006.
  19. Guo, B.-N.; Qi, F.; Zhao, J.-L.; Luo, Q.-M. Sharp Inequalities for Polygamma Functions. Math. Slovaca 2015, 65, 103–120.
  20. Ruiz, E.; Ramírez, J.; Górriz, J.M.; Casillas, J.; Alzheimer’s Disease Neuroimaging Initiative. Alzheimer’s Disease Computer-Aided Diagnosis: Histogram-Based Analysis of Regional MRI Volumes for Feature Selection and Classification. J. Alzheimers Dis. 2018, 65, 819–842.
  21. Giulietti, G.; Torso, M.; Serra, L.; Spanò, B.; Marra, C.; Caltagirone, C.; Cercignani, M.; Bozzali, M.; Alzheimer’s Disease Neuroimaging Initiative (ADNI). Whole brain white matter histogram analysis of diffusion tensor imaging data detects microstructural damage in mild cognitive impairment and Alzheimer’s disease patients. J. Magn. Reson. Imaging 2018.
  22. Querbes, O.; Aubry, F.; Pariente, J.; Lotterie, J.A.; Démonet, J.F.; Duret, V.; Puel, M.; Berry, I.; Fort, J.C.; Celsis, P.; et al. Early diagnosis of Alzheimer’s disease using cortical thickness: Impact of cognitive reserve. Brain J. Neurol. 2009, 132, 2036–2047.
  23. Stacy, E.W.; Mihram, G.A. Parameter Estimation for a Generalized Gamma Distribution. Technometrics 1965, 7, 349–358.
  24. Garey, M.; Johnson, D.; Witsenhausen, H. The complexity of the generalized Lloyd–Max problem (Corresp.). IEEE Trans. Inf. Theory 1982, 28, 255–256.
  25. Lloyd, S. Least squares quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137.
  26. Brigant, A.L.; Puechmorel, S. Quantization and clustering on Riemannian manifolds with an application to air traffic analysis. J. Multivar. Anal. 2019, 173, 685–703.
  27. Kaufman, L.; Rousseeuw, P. Clustering by means of Medoids. In Statistical Data Analysis Based on the L1-Norm and Related Methods; Dodge, Y., Ed.; North-Holland, The Netherlands, 1987; pp. 405–416.
  28. Soni, K.G.; Patel, D.A. Comparative Analysis of K-means and K-medoids Algorithm on IRIS Data. Int. J. Comput. Intell. Res. 2017, 13, 899–906.
Figure 1. General scheme of the proposed approach.
Table 1. Demographic and clinical characteristics of the study population.

                 HC (n = 71)    AD (n = 72)    p-Value
Age (years)      76.1 ± 5.6     77.4 ± 5.5     0.17
Sex (F/M)        38/33          41/31          0.20
MMSE             29 ± 0.9       23.2 ± 2.1     <0.001
Plus–minus values are means ± standard deviation. All p-values are based on analysis of variance (ANOVA) tests, apart from Sex, which is based on a chi-square test (α = 0.05). Abbreviations: HC, Healthy Control; AD, Alzheimer’s disease patients; MMSE, Mini Mental State Examination.
Table 2. Performance of clustering with different similarity measures.

Similarity    Purity
DGG1          0.84
DGG2          0.84
DG            0.78
DM            0.80
KL            0.78
