Article

Geometric Theory of Heat from Souriau Lie Groups Thermodynamics and Koszul Hessian Geometry: Applications in Information Geometry for Exponential Families

by
Frédéric Barbaresco
Advanced Radar Concepts Business Unit, Thales Air Systems, Limours 91470, France
Entropy 2016, 18(11), 386; https://doi.org/10.3390/e18110386
Submission received: 4 August 2016 / Revised: 17 September 2016 / Accepted: 27 September 2016 / Published: 4 November 2016
(This article belongs to the Special Issue Differential Geometrical Theory of Statistics)

Abstract

We introduce the symplectic structure of information geometry based on Souriau’s Lie group thermodynamics model, with a covariant definition of Gibbs equilibrium via invariances through co-adjoint action of a group on its moment space, defining physical observables like energy, heat, and moment as pure geometrical objects. Using the geometric Planck temperature of the Souriau model and the notion of symplectic cocycle, the Fisher metric is identified as a Souriau geometric heat capacity. The Souriau model is based on the affine representation of a Lie group and Lie algebra, which we compare with Koszul’s works on the G/K homogeneous space and the bijective correspondence between the set of G-invariant flat connections on G/K and the set of affine representations of the Lie algebra of G. In the framework of Lie group thermodynamics, an Euler-Poincaré equation is elaborated with respect to thermodynamic variables, and a new variational principle for thermodynamics is built through an invariant Poincaré-Cartan-Souriau integral. The Souriau-Fisher metric is linked to the KKS (Kostant–Kirillov–Souriau) 2-form that associates a canonical homogeneous symplectic manifold to the co-adjoint orbits. We apply this model in the framework of information geometry for the action of an affine group for exponential families, and provide some illustrations of use cases for multivariate Gaussian densities. Information geometry is presented in the context of the seminal work of Fréchet and his Clairaut-Legendre equation. The Souriau model of statistical physics is validated as compatible with the Balian gauge model of thermodynamics. We recall the precursor work of Casalis on affine group invariance for natural exponential families.

When the fact that one encounters is in opposition to a prevailing theory, one must accept the fact and abandon the theory, even when the latter, supported by great names, is generally adopted
—Claude Bernard in “Introduction à l’Étude de la Médecine Expérimentale” [1]
At the outset, the theory of structural stability had seemed to me of such scope and such generality that with it I could hope, in a way, to replace thermodynamics by geometry, to geometrize thermodynamics in a certain sense, to eliminate from thermodynamic considerations all aspects of a measurable and stochastic character, keeping only the corresponding geometric characterization of the attractors.
—René Thom in “Logos et théorie des Catastrophes” [2]

1. Introduction

This MDPI Entropy Special Issue on “Differential Geometrical Theory of Statistics” collects a limited number of selected invited and contributed talks presented during the GSI’15 conference on “Geometric Science of Information” in October 2015. This paper is an extended version of the paper [3] “Symplectic Structure of Information Geometry: Fisher Metric and Euler-Poincaré Equation of Souriau Lie Group Thermodynamics” published in the GSI’15 Proceedings. At the GSI’15 conference, a special session on “Lie groups and geometric mechanics/thermodynamics”, dedicated to Jean-Marie Souriau’s works in statistical physics, was organized by Gery de Saxcé and Frédéric Barbaresco, and Charles-Michel Marle gave an invited talk on “Actions of Lie groups and Lie algebras on symplectic and Poisson manifolds. Application to Lagrangian and Hamiltonian systems”, addressing “Souriau’s thermodynamics of Lie groups”. In honor of Jean-Marie Souriau, who died in 2012, and Claude Vallée [4,5,6], who passed away in 2015, this Special Issue publishes three papers on Souriau’s thermodynamics: Marle’s paper on “From Tools in Symplectic and Poisson Geometry to Souriau’s Theories of Statistical Mechanics and Thermodynamics” [7], de Saxcé’s paper on “Link between Lie Group Statistical Mechanics and Thermodynamics of Continua” [8] and this publication by Barbaresco. This paper also proposes new developments compared to paper [9], where relations between the Souriau and Koszul models were initiated.
This paper, like the papers of Marle and de Saxcé in this Special Issue, is intended to honor the memory of the French physicist Jean-Marie Souriau and to popularize his works, currently little known, on statistical physics and thermodynamics. Souriau is well known for his seminal and major contributions to geometric mechanics, the discipline he created in the 1960s from Lagrange’s earlier works, which he conceptualized in the framework of symplectic geometry. Very few people, however, know or have exploited Souriau’s works contained in Chapter IV of his book “Structure des systèmes dynamiques”, published in 1970 [10] and only translated into English in 1995 in the book “Structure of Dynamical Systems: A Symplectic View of Physics” [11], in which he applied the formalism of geometric mechanics to statistical physics. The author’s personal contribution is to place the work of Souriau in the broader context of the emerging “Geometric Science of Information” [12] (addressed in the GSI’15 conference). The author will show that the Souriau model of statistical physics is particularly well adapted to generalize “information geometry”, and illustrates this for families of exponential densities and multivariate Gaussian densities. The author will observe that the Riemannian metric introduced by Souriau is a generalization of the Fisher metric used in “information geometry”, being identified with the hessian of the logarithm of the generalized partition function (Massieu characteristic function), for the case of densities on homogeneous manifolds where a non-abelian group acts transitively. For a group of time translations, we recover classical thermodynamics, and for the Euclidean space, we recover the classical Fisher metric used in statistics.
The author elaborates a new Euler-Poincaré equation for Souriau’s thermodynamics, acting on the “geometric heat” variable $Q$ (element of the dual Lie algebra) and parameterized by the “geometric temperature” $\beta$ (element of the Lie algebra). The author will integrate Souriau thermodynamics in a variational model by defining an extended Cartan-Poincaré integral invariant based on the Souriau “geometric characteristic function” (the logarithm of the generalized Souriau partition function parameterized by geometric temperature). These results are illustrated for multivariate Gaussian densities, where the associated group is identified in order to compute a Souriau moment map and reduce the Euler-Poincaré equation of geodesics. In addition, the symplectic cocycle and Souriau-Fisher metric are deduced from the Lie group thermodynamics model.
The main contributions of the author in this paper are the following:
  • The Souriau model of Lie group thermodynamics is presented with standard notations of Lie group theory, in place of Souriau equations using less classical conventions (that have limited understanding of his work by his contemporaries).
  • We prove that the Souriau Riemannian metric introduced with the symplectic cocycle is a generalization of the Fisher metric (called the Souriau-Fisher metric in the following) that preserves the property of being defined as a hessian of the partition function logarithm, $g_\beta = -\frac{\partial^2 \Phi}{\partial \beta^2} = \frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$, as in classical information geometry. We then establish the equality of two terms: the first given by Souriau’s definition from the Lie group cocycle $\Theta$, parameterized by the “geometric heat” $Q$ (element of the dual Lie algebra) and the “geometric temperature” $\beta$ (element of the Lie algebra), and the second given by the hessian of the characteristic function $\Phi(\beta) = -\log \psi_\Omega(\beta)$ with respect to the variable $\beta$:
    $$g_\beta([\beta, Z_1], [\beta, Z_2]) = \langle \Theta(Z_1), [\beta, Z_2] \rangle + \langle Q, [Z_1, [\beta, Z_2]] \rangle = \frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$$
    A dual Souriau-Fisher metric, the inverse of this last one, could also be elaborated with the hessian of the “geometric entropy” $s(Q)$ with respect to the variable $Q$: $\frac{\partial^2 s(Q)}{\partial Q^2}$
    For the maximum entropy density (Gibbs density), the following three terms coincide: $\frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$, which describes the convexity of the log-likelihood function; $I(\beta) = -E\left[\frac{\partial^2 \log p_\beta(\xi)}{\partial \beta^2}\right]$, the Fisher metric, which describes the covariance of the log-likelihood gradient; and $I(\beta) = E\left[(\xi - Q)(\xi - Q)^T\right] = Var(\xi)$, which describes the covariance of the observables.
  • This Souriau-Fisher metric is also identified to be proportional to the first derivative of the heat, $g_\beta = -\frac{\partial Q}{\partial \beta}$, and is then comparable by analogy to a geometric “specific heat” or “calorific capacity”.
  • We observe that the Souriau metric is invariant with respect to the action of the group, $I(Ad_g(\beta)) = I(\beta)$, because the action of the group changes the characteristic function $\Phi(\beta)$ only by a term that is linear in $\beta$. As the Fisher metric is proportional to the hessian of the characteristic function, we have the following invariance:
    $$I(Ad_g(\beta)) = -\frac{\partial^2 \left( \Phi - \langle \theta(g^{-1}), \beta \rangle \right)}{\partial \beta^2} = -\frac{\partial^2 \Phi}{\partial \beta^2} = I(\beta)$$
  • We have proposed, based on Souriau’s Lie group model and on analogy with mechanical variables, a variational principle of thermodynamics deduced from the Poincaré-Cartan integral invariant. The variational principle holds on $\mathfrak{g}$, the Lie algebra, for variations $\delta\beta = \dot{\eta} + [\beta, \eta]$, where $\eta(t)$ is an arbitrary path that vanishes at the endpoints, $\eta(a) = \eta(b) = 0$:
    $$\delta \int_{t_0}^{t_1} \Phi(\beta(t))\, dt = 0$$
    where the Poincaré-Cartan integral invariant $\int_{C_a} \Phi(\beta)\, dt = \int_{C_b} \Phi(\beta)\, dt$ is defined with $\Phi(\beta)$, the Massieu characteristic function, with the 1-form $\omega = \Phi(\beta)\, dt = \left( \langle Q, \beta \rangle - s \right) dt = \langle Q, (\beta\, dt) \rangle - s\, dt$
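  • We have deduced Euler-Poincaré equations for the Souriau model:
    $$\frac{dQ}{dt} = ad^*_\beta Q \quad \text{and} \quad \begin{cases} s(Q) = \langle \beta, Q \rangle - \Phi(\beta) \\ \beta = \frac{\partial s(Q)}{\partial Q} \in \mathfrak{g}, \quad Q = \frac{\partial \Phi(\beta)}{\partial \beta} \in \mathfrak{g}^* \end{cases} \quad \text{and} \quad \frac{d}{dt}\left( Ad^*_g Q \right) = 0 \quad \text{with} \quad \begin{cases} \mathfrak{g}^*: \text{dual Lie algebra} \\ ad^*_X Y: \text{coadjoint operator} \end{cases}$$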
    where Q is the Souriau geometric heat (element of dual Lie algebra) and β is the Souriau geometric temperature (element of the Lie algebra). The second equation is linked to the result of Souriau based on the moment map that a symplectic manifold is always a coadjoint orbit, affine of its group of Hamiltonian transformations (a symplectic manifold homogeneous under the action of a Lie group, is isomorphic, up to a covering, to a coadjoint orbit; symplectic leaves are the orbits of the affine action that makes the moment map equivariant).
  • We have established that the affine representation of Lie group and Lie algebra by Jean-Marie Souriau is equivalent to Jean-Louis Koszul’s affine representation developed in the framework of hessian geometry of convex sharp cones. Both Souriau and Koszul have elaborated equations requested for Lie group and Lie algebra to ensure the existence of an affine representation. We have compared both approaches of Souriau and Koszul in a table.
  • We have applied the Souriau model for exponential families and especially for multivariate Gaussian densities.
  • We have applied the Souriau-Koszul model Gibbs density to compute the maximum entropy density for symmetric positive definite matrices, using the inner product $\langle \eta, \xi \rangle = Tr(\eta^T \xi),\ \forall \eta, \xi \in Sym(n)$ given by the Cartan-Killing form. The Gibbs density (a generalization of the Gaussian law for these matrices, defined as the maximum entropy density) is:
    $$p_{\hat{\xi}}(\xi) = e^{-\langle \Theta^{-1}(\hat{\xi}), \xi \rangle + \Phi(\Theta^{-1}(\hat{\xi}))} = \psi_\Omega(I_d) \cdot \left[ \det\left( \alpha \hat{\xi}^{-1} \right) \right] \cdot e^{-Tr\left( \alpha \hat{\xi}^{-1} \xi \right)} \quad \text{with} \quad \alpha = \frac{n+1}{2}$$
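  • For the case of multivariate Gaussian densities, we have considered $GA(n)$, a sub-group of the affine group, that we defined by an (n + 1) × (n + 1) embedding in the matrix Lie group $G_{aff}$, and that acts for multivariate Gaussian laws by:
    $$\begin{bmatrix} Y \\ 1 \end{bmatrix} = \begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ 1 \end{bmatrix} = \begin{bmatrix} R^{1/2} X + m \\ 1 \end{bmatrix}, \quad \begin{cases} (m, R) \in \mathbb{R}^n \times Sym^+(n) \\ M = \begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix} \in G_{aff} \end{cases}, \quad X \sim \mathcal{N}(0, I) \rightarrow Y \sim \mathcal{N}(m, R)$$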
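  • For multivariate Gaussian densities, as we have identified the acting sub-group of the affine group $M$, we have also developed the computation of the associated Lie algebras $\eta_L$ and $\eta_R$, the adjoint and coadjoint operators, and especially the Souriau “moment map” $\Pi_R$:
    $$\langle \eta_L, M^{-1} \eta_R M \rangle = \langle \Pi_R, \eta_R \rangle \quad \text{with} \quad M = \begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix}, \quad \eta_L = \begin{bmatrix} R^{-1/2} \dot{R}^{1/2} & R^{-1/2} \dot{m} \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad \eta_R = \begin{bmatrix} R^{-1/2} \dot{R}^{1/2} & \dot{m} - R^{-1/2} \dot{R}^{1/2} m \\ 0 & 0 \end{bmatrix}$$
    $$\Rightarrow \Pi_R = \begin{bmatrix} R^{-1/2} \dot{R}^{1/2} + R^{-1} \dot{m} m^T & R^{-1} \dot{m} \\ 0 & 0 \end{bmatrix}$$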
    Using the Souriau theorem (geometrization of the Noether theorem), we use the property that this moment map $\Pi_R$ is constant (its components are equal to the Noether invariants):
    $$\frac{d\Pi_R}{dt} = 0 \Rightarrow \begin{cases} R^{-1} \dot{R} + R^{-1} \dot{m} m^T = B = cste \\ R^{-1} \dot{m} = b = cste \end{cases}$$
    to reduce the Euler-Lagrange equation of geodesics between two multivariate Gaussian densities:
    $$\begin{cases} \ddot{R} + \dot{m} \dot{m}^T - \dot{R} R^{-1} \dot{R} = 0 \\ \ddot{m} - \dot{R} R^{-1} \dot{m} = 0 \end{cases}$$
    to this reduced equation of geodesics:
    $$\begin{cases} \dot{m} = R b \\ \dot{R} = R \left( B - b m^T \right) \end{cases}$$
    that we solve by a “geodesic shooting” technique based on the Eriksen equation of the exponential map.
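  • For the families of multivariate Gaussian densities, which we have identified as a homogeneous manifold with the associated sub-group of the affine group $\begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix}$, we have considered the elements of exponential families that play the role of the geometric heat $Q$ in Souriau Lie group thermodynamics, and $\beta$, the geometric (Planck) temperature:
    $$Q = \hat{\xi} = \begin{bmatrix} E[z] \\ E[z z^T] \end{bmatrix} = \begin{bmatrix} m \\ R + m m^T \end{bmatrix}, \quad \beta = \begin{bmatrix} -R^{-1} m \\ \frac{1}{2} R^{-1} \end{bmatrix}$$
    We have considered that these elements are homeomorphic to the (n + 1) × (n + 1) matrix elements:
    $$Q = \hat{\xi} = \begin{bmatrix} R + m m^T & m \\ 0 & 0 \end{bmatrix} \in \mathfrak{g}^*, \quad \beta = \begin{bmatrix} \frac{1}{2} R^{-1} & -R^{-1} m \\ 0 & 0 \end{bmatrix} \in \mathfrak{g}$$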
    to compute the Souriau symplectic cocycle of the Lie group:
    $$\theta(M) = \hat{\xi}\left( Ad_M(\beta) \right) - Ad^*_M \hat{\xi}$$
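    where, for the acting group element $M = \begin{bmatrix} R'^{1/2} & m' \\ 0 & 1 \end{bmatrix}$, the adjoint operator is equal to:
    $$Ad_M \beta = \begin{bmatrix} \frac{1}{2} \Omega^{-1} & -\Omega^{-1} n \\ 0 & 0 \end{bmatrix} \quad \text{with} \quad \Omega = R'^{1/2} R R'^{-1/2} \quad \text{and} \quad n = \left( \frac{1}{2} m' + R'^{1/2} m \right)$$
    with
    $$\hat{\xi}\left( Ad_M(\beta) \right) = \begin{bmatrix} \Omega + n n^T & n \\ 0 & 0 \end{bmatrix}$$
    and the co-adjoint operator:
    $$Ad^*_M \hat{\xi} = \begin{bmatrix} R + m m^T - m m'^T & R'^{1/2} m \\ 0 & 0 \end{bmatrix}$$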
  • Finally, we have computed the Souriau-Fisher metric $g_\beta([\beta, Z_1], [\beta, Z_2]) = \tilde{\Theta}_\beta(Z_1, [\beta, Z_2])$ for multivariate Gaussian densities, given by:
    $$g_\beta([\beta, Z_1], [\beta, Z_2]) = \tilde{\Theta}_\beta(Z_1, [\beta, Z_2]) = \tilde{\Theta}(Z_1, [\beta, Z_2]) + \langle \hat{\xi}, [Z_1, [\beta, Z_2]] \rangle = \langle \Theta(Z_1), [\beta, Z_2] \rangle + \langle \hat{\xi}, [Z_1, [\beta, Z_2]] \rangle$$
    with the element of the Lie algebra given by $Z = \begin{bmatrix} \frac{1}{2} \Omega^{-1} & -\Omega^{-1} n \\ 0 & 0 \end{bmatrix}$.
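Two of the matrix identities listed above lend themselves to a quick numerical sanity check. The sketch below is illustrative only and is not the author's implementation: it assumes NumPy, all function names are hypothetical, and it takes the acting group element $M$ to carry its own parameters $(m', R')$, distinct from the $(m, R)$ of $\beta$. It checks (i) that $M \beta M^{-1}$ has the block form $\frac{1}{2}\Omega^{-1}$, $-\Omega^{-1} n$ with $\Omega = R'^{1/2} R R'^{-1/2}$ and $n = \frac{1}{2} m' + R'^{1/2} m$, and (ii) that the reduced equations $\dot{m} = Rb$, $\dot{R} = R(B - b m^T)$, with $b$ and $B$ held constant, satisfy the Euler-Lagrange geodesic equations. Check (ii) is purely algebraic, so the symmetry of $\dot{R}$ is not imposed on the random test data.

```python
import numpy as np


def random_spd(n, rng):
    """Random symmetric positive definite n x n matrix."""
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)


def spd_sqrt(s):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(s)
    return (v * np.sqrt(w)) @ v.T


def embed(block, col, corner=0.0):
    """(n+1) x (n+1) matrix [[block, col], [0, corner]]."""
    n = block.shape[0]
    out = np.zeros((n + 1, n + 1))
    out[:n, :n] = block
    out[:n, n] = col
    out[n, n] = corner
    return out


def adjoint_residual(n=3, seed=0):
    """Max abs difference between M beta M^{-1} and its closed form."""
    rng = np.random.default_rng(seed)
    r, r_p = random_spd(n, rng), random_spd(n, rng)
    m, m_p = rng.standard_normal(n), rng.standard_normal(n)
    a = spd_sqrt(r_p)                          # R'^{1/2}
    big_m = embed(a, m_p, corner=1.0)          # acting group element M
    r_inv = np.linalg.inv(r)
    beta = embed(0.5 * r_inv, -r_inv @ m)      # geometric temperature
    lhs = big_m @ beta @ np.linalg.inv(big_m)  # Ad_M(beta)
    omega = a @ r @ np.linalg.inv(a)           # R'^{1/2} R R'^{-1/2}
    n_vec = 0.5 * m_p + a @ m                  # m'/2 + R'^{1/2} m
    omega_inv = np.linalg.inv(omega)
    rhs = embed(0.5 * omega_inv, -omega_inv @ n_vec)
    return float(np.abs(lhs - rhs).max())


def geodesic_residuals(n=3, seed=1):
    """Euler-Lagrange residuals evaluated on the reduced equations."""
    rng = np.random.default_rng(seed)
    r = random_spd(n, rng)
    m, b = rng.standard_normal(n), rng.standard_normal(n)
    big_b = rng.standard_normal((n, n))
    k = big_b - np.outer(b, m)                 # B - b m^T
    m_dot = r @ b                              # reduced equation for m
    r_dot = r @ k                              # reduced equation for R
    m_ddot = r_dot @ b                         # b is constant in time
    r_ddot = r_dot @ k - r @ np.outer(b, m_dot)  # B and b constant in time
    r_inv = np.linalg.inv(r)
    res_r = r_ddot + np.outer(m_dot, m_dot) - r_dot @ r_inv @ r_dot
    res_m = m_ddot - r_dot @ r_inv @ m_dot
    return float(np.abs(res_r).max()), float(np.abs(res_m).max())
```

Both functions return residuals that should vanish up to floating-point round-off. A full geodesic shooting solver would additionally integrate the reduced system in time, which this sketch does not attempt.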
The plan of the paper is as follows. After this introduction in Section 1, we develop in Section 2 the position of the Souriau symplectic model of statistical physics in the historical developments of thermodynamic concepts. In Section 3, we develop and revisit the Lie group thermodynamics model of Jean-Marie Souriau in modern notations. In Section 4, we make the link between the Souriau Riemannian metric and the Fisher metric, defined as a geometric heat capacity of Lie group thermodynamics. In Section 5, we elaborate Euler-Lagrange equations of Lie group thermodynamics and a variational model based on the Poincaré-Cartan integral invariant. In Section 6, we explore the Souriau affine representation of Lie group and Lie algebra (including the notions of affine representations and cocycles, Souriau moment map and cocycles, equivariance of the Souriau moment map, action of a Lie group on a symplectic manifold, and dual spaces of finite-dimensional Lie algebras), and we analyze the links and parallels with the Koszul affine representation, developed in another context (the comparison is synthesized in a table). In Section 7, we illustrate the Koszul and Souriau Lie group models of information geometry for multivariate Gaussian densities. In Section 8, after identifying the affine group acting for these densities, we compute the Souriau moment map to obtain the Euler-Poincaré equation, solved by the geodesic shooting method. In Section 9, the Souriau Riemannian metric defined by the cocycle is computed for multivariate Gaussian densities. We give a conclusion in Section 10 with research prospects in the framework of affine Poisson geometry [13], Bismut stochastic mechanics [14] and second order extension of the Gibbs state [15,16].
We have three appendices: Appendix A develops the Clairaut(-Legendre) equation of Maurice Fréchet associated to “distinguished functions” as a seminal equation of information geometry; Appendix B is about a Balian Gauge model of thermodynamics and its compliance with the Souriau model; Appendix C is devoted to the link of Casalis-Letac’s works on affine group invariance for natural exponential families with Souriau’s works.

2. Position of Souriau Symplectic Model of Statistical Physics in Historical Developments of Thermodynamic Concepts

In this Section, we will explain the emergence of thermodynamic concepts that give rise to the generalization of the Souriau model of statistical physics. To understand Souriau’s theoretical model of heat, we have to consider first his work in geometric mechanics where he introduced the concept of “moment map” and “symplectic cohomology”. We will then introduce the concept of “characteristic function” developed by François Massieu, and generalized by Souriau on homogeneous symplectic manifolds. In his statistical physics model, Souriau has also generalized the notion of “heat capacity” that was initially extended by Pierre Duhem as a key structure to jointly consider mechanics and thermodynamics under the umbrella of the same theory. Pierre Duhem has also integrated, in the corpus, the Massieu’s characteristic function as a thermodynamic potential. Souriau’s idea to develop a covariant model of Gibbs density on homogeneous manifold was also influenced by the seminal work of Constantin Carathéodory that axiomatized thermodynamics in 1909 based on Carnot’s works. Souriau has adapted his geometric mechanical model for the theory of heat, where Henri Poincaré did not succeed in his paper on attempts of mechanical explanation for the principles of thermodynamics.
Lagrange’s works on “mécanique analytique” (analytic mechanics) have been interpreted by Jean-Marie Souriau in the framework of differential geometry and have initiated a new discipline called, after Souriau, “mécanique géométrique” (geometric mechanics) [17,18,19]. Souriau observed that the collection of motions of a dynamical system is a manifold with an antisymmetric flat tensor that is a symplectic form, a structure that contains all the pertinent information on the state of the system (positions, velocities, forces, etc.). Souriau said: “Ce que Lagrange a vu, que n’a pas vu Laplace, c’était la structure symplectique” (What Lagrange saw, that Laplace didn’t see, was the symplectic structure) [20]. Using the symmetries of a symplectic manifold, Souriau introduced a mapping which he called the “moment map” [21,22,23], which takes its values in a space attached to the group of symmetries (in the dual space of its Lie algebra). He [10] called dynamical groups every finite-dimensional group of symplectomorphisms (an isomorphism between symplectic manifolds, a transformation of phase space that is volume-preserving), and introduced the Galileo group for classical mechanics and the Poincaré group for relativistic mechanics (both are sub-groups of the affine group [24,25]). For instance, an element of the Galileo group can be represented in matrix form by (with $A$ the rotation, $b$ the boost, $c$ the space translation and $e$ the time translation):
$$\begin{bmatrix} x' \\ t' \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} A & b & c \\ 0 & 1 & e \\ 0 & 0 & 1 \end{bmatrix}}_{\text{Galileo group}} \begin{bmatrix} x \\ t \\ 1 \end{bmatrix} \quad \text{with} \quad \begin{cases} A \in SO(3) \\ b, c \in \mathbb{R}^3 \\ e \in \mathbb{R} \end{cases}, \quad \text{Lie algebra} \quad \begin{bmatrix} \omega & \eta & \gamma \\ 0 & 0 & \varepsilon \\ 0 & 0 & 0 \end{bmatrix} \quad \text{with} \quad \begin{cases} \omega \in \mathfrak{so}(3) \\ \eta, \gamma \in \mathbb{R}^3 \\ \varepsilon \in \mathbb{R}^+ \end{cases}$$
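The matrix form makes the group law a plain matrix product. As a minimal sketch (assuming NumPy; the helper names `rot_z`, `galileo` and `act` are hypothetical and introduced only for illustration), the code below builds the 5 × 5 representation from a rotation, a boost, a space translation and a time translation, and applies it to an event $(x, t)$:

```python
import numpy as np


def rot_z(angle):
    """Rotation of angle `angle` about the z axis, an element of SO(3)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def galileo(a, b, c, e):
    """5 x 5 matrix [[A, b, c], [0, 1, e], [0, 0, 1]] of a Galileo element."""
    g = np.eye(5)
    g[:3, :3] = a      # rotation A
    g[:3, 3] = b       # boost b
    g[:3, 4] = c       # space translation c
    g[3, 4] = e        # time translation e
    return g


def act(g, x, t):
    """Apply g to the event (x, t): x' = A x + b t + c, t' = t + e."""
    y = g @ np.concatenate([x, [t, 1.0]])
    return y[:3], float(y[3])
```

Multiplying two such matrices gives again a matrix of the same shape, with parameters $(A_1 A_2,\ A_1 b_2 + b_1,\ A_1 c_2 + b_1 e_2 + c_1,\ e_1 + e_2)$, which is the composition law of the group in this representation.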
Souriau associated to this moment map the notion of symplectic cohomology, linked to the fact that such a moment is defined up to an additive constant that brings into play an algebraic mechanism (called cohomology). Souriau proved that the moment map is a constant of the motion, and provided a geometric generalization of Emmy Noether’s invariant theorem (the invariants of E. Noether’s theorem are the components of the moment map). For instance, Souriau gave an ontological definition of mass in classical mechanics as the measure of the symplectic cohomology of the action of the Galileo group (the mass is no longer an arbitrary variable but a characteristic of the space). This is no longer true for the Poincaré group in relativistic mechanics, where the symplectic cohomology is null, explaining the lack of conservation of mass in relativity. All the details of classical mechanics thus appear as geometric necessities, as ontological elements. Souriau also observed that the symplectic structure can be reconstructed from its symmetries alone, through a 2-form (called the Kirillov–Kostant–Souriau form) defined on coadjoint orbits. Souriau said that the different versions of mechanical science can be classified by the geometry that each implies for space and time; the geometry is determined by the covariance of group theory. Thus, Newtonian mechanics is covariant under the group of Galileo, relativity under the group of Poincaré, and general relativity under the “smooth” group (the group of diffeomorphisms of space-time). Souriau added: “However, there are some statements of mechanics whose covariance belongs to a fourth group rarely considered: the affine group, a group shown in the following diagram for inclusion. How is it possible that a unitary point of view (which would be necessarily a true thermodynamics), has not yet come to crown the picture? Mystery...” [26]. See Figure 1.
As early as 1966, Souriau applied his theory to statistical mechanics, developed it in Chapter IV of his book “Structure of Dynamical Systems” [11], and elaborated what he called a “Lie group thermodynamics” [10,11,27,28,29,30,31,32,33,34,35,36,37]. Using Lagrange’s viewpoint, in Souriau statistical mechanics, a statistical state is a probability measure on the manifold of motions (and no longer in phase space [38]). Souriau observed that Gibbs equilibrium [39] is not covariant with respect to the dynamic groups of physics. To solve this breaking of symmetry, Souriau introduced a new “geometric theory of heat” where the equilibrium states are indexed by a parameter $\beta$ with values in the Lie algebra of the group, generalizing the Gibbs equilibrium states, where $\beta$ plays the role of a geometric (Planck) temperature. The invariance with respect to the group, and the fact that the entropy $s$ is a convex function of this geometric temperature $\beta$, impose very strict, universal conditions (e.g., there necessarily exists a critical temperature beyond which no equilibrium can exist). Souriau observed that the group of time translations of classical thermodynamics [40,41] is not a normal subgroup of the Galileo group, proving that if a dynamical system is conservative in an inertial reference frame, it need not be conservative in another. Based on this fact, Souriau generalized the formulation of the Gibbs principle to become compatible with Galileo relativity in classical mechanics and with Poincaré relativity in relativistic mechanics.
The maximum entropy principle [42,43,44,45,46,47,48,49,50,51] is preserved, and the Gibbs density is given by the density of maximum entropy (among the equilibrium states for which the average value of the energy takes a prescribed value, the Gibbs measures are those which have the largest entropy), but with a new principle: “If a dynamical system is invariant under a Lie subgroup G’ of the Galileo group, then the natural equilibria of the system forms the Gibbs ensemble of the dynamical group G’” [10]. The classical notion of Gibbs canonical ensemble is extended to a homogeneous symplectic manifold on which a Lie group (dynamic group) has a symplectic action. When the group is not abelian (non-commutative group), the symmetry is broken, and new “cohomological” relations should be verified in the Lie algebra of the group [52,53,54,55]. A natural equilibrium state will thus be characterized by an element of the Lie algebra of the Lie group, determining the equilibrium temperature $\beta$. The entropy $s(Q)$, parametrized by $Q$, the geometric heat (mean of the energy $U$, element of the dual Lie algebra), is defined by the Legendre transform [56,57,58,59] of the Massieu potential $\Phi(\beta)$ parametrized by $\beta$ ($\Phi(\beta)$ is the negative logarithm of the partition function $\psi_\Omega(\beta)$):
$$s(Q) = \langle \beta, Q \rangle - \Phi(\beta) \quad \text{with} \quad \begin{cases} Q = \frac{\partial \Phi}{\partial \beta} \in \mathfrak{g}^* \\ \beta = \frac{\partial s}{\partial Q} \in \mathfrak{g} \end{cases}$$
$$p_{Gibbs}(\xi) = e^{\Phi(\beta) - \langle \beta, U(\xi) \rangle} = \frac{e^{-\langle \beta, U(\xi) \rangle}}{\int_M e^{-\langle \beta, U(\xi) \rangle} d\omega}, \quad Q = \frac{\partial \Phi(\beta)}{\partial \beta} = \frac{\int_M U(\xi)\, e^{-\langle \beta, U(\xi) \rangle} d\omega}{\int_M e^{-\langle \beta, U(\xi) \rangle} d\omega} = \int_M U(\xi)\, p(\xi)\, d\omega \quad \text{with} \quad \Phi(\beta) = -\log \int_M e^{-\langle \beta, U(\xi) \rangle} d\omega$$
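These relations can be checked on a toy case. The sketch below is only an illustration under simplifying assumptions that are not in the source: a finite set of states with scalar energies $U_i$ replaces the manifold $M$, the measure $d\omega$ becomes a plain sum, and $\beta$ is a positive scalar (the abelian case of time translations). The function names are hypothetical.

```python
import math


def massieu(beta, energies):
    """Phi(beta) = -log sum_i exp(-beta * U_i) on a finite state space."""
    return -math.log(sum(math.exp(-beta * u) for u in energies))


def gibbs(beta, energies):
    """Gibbs probabilities p_i = exp(Phi(beta) - beta * U_i)."""
    phi = massieu(beta, energies)
    return [math.exp(phi - beta * u) for u in energies]


def heat(beta, energies):
    """Q = mean energy under the Gibbs density."""
    return sum(p * u for p, u in zip(gibbs(beta, energies), energies))


def entropy(beta, energies):
    """Shannon entropy -sum_i p_i log p_i of the Gibbs density."""
    return -sum(p * math.log(p) for p in gibbs(beta, energies))


def d_massieu(beta, energies, h=1e-5):
    """Central finite-difference approximation of dPhi/dbeta."""
    return (massieu(beta + h, energies) - massieu(beta - h, energies)) / (2 * h)
```

With these definitions, `entropy(beta, U)` agrees with the Legendre transform `beta * heat(beta, U) - massieu(beta, U)`, and `d_massieu(beta, U)` approximates `heat(beta, U)`, mirroring $s(Q) = \langle \beta, Q \rangle - \Phi(\beta)$ and $Q = \partial \Phi / \partial \beta$.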
Souriau completed his “geometric heat theory” by introducing a 2-form in the Lie algebra, that is a Riemannian metric tensor in the values of the adjoint orbit of $\beta$, $[\beta, Z]$ with $Z$ an element of the Lie algebra. This metric is given for $(\beta, Q)$:
$$g_\beta([\beta, Z_1], [\beta, Z_2]) = \langle \Theta(Z_1), [\beta, Z_2] \rangle + \langle Q, [Z_1, [\beta, Z_2]] \rangle$$
where $\Theta$ is a cocycle of the Lie algebra, defined by $\Theta = T_e \theta$ with $\theta$ a cocycle of the Lie group defined by $\theta(M) = Q(Ad_M(\beta)) - Ad^*_M Q$. We have observed that this metric $g_\beta$ is also given by the hessian of the Massieu potential, $g_\beta = -\frac{\partial^2 \Phi}{\partial \beta^2} = \frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$, as is the Fisher metric in classical information geometry theory [60], and so this is a generalization of the Fisher metric for homogeneous manifolds. We call this new metric the Souriau-Fisher metric. As $g_\beta = -\frac{\partial Q}{\partial \beta}$, Souriau compared it by analogy with classical thermodynamics to a “geometric specific heat” (geometric calorific capacity).
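The link between the hessian form of the metric and a heat capacity can be written out in a few lines. The derivation below is a sketch that uses only the definitions of $\Phi(\beta)$ and $Q$ given above. Its last line is restricted to the abelian case of time translations with scalar $\beta = 1/T$, in units where the Boltzmann constant equals 1, which is an assumption made here for illustration.

```latex
\begin{aligned}
\Phi(\beta) &= -\log \int_M e^{-\langle \beta, U(\xi) \rangle}\, d\omega ,
\qquad
Q = \frac{\partial \Phi}{\partial \beta}
  = \int_M U(\xi)\, p_{Gibbs}(\xi)\, d\omega = E[U], \\
g_\beta &= -\frac{\partial^2 \Phi}{\partial \beta^2}
  = -\frac{\partial Q}{\partial \beta}
  = E\left[ U U^T \right] - E[U]\, E[U]^T
  = E\left[ (U - Q)(U - Q)^T \right] \succeq 0, \\
\frac{dQ}{dT} &= \frac{\partial Q}{\partial \beta}\, \frac{d\beta}{dT}
  = \left( -g_\beta \right) \left( -\frac{1}{T^2} \right)
  = \frac{g_\beta}{T^2}
  \qquad \text{for } \beta = \frac{1}{T}.
\end{aligned}
```

In this scalar case the metric is therefore the classical heat capacity $dQ/dT$ up to the factor $T^2$, and its positivity is the positivity of the variance of the energy.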
The potential theory of thermodynamics and the introduction of the “characteristic function” (the previous function $\Phi(\beta) = -\log \psi_\Omega(\beta)$ in Souriau theory) were initiated by François Jacques Dominique Massieu [61,62,63,64]. Massieu was the son of Pierre François Marie Massieu and Thérèse Claire Castel. He married Mlle Morand in 1862 and had two children. He graduated from the Ecole Polytechnique in 1851 and from the Ecole des Mines de Paris in 1856, and then joined the “Corps des Mines”. He defended his Ph.D. in 1861 on “Sur les intégrales algébriques des problèmes de mécanique” and on “Sur le mode de propagation des ondes planes et la surface de l’onde élémentaire dans les cristaux biréfringents à deux axes” [65], with a jury composed of Lamé, Delaunay and Puiseux. In 1870, François Massieu presented his paper on “characteristic functions of the various fluids and the theory of vapors” [61] to the French Academy of Sciences. The conception of the characteristic function is Massieu’s finest scientific achievement. A prominent judge, Joseph Bertrand, did not hesitate to declare, in a statement read to the French Academy of Sciences on 25 July 1870, that “the introduction of this function in formulas that summarize all the possible consequences of the two fundamental theorems seems, for the theory, a similar service almost equivalent to that Clausius has made by linking the Carnot’s theorem to entropy” [66]. The final manuscript was published by Massieu in 1873, “Exposé des principes fondamentaux de la théorie mécanique de la chaleur (Note destinée à servir d’introduction au Mémoire de l’auteur sur les fonctions caractéristiques des divers fluides et la théorie des vapeurs)” [63].
Massieu introduced the following potential $\Phi(\beta)$, called the “characteristic function”, as illustrated in Figure 2, which is the potential used by Souriau to generalize the theory: $s(Q) = \langle \beta, Q \rangle - \Phi(\beta) \underset{\beta = \frac{1}{T}}{\Rightarrow} \Phi = \frac{Q}{T} - S$. However, in his third paper, Massieu was influenced by M. Bertrand, as illustrated in Figure 3, to replace the variable $\beta = \frac{1}{T}$ (that he used in his first two papers) by $T$. We then have to wait 50 more years for the paper of Planck, who reintroduced the appropriate variable $\beta = \frac{1}{T}$, which was then generalized by Souriau, giving the Planck temperature $\beta$ an ontological and geometric status as an element of the Lie algebra of the dynamic group.
This Lie group thermodynamics of Souriau is able to explain astronomical phenomena (rotation of celestial bodies: the Earth and the stars rotating about themselves). The geometric temperature $\beta$ can also be interpreted as a space-time vector (generalization of the temperature vector of Planck), where the temperature vector and entropy flux are in duality, unifying heat conduction and viscosity (equations of Fourier and Navier). In the case of a centrifuge system (e.g., used for enrichment of uranium), the Gibbs equilibrium states [60,67] are given by Souriau equations as the variation in concentration of the components of an inhomogeneous gas. Classical statistical mechanics corresponds to the dynamical group of time translations, for which we recover from Souriau equations the concepts and principles of classical thermodynamics (temperature, energy, heat, work, entropy, thermodynamic potentials) and of the kinetic theory of gases (pressure, specific heats, Maxwell’s velocity distribution, etc.).
Souriau also studied continuous medium thermodynamics, where the “temperature vector” is no longer constrained to be in the Lie algebra, but only constrained by phenomenological equations (e.g., Navier equations, etc.). For thermodynamic equilibrium, the “temperature vector” is then a Killing vector of space-time. For each point $X$, there is a “temperature vector” $\beta(X)$, such that it is an infinitesimal conformal transform of the metric of the universe $g_{ij}$. Conservation equations can then be deduced for the components of the momentum-energy tensor $T^{ij}$ and the entropy flux $S^j$, with $\hat{\partial}_i T^{ij} = 0$ and $\partial_i S^j = 0$. Temperature and metric are related by the following equations:
$$\begin{cases} \hat{\partial}_i \beta_j + \hat{\partial}_j \beta_i = \lambda g_{ij} \\ \partial_i \beta_j + \partial_j \beta_i - 2 \Gamma^k_{ij} \beta_k = \lambda g_{ij} \end{cases} \quad \text{with} \quad \begin{cases} \hat{\partial}_i\, \cdot\,: \text{covariant derivative} \\ \beta_j: \text{component of the temperature vector} \\ \lambda = 0 \Rightarrow \text{Killing equation} \end{cases}$$
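As a small illustration of these equations, the sketch below evaluates $\partial_i \beta_j + \partial_j \beta_i$ by central finite differences. It assumes a flat two-dimensional Euclidean metric in Cartesian coordinates (so the Christoffel symbols vanish and $g_{ij} = \delta_{ij}$), which is a simplification chosen here and not the space-time setting of the text. The field names are hypothetical. A rigid rotation field gives $\lambda = 0$ (Killing equation), while a dilation field gives $\lambda = 2$ (conformal but not Killing).

```python
def symmetrized_gradient(field, point, h=1e-6):
    """2 x 2 matrix d_i beta_j + d_j beta_i, by central differences."""
    def partial(i, j):
        # Derivative of component j of the field along coordinate i.
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        return (field(*plus)[j] - field(*minus)[j]) / (2.0 * h)

    return [[partial(i, j) + partial(j, i) for j in range(2)]
            for i in range(2)]


def rotation(x, y, omega=2.0):
    """Rigid rotation field with angular velocity omega (Killing field)."""
    return (-omega * y, omega * x)


def dilation(x, y):
    """Radial dilation field (conformal with lambda = 2, not Killing)."""
    return (x, y)
```

Calling `symmetrized_gradient(rotation, (0.3, -1.2))` returns a matrix that vanishes up to round-off, whereas `symmetrized_gradient(dilation, (0.3, -1.2))` is close to twice the identity.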
Leon Brillouin made the link between Boltzmann entropy and the negentropy of information theory [68,69,70,71], but before Jean-Marie Souriau, only Constantin Carathéodory and Pierre Duhem [72,73,74,75] initiated the first theoretical works to generalize thermodynamics.
After three years as lecturer at Lille university, Duhem published a paper in the official revue of the Ecole Normale Supérieure, in 1891, “On general equations of thermodynamics” [72] (Sur les équations générales de la Thermodynamique) in Annales Scientifiques de l’Ecole Normale Supérieure. Duhem generalized the concept of “virtual work” under the action of “external actions” by taking into account both mechanical and thermal actions. In 1894, the design of a generalized mechanics based on thermodynamics was further developed: ordinary mechanics had already become “a particular case of a more general science”. Duhem writes “We made dynamics a special case of thermodynamics, a science that embraces common principles in all changes of state bodies, changes of places as well as changes in physical qualities” (Nous avons fait de la dynamique un cas particulier de la thermodynamique, une Science qui embrasse dans des principes communs tous les changements d’état des corps, aussi bien les changements de lieu que les changements de qualités physiques). In the equations of his generalized mechanics-thermodynamics, some new terms had to be introduced, in order to account for the intrinsic viscosity and friction of the system. As observed by Stefano Bordoni, Duhem aimed at widening the scope of physics: the new physics could not confine itself to “local motion” but had to describe what Duhem qualified “motions of modification”. If Boltzmann had tried to proceed from “local motion” to attain the explanation of more complex transformations, Duhem was trying to proceed from general laws concerning general transformation in order to reach “local motion” as a simplified specific case. 
Four scientists were credited by Duhem with having carried out “the most important researches on that subject”: Massieu had managed to derive thermodynamics from a “characteristic function and its partial derivatives”; Gibbs had shown that Massieu’s functions “could play the role of potentials in the determination of the states of equilibrium” in a given system; von Helmholtz had put forward “similar ideas”; von Oettingen had given “an exposition of thermodynamics of remarkable generality” based on general duality concept in “Die thermodynamischen Beziehungen antithetisch entwickelt” published at St. Petersburg in 1885. Duhem took into account a system whose elements had the same temperature and where the state of the system could be completely specified by giving its temperature and n other independent quantities. He then introduced some “external forces”, and held the system in equilibrium. A virtual work corresponded to such forces, and a set of n + 1 equations corresponded to the condition of equilibrium of the physical system. From the thermodynamic point of view, every infinitesimal transformation involving the generalized displacements had to obey to the first law, which could be expressed in terms of the (n + 1) generalized Lagrangian parameters. The amount of heat could be written as a sum of (n + 1) terms. The new alliance between mechanics and thermodynamics led to a sort of symmetry between thermal and mechanical quantities. The n + 1 functions played the role of generalized thermal capacities, and the last term was nothing other than the ordinary thermal capacity. The knowledge of the “equilibrium equations of a system” allowed Duhem to compute the partial derivatives of the thermal capacity with regard to all the parameters which described the state of the system, apart from its derivative with regard to temperature. The thermal capacities were therefore known “except for an unspecified function of temperature”.
The axiomatic approach to thermodynamics was published in 1909 in Mathematische Annalen [76] under the title “Examination of the Foundations of Thermodynamics” (Untersuchungen über die Grundlagen der Thermodynamik) by Constantin Carathéodory, based on Carnot’s works [77]. Carathéodory introduced entropy through a mathematical approach based on the geometric behavior of a certain class of partial differential equations called Pfaffians. Carathéodory’s investigations start by revisiting the first law and reformulating the second law of thermodynamics in the form of two axioms. The first axiom applies to a multiphase system change under adiabatic conditions (axiom of classical thermodynamics due to Clausius [78,79]). The second axiom assumes that in the neighborhood of any equilibrium state of a system (of any number of thermodynamic coordinates), there exist states that are inaccessible by reversible adiabatic processes. In the book of Misha Gromov “Metric Structures for Riemannian and Non-Riemannian Spaces”, written and edited by Pierre Pansu and Jacques Lafontaine, a new metric is introduced, called the “Carnot-Carathéodory metric”. In one of his papers, Misha Gromov [80,81] gives these historical remarks: “This result (which seems obvious by the modern standards) appears (in a more general form) in the 1909-paper by Carathéodory on formalization of the classical thermodynamics where horizontal curves roughly correspond to adiabatic processes. In fact, the above proof may be performed in the language of Carnot (cycles) and for this reason the metrics $dist_H$ were christened ‘Carnot-Carathéodory’ in Gromov-Lafontaine-Pansu book” [82]. When I asked Pierre Pansu about this, he answered that “The section 4 of [76], entitled Hilfsatz aus der Theorie des Pfaffschen Gleichungen (Lemma from the theory of Pfaffian equations) opens with a statement relating to the differential 1-forms. 
Carathéodory says, If a Pfaffian equation $dx_0 + X_1 dx_1 + X_2 dx_2 + \dots + X_n dx_n = 0$ is given, in which the $X_i$ are finite, continuous, differentiable functions of the $x_i$, and one knows that in any neighborhood of an arbitrary point P of the space of $x_i$ there is a point that one cannot reach along a curve that satisfies this equation, then the expression must necessarily possess a multiplier that makes it into a complete differential”. This is confirmed in the introduction of his paper [76], where Carathéodory said “Finally, in order to be able to treat systems with arbitrarily many degrees of freedom from the outset, instead of the Carnot cycle that is almost always used, but is intuitive and easy to control only for systems with two degrees of freedom, one must employ a theorem from the theory of Pfaffian differential equations, for which a simple proof is given in the fourth section”.
We must also refer to Henri Poincaré [83], who published the paper “On attempts of mechanical explanation for the principles of thermodynamics (Sur les tentatives d’explication mécanique des principes de la thermodynamique)” in the Comptes rendus de l’Académie des sciences in 1889 [84], in which he tried to consolidate the links between the principles of mechanics and thermomechanics. These elements were also developed in Poincaré’s lecture of 1892 [85] on “thermodynamique”, in Chapter XVII “Reduction of thermodynamics principles to the general principles of mechanics (Réduction des principes de la Thermodynamique aux principes généraux de la mécanique)”. Poincaré writes in his book [85] “It is otherwise with the second law of thermodynamics. Clausius was the first to attempt to bring it to the principles of mechanics, but did not succeed satisfactorily. Helmholtz in his memoir on the principle of least action developed a theory much more perfect than that of Clausius. However, it cannot account for irreversible phenomena. (Il en est autrement du second principe de la thermodynamique. Clausius, a le premier, tenté de le ramener aux principes de la Mécanique, mais sans y réussir d’une manière satisfaisante. Helmoltz dans son mémoire sur le principe de la moindre action, a développé une théorie beaucoup plus parfaite que celle de Clausius; cependant elle ne peut rendre compte des phénomènes irréversibles.)”. About Helmholtz’s work, Poincaré observes [85] “It follows from these examples that the Helmholtz hypothesis is true in the case of body turning around an axis; So it seems applicable to vortex motions of molecules (Il résulte de ces exemples que l’hypothèse d’Helmoltz est exacte dans le cas de corps tournant autour d’un axe; elle parait donc applicable aux mouvements tourbillonnaires des molecules.)”, but he adds afterwards that the Helmholtz model is also true in the case of vibrating motions such as molecular motions. 
However, he finally observes that the Helmoltz model cannot explain the increasing of entropy and concludes [85] “All attempts of this nature must be abandoned; the only ones that have any chance of success are those based on the intervention of statistical laws, for example, the kinetic theory of gases. This view, which I cannot develop here, can be summed up in a somewhat vulgar way as follows: Suppose we want to place a grain of oats in the middle of a heap of wheat; it will be easy; then suppose we wanted to find it and remove it; we cannot achieve it. All irreversible phenomena, according to some physicists, would be built on this model (Toutes les tentatives de cette nature doivent donc être abandonnées; les seules qui aient quelque chance de succès sont celles qui sont fondées sur l’intervention des lois statistiques comme, par exemple, la théorie cinétique des gaz. Ce point de vue, que je ne puis développer ici, peut se résumer d’une façon un peu vulgaire comme il suit: Supposons que nous voulions placer un grain d’avoine au milieu d’un tas de blé; cela sera facile; supposons que nous voulions ensuite l’y retrouver et l’en retirer; nous ne pourrons y parvenir. Tous les phénomènes irréversibles, d’après certains physiciens, seraient construits sur ce modèle)”. In Poincaré’s lecture, Massieu has greatly influenced Poincaré to introduce Massieu characteristic function in probability [86]. As we have observed, Poincaré has introduced characteristic function in probability lecture after his lecture on thermodynamics where he discovered in its second edition [85], the Massieu’s characteristic function. We can read that “Since from the functions of Mr. Massieu one can deduce other functions of variables, all equations of thermodynamics can be written so as to only contain these functions and their derivatives; it will thus result in some cases, a great simplification (Puisque des fonctions de M. 
Massieu on peut déduire les autres fonctions des variables, toutes les équations de la Thermodynamique pourront s’écrire de manière à ne plus renfermer que ces fonctions et leurs dérivées; il en résultera donc, dans certains cas, une grande simplification).” [85]. He [85] added “MM. Gibbs, von Helmholtz and Duhem have used this function H = U − TS assuming that T and V are constant. Mr. von Helmholtz has called it ‘free energy’ and also proposes to give it the name of ‘kinetic potential’; Duhem called it ‘the thermodynamic potential at constant volume’; this is the most justified naming (MM. Gibbs, von Helmoltz, Duhem ont fait usage de cette fonction H = TS − U en y supposant T et V constants. M. von Helmotz l’a appelée énergie libre et a proposé également de lui donner le nom de potentiel kinetique; M. Duhem la nomme potentiel thermodynamique à volume constant; c’est la dénomination la plus justifiée)”. In 1906, Henri Poincaré also published a note [87], “Reflections on the kinetic theory of gases” (Réflexions sur la théorie cinétique des gaz), where he said that: “The kinetic theory of gases leaves awkward points for those who are accustomed to mathematical rigor … One of the points which embarrassed me most was the following one: it is a question of demonstrating that the entropy keeps decreasing, but the reasoning of Gibbs seems to suppose that having made vary the outside conditions we wait that the regime is established before making them vary again. Is this supposition essential, or in other words, we could arrive at opposite results to the principle of Carnot by making vary the outside conditions too fast so that the permanent regime has time to become established?”.
Jean-Marie Souriau elaborated a disruptive and innovative “théorie géométrique de la chaleur (geometric theory of heat)” [88] after the works of his predecessors, as illustrated in Figure 4: “théorie analytique de la chaleur (analytic theory of heat)” by Jean Baptiste Joseph Fourier [88], “théorie mécanique de la chaleur (mechanical theory of heat)” by Rudolf Clausius [89] and François Massieu, and “théorie mathématique de la chaleur (mathematical theory of heat)” by Siméon-Denis Poisson [90,91].

3. Revisited Souriau Symplectic Model of Statistical Physics

In this Section, we will revisit the Souriau model of thermodynamics but with modern notations, replacing personal Souriau conventions used in his book of 1970 by more classical ones.
In 1970, Souriau introduced the concept of co-adjoint action of a group on its momentum space (or “moment map”: mapping induced by symplectic manifold symmetries), based on the orbit method, which allows physical observables like energy, heat and momentum or moment to be defined as pure geometrical objects (the moment map takes its values in a space determined by the group of symmetries: the dual space of its Lie algebra). The moment(um) map is a constant of the motion and is associated with symplectic cohomology (assignment of algebraic invariants to a topological space that arises from the algebraic dualization of the homology construction). Souriau introduced the moment map in 1965 in lecture notes at Marseille University and published it in 1966. He gave its formal definition, and its name based on its physical interpretation, in 1967. Souriau then studied its equivariance properties and formulated the coadjoint orbit theorem in his book in 1970. However, in Chapter IV of that book, Souriau also observed that Gibbs equilibrium states are not covariant under dynamical groups (Galileo or Poincaré groups), and he then developed a covariant model that he called “Lie group thermodynamics”, where equilibria are indexed by a “geometric (Planck) temperature”, given by a vector β that lies in the Lie algebra of the dynamical group. For Souriau, all the details of classical mechanics appear as geometric necessities (e.g., mass is the measure of the symplectic cohomology of the action of a Galileo group). Based on this new covariant model of thermodynamic Gibbs equilibrium, Souriau formulated statistical mechanics and thermodynamics in the framework of symplectic geometry by use of symplectic moments and distribution-tensor concepts, giving a geometric status to temperature, heat and entropy.
There is a controversy about the name “momentum map” or “moment map”. Smale [92] referred to this map as the “angular momentum”, while Souriau used the French word “moment”. Cushman and Duistermaat [93] have suggested that the proper English translation of Souriau’s French word was “momentum” which fit better with standard usage in mechanics. On the other hand, Guillemin and Sternberg [94] have validated the name given by Souriau and have used “moment” in English. In this paper, we will see that name “moment” given by Souriau was the most appropriate word. In his Chapter IV of his book [10], studying statistical mechanics, Souriau [10] has ingeniously observed that moments of inertia in mechanics are equivalent to moments in probability in his new geometric model of statistical physics. We will see that in Souriau Lie group thermodynamic model, these statistical moments will be given by the energy and the heat defined geometrically by Souriau, and will be associated with “moment map” in dual Lie algebra.
This work has been extended by Claude Vallée [5,6] and Gery de Saxcé [4,8,95,96]. More recently, Kapranov has also given a thermodynamical interpretation of the moment map for toric varieties [97] and Pavlov, thermodynamics from the differential geometry standpoint [98].
The conservation of the moment of a Hamiltonian action was called by Souriau the “symplectic or geometric Noether theorem”. Taking the phase space to be a symplectic manifold (the cotangent bundle of the configuration space with its canonical symplectic form), if the Hamiltonian is invariant under the action of the Lie algebra, then the moment map is constant along the integral curves of the system. The Noether theorem is obtained by considering each component of the moment map independently.
In a first step to establish new foundations of thermodynamics, Souriau [10] defined a Gibbs canonical ensemble on a symplectic manifold M for a Lie group action on M. In classical statistical mechanics, a state is given by the solution of the Liouville equation on the phase space, the partition function. As symplectic manifolds have a completely continuous measure, invariant by diffeomorphisms, the Liouville measure λ, all statistical states will be the product of the Liouville measure by the scalar function given by the generalized partition function $e^{\Phi(\beta) - \langle \beta, U(\xi) \rangle}$, defined by the energy $U$ (defined in the dual of the Lie algebra of this dynamical group) and the geometric temperature $\beta$, where $\Phi$ is a normalizing constant such that the total probability mass is equal to 1, $\Phi(\beta) = -\log \int_M e^{-\langle \beta, U(\xi) \rangle} d\lambda$ [99]. Jean-Marie Souriau then generalizes the Gibbs equilibrium state to all symplectic manifolds that have a dynamical group. To ensure that all the integrals to be defined converge, the canonical Gibbs ensemble is taken to be the largest open proper subset (in the Lie algebra) where these integrals are convergent. This canonical Gibbs ensemble is convex. The derivative of $\Phi$, $Q = \frac{\partial \Phi}{\partial \beta}$ (thermodynamic heat), is equal to the mean value of the energy $U$. Minus the derivative of this generalized heat $Q$, $K = -\frac{\partial Q}{\partial \beta}$, is symmetric and positive (this is a geometric heat capacity). Entropy $s$ is then defined by the Legendre transform of $\Phi$, $s = \langle \beta, Q \rangle - \Phi$. If this approach is applied to the group of time translations, it gives the classical thermodynamics theory. However, Souriau [10] observed that if we apply this theory to a non-commutative group (Galileo or Poincaré groups), the symmetry is broken. Classical Gibbs equilibrium states are no longer invariant under this group. This symmetry breaking provides new equations, discovered by Souriau [10].
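The definitions above can be checked numerically on a toy case. The following is a minimal sketch, not Souriau's construction: it assumes a finite set of hypothetical energy levels `ENERGIES`, a scalar temperature `beta` (time-translation group only), and the counting measure in place of the Liouville measure. Under those assumptions it compares $\partial \Phi / \partial \beta$ with the mean energy $Q$, and the Legendre transform $s = \langle \beta, Q \rangle - \Phi$ with the Shannon entropy of the Gibbs density.

```python
import math

# Hypothetical energy levels U(xi) of a finite system; the counting measure
# stands in for the Liouville measure (an assumption of this sketch).
ENERGIES = [0.0, 1.0, 2.5]


def massieu_potential(beta):
    """Phi(beta) = -log sum_i exp(-beta * U_i)."""
    return -math.log(sum(math.exp(-beta * u) for u in ENERGIES))


def gibbs_density(beta):
    """p_i = exp(Phi(beta) - beta * U_i), which sums to 1 by construction."""
    phi = massieu_potential(beta)
    return [math.exp(phi - beta * u) for u in ENERGIES]


def heat(beta):
    """Q = mean value of the energy under the Gibbs density."""
    return sum(p * u for p, u in zip(gibbs_density(beta), ENERGIES))


def entropy_legendre(beta):
    """s = <beta, Q> - Phi, the Legendre transform of Phi."""
    return beta * heat(beta) - massieu_potential(beta)


def entropy_shannon(beta):
    """-sum_i p_i log p_i, for comparison with the Legendre transform."""
    return -sum(p * math.log(p) for p in gibbs_density(beta))


beta = 0.7
step = 1e-5
# Central finite difference for dPhi/dbeta, to compare with the mean energy Q.
dphi_dbeta = (massieu_potential(beta + step) - massieu_potential(beta - step)) / (2 * step)
# Minus the finite-difference derivative of Q: the geometric heat capacity K.
capacity = -(heat(beta + step) - heat(beta - step)) / (2 * step)

print(dphi_dbeta, heat(beta))
print(entropy_legendre(beta), entropy_shannon(beta))
print(capacity)
```

In this scalar setting the two entropies agree to rounding error and `capacity` is positive, which is the one-dimensional shadow of the statement that $K$ is symmetric and positive.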
In his paper we can read this prophetic sentence: “This Lie group thermodynamics could be also of first interest for mathematics (Peut-être cette Thermodynamique des groups de Lie a-t-elle un intérêt mathématique)” [30]. He explains that for the dynamical Galileo group with only one axis of rotation, this thermodynamic theory is the theory of the centrifuge, where the temperature vector has dimension 2 (sub-group of invariance of size 2), used to make “uranium 235” and “ribonucleic acid” [30]. The physical meaning of these two dimensions for the vector-valued temperature is “thermal conduction” and “viscosity”. Souriau said that the model unifies “heat conduction” and “viscosity” (Fourier and Navier equations) in the same theory of irreversible processes. Souriau applied this theory in detail to the relativistic ideal gas, with the Poincaré group as the dynamical group.
Before introducing the Souriau Model of Lie group thermodynamics, we will first remind readers of the classical notation of Lie group theory in their application to Lie group thermodynamics:
  • The coadjoint representation of $G$ is the contragredient of the adjoint representation. It associates to each $g \in G$ the linear isomorphism $Ad_g^* \in GL(\mathfrak{g}^*)$, which satisfies, for each $\xi \in \mathfrak{g}^*$ and $X \in \mathfrak{g}$:
    $$\langle Ad_{g^{-1}}^*(\xi), X \rangle = \langle \xi, Ad_{g^{-1}}(X) \rangle$$
  • The adjoint representation of the Lie algebra $\mathfrak{g}$ is the linear representation of $\mathfrak{g}$ into itself which associates, to each $X \in \mathfrak{g}$, the linear map $ad_X \in gl(\mathfrak{g})$. $ad$ is the tangent map of $Ad$ at the neutral element $e$ of $G$:
    $$ad = T_e Ad : T_e G \to End(T_e G), \quad X, Y \in T_e G \mapsto ad_X(Y) = [X, Y]$$
  • The coadjoint representation of the Lie algebra $\mathfrak{g}$ is the contragredient of the adjoint representation. It associates, to each $X \in \mathfrak{g}$, the linear map $ad_X^* \in gl(\mathfrak{g}^*)$ which satisfies, for each $\xi \in \mathfrak{g}^*$ and $Y \in \mathfrak{g}$:
    $$\langle ad_X^*(\xi), Y \rangle = \langle \xi, -ad_X(Y) \rangle$$
    We can illustrate this for a group of matrices, $G = GL_n(K)$ with $K = \mathbb{R}$ or $\mathbb{C}$.
    $$T_e G = M_n(K), \quad X \in M_n(K), \ g \in G \ \Rightarrow \ Ad_g(X) = g X g^{-1}$$
    $$X, Y \in M_n(K) \ \Rightarrow \ ad_X(Y) = (T_e Ad)_X(Y) = XY - YX = [X, Y]$$
    Then, the curve from $e = Id = c(0)$ tangent to $X = \dot{c}(0)$ is given by $c(t) = \exp(tX)$ and is transformed by $Ad$ into $\gamma(t) = Ad_{\exp(tX)}$:
    $$ad_X(Y) = (T_e Ad)_X(Y) = \left. \frac{d}{dt} \gamma(t) Y \right|_{t=0} = \left. \frac{d}{dt} \exp(tX) \, Y \exp(tX)^{-1} \right|_{t=0} = XY - YX$$
    For each temperature $\beta$, element of the Lie algebra $\mathfrak{g}$, Souriau has introduced a tensor $\tilde{\Theta}_\beta$, equal to the sum of the cocycle $\tilde{\Theta}$ and the heat coboundary (with $[.,.]$ the Lie bracket):
    $$\tilde{\Theta}_\beta(Z_1, Z_2) = \tilde{\Theta}(Z_1, Z_2) + \langle Q, ad_{Z_1}(Z_2) \rangle \quad \text{with} \quad ad_{Z_1}(Z_2) = [Z_1, Z_2]$$
    This tensor $\tilde{\Theta}_\beta$ has the following properties:
  • $\tilde{\Theta}(X, Y) = \langle \Theta(X), Y \rangle$, where the map $\Theta$ is the one-cocycle of the Lie algebra $\mathfrak{g}$ with values in $\mathfrak{g}^*$, with $\Theta(X) = T_e \theta(X(e))$, where $\theta$ is the one-cocycle of the Lie group $G$. $\tilde{\Theta}(X, Y)$ is constant on $M$ and the map $\tilde{\Theta}(X, Y) : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ is a skew-symmetric bilinear form, called the symplectic cocycle of the Lie algebra $\mathfrak{g}$ associated to the moment map $J$, with the following properties:
    $$\tilde{\Theta}(X, Y) = J_{[X, Y]} - \{ J_X, J_Y \} \quad \text{with } \{.,.\} \text{ the Poisson bracket and } J \text{ the moment map}$$
    $$\tilde{\Theta}([X, Y], Z) + \tilde{\Theta}([Y, Z], X) + \tilde{\Theta}([Z, X], Y) = 0$$
    where $J_X$ is the linear map from $\mathfrak{g}$ to differentiable functions on $M$, $\mathfrak{g} \to C^\infty(M, \mathbb{R})$, $X \mapsto J_X$, and the associated differentiable map $J$ is called the moment(um) map:
    $$J : M \to \mathfrak{g}^*, \ x \mapsto J(x) \quad \text{such that} \quad J_X(x) = \langle J(x), X \rangle, \ X \in \mathfrak{g}$$
    If instead of $J$ we take the following moment map: $J'(x) = J(x) + Q$, $x \in M$,
    where $Q \in \mathfrak{g}^*$ is constant, the symplectic cocycle $\theta$ is replaced by $\theta'(g) = \theta(g) + Q - Ad_g^* Q$,
    where $\theta' - \theta = Q - Ad_g^* Q$ is a one-coboundary of $G$ with values in $\mathfrak{g}^*$. We also have the properties $\theta(g_1 g_2) = Ad_{g_1}^* \theta(g_2) + \theta(g_1)$ and $\theta(e) = 0$.
  • The geometric temperature, element of the algebra $\mathfrak{g}$, is in the kernel of the tensor $\tilde{\Theta}_\beta$:
    $$\beta \in Ker \, \tilde{\Theta}_\beta, \quad \text{such that} \quad \tilde{\Theta}_\beta(\beta, \beta) = 0, \ \forall \beta \in \mathfrak{g}$$
  • The following symmetric tensor $g_\beta$, defined on all values of $ad_\beta(.) = [\beta, .]$, is positive definite:
    $$g_\beta([\beta, Z_1], [\beta, Z_2]) = \tilde{\Theta}_\beta(Z_1, [\beta, Z_2])$$
    $$g_\beta([\beta, Z_1], Z_2) = \tilde{\Theta}_\beta(Z_1, Z_2), \quad \forall Z_1 \in \mathfrak{g}, \ \forall Z_2 \in Im(ad_\beta(.))$$
    $$g_\beta(Z_1, Z_2) \geq 0, \quad \forall Z_1, Z_2 \in Im(ad_\beta(.))$$
    where the linear map $ad_X \in gl(\mathfrak{g})$ is the adjoint representation of the Lie algebra $\mathfrak{g}$, defined by $X, Y \in \mathfrak{g} \ (= T_e G) \mapsto ad_X(Y) = [X, Y]$, and the co-adjoint representation of the Lie algebra $\mathfrak{g}$ is the linear map $ad_X^* \in gl(\mathfrak{g}^*)$ which satisfies, for each $\xi \in \mathfrak{g}^*$ and $X, Y \in \mathfrak{g}$: $\langle ad_X^*(\xi), Y \rangle = \langle \xi, -ad_X(Y) \rangle$
    These equations are universal, because they are not dependent on the symplectic manifold but only on the dynamical group G, the symplectic cocycle Θ , the temperature β and the heat Q . Souriau called this model “Lie groups thermodynamics”.
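The matrix-group illustration given in the list above (where $Ad_g(X) = g X g^{-1}$ and $ad_X(Y) = [X, Y]$) can be checked numerically. The sketch below is only an illustration under stated assumptions: the $2 \times 2$ matrices `X` and `Y` are arbitrary test values, `mat_exp` is a truncated power series written for this example rather than a library routine, and the derivative at $t = 0$ is approximated by a central finite difference.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_comb(a, b, coeff):
    """Return a + coeff * b entrywise."""
    return [[x + coeff * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def mat_scale(a, coeff):
    """Return coeff * a entrywise."""
    return [[coeff * x for x in row] for row in a]


def mat_exp(a, terms=30):
    """Truncated power series for exp(a); adequate for small matrices."""
    size = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    result, term = identity, identity
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / k)
        result = mat_comb(result, term, 1.0)
    return result


# Arbitrary test elements of M_2(R), chosen only for illustration.
X = [[0.0, 1.0], [-2.0, 0.3]]
Y = [[0.5, -1.0], [0.2, 0.1]]


def conjugate_along_curve(t):
    """gamma(t) Y = exp(tX) Y exp(tX)^{-1}, using exp(tX)^{-1} = exp(-tX)."""
    return mat_mul(mat_mul(mat_exp(mat_scale(X, t)), Y), mat_exp(mat_scale(X, -t)))


t = 1e-5
# Central finite difference of gamma(t) Y at t = 0.
derivative = mat_scale(
    mat_comb(conjugate_along_curve(t), conjugate_along_curve(-t), -1.0), 1.0 / (2 * t))
# The Lie bracket [X, Y] = XY - YX.
commutator = mat_comb(mat_mul(X, Y), mat_mul(Y, X), -1.0)

max_error = max(abs(d - c)
                for row_d, row_c in zip(derivative, commutator)
                for d, c in zip(row_d, row_c))
print(max_error)
```

The printed discrepancy is at the level of the finite-difference error, consistent with $ad = T_e Ad$.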
We will give the main theorem of Souriau for this “Lie group thermodynamics”:
Theorem 1 (Souriau Theorem of Lie Group Thermodynamics).
Let $\Omega$ be the largest open proper subset of $\mathfrak{g}$, Lie algebra of $G$, such that $\int_M e^{-\langle \beta, U(\xi) \rangle} d\lambda$ and $\int_M \xi \cdot e^{-\langle \beta, U(\xi) \rangle} d\lambda$ are convergent integrals. This set $\Omega$ is convex and is invariant under every transformation $Ad_g(.)$, where $g \mapsto Ad_g(.)$ is the adjoint representation of $G$, such that $Ad_g = T_e i_g$ with $i_g : h \mapsto g h g^{-1}$. Let $a : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ be the unique affine action whose linear part is the coadjoint representation of $G$, that is, the contragredient of the adjoint representation. It associates to each $g \in G$ the linear isomorphism $Ad_g^* \in GL(\mathfrak{g}^*)$, satisfying, for each $\xi \in \mathfrak{g}^*$ and $X \in \mathfrak{g}$:
$$\langle Ad_g^*(\xi), X \rangle = \langle \xi, Ad_{g^{-1}}(X) \rangle.$$
Then, the fundamental equations of Lie group thermodynamics are given by the action of the group:
  • Action of Lie group on Lie algebra:
    $$\beta \to Ad_g(\beta)$$
  • Transformation of characteristic function after action of Lie group:
    $$\Phi \to \Phi - \langle \theta(g^{-1}), \beta \rangle$$
  • Invariance of entropy with respect to action of Lie group:
    $$s \to s$$
  • Action of Lie group on geometric heat, element of dual Lie algebra:
    $$Q \to a(g, Q) = Ad_g^*(Q) + \theta(g)$$
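The affine action on the geometric heat stated in the theorem can be exercised on a small matrix example. This is a sketch under explicit assumptions, not Souriau's general setting: the group is $GL_2(\mathbb{R})$, the dual of the Lie algebra is identified with $2 \times 2$ matrices through the trace pairing $\langle \xi, X \rangle = \mathrm{tr}(\xi X)$ (so that $Ad_g^*(\xi) = g \xi g^{-1}$ under the convention of the theorem), and the cocycle is the hypothetical coboundary $\theta(g) = Q_0 - Ad_g^* Q_0$ built from an arbitrary `REFERENCE_HEAT`. It checks the defining pairing identity, the cocycle identity, and that $a(g, Q) = Ad_g^*(Q) + \theta(g)$ is a group action.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_comb(a, b, coeff=1.0):
    """Return a + coeff * b entrywise."""
    return [[x + coeff * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def inverse_2x2(g):
    """Closed-form inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def pairing(xi, x):
    """Trace pairing <xi, X> = tr(xi X) (an identification assumed here)."""
    return sum(xi[i][k] * x[k][i] for i in range(2) for k in range(2))


def Ad(g, x):
    """Adjoint action Ad_g(X) = g X g^{-1}."""
    return mat_mul(mat_mul(g, x), inverse_2x2(g))


def coAd(g, xi):
    """Coadjoint action under the trace pairing: Ad*_g(xi) = g xi g^{-1}."""
    return mat_mul(mat_mul(g, xi), inverse_2x2(g))


# Hypothetical constant element of the dual Lie algebra defining a coboundary.
REFERENCE_HEAT = [[0.3, -1.0], [0.7, 0.2]]


def theta(g):
    """Coboundary cocycle theta(g) = Q0 - Ad*_g(Q0)."""
    return mat_comb(REFERENCE_HEAT, coAd(g, REFERENCE_HEAT), -1.0)


def affine_action(g, q):
    """a(g, Q) = Ad*_g(Q) + theta(g)."""
    return mat_comb(coAd(g, q), theta(g))


def max_abs_diff(a, b):
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Arbitrary invertible test matrices and test vectors.
g1 = [[1.0, 0.5], [0.2, 1.5]]
g2 = [[0.8, -0.3], [0.4, 1.1]]
xi = [[0.1, 0.4], [-0.6, 0.9]]
x = [[1.2, -0.7], [0.3, 0.5]]
q = [[0.5, 0.1], [-0.2, 1.0]]

# <Ad*_g(xi), X> = <xi, Ad_{g^{-1}}(X)>
pairing_gap = abs(pairing(coAd(g1, xi), x) - pairing(xi, Ad(inverse_2x2(g1), x)))
# theta(g1 g2) = Ad*_{g1} theta(g2) + theta(g1)
cocycle_gap = max_abs_diff(theta(mat_mul(g1, g2)),
                           mat_comb(coAd(g1, theta(g2)), theta(g1)))
# a(g1, a(g2, Q)) = a(g1 g2, Q)
action_gap = max_abs_diff(affine_action(g1, affine_action(g2, q)),
                          affine_action(mat_mul(g1, g2), q))
print(pairing_gap, cocycle_gap, action_gap)
```

A coboundary is the simplest cocycle to write down; a cocycle with non-trivial cohomology class (as for the Galileo group) would need a different `theta`, which this sketch does not attempt.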
Souriau equations of Lie group thermodynamics are summarized in the following Figure 5 and Figure 6:
For Hamiltonian actions of a Lie group on a connected symplectic manifold, the equivariance of the moment map with respect to an affine action of the group on the dual of its Lie algebra has been studied by Marle and Libermann [100] and Lichnerowicz [101,102]:
Theorem 2 (Marle Theorem on Cocycles).
Let $G$ be a connected and simply connected Lie group, $R : G \to GL(E)$ be a linear representation of $G$ in a finite-dimensional vector space $E$, and $r : \mathfrak{g} \to gl(E)$ be the associated linear representation of its Lie algebra $\mathfrak{g}$. For any one-cocycle $\Theta : \mathfrak{g} \to E$ of the Lie algebra $\mathfrak{g}$ for the linear representation $r$, there exists a unique one-cocycle $\theta : G \to E$ of the Lie group $G$ for the linear representation $R$ such that $\Theta(X) = T_e \theta(X(e))$, which has $\Theta$ as associated Lie algebra one-cocycle. The Lie group one-cocycle $\theta$ is a Lie group one-coboundary if and only if the Lie algebra one-cocycle $\Theta$ is a Lie algebra one-coboundary.
Let $G$ be a Lie group whose Lie algebra is $\mathfrak{g}$. The skew-symmetric bilinear form $\tilde{\Theta}$ on $\mathfrak{g} = T_e G$ can be extended into a closed differential two-form on $G$, since the identity on $\tilde{\Theta}$ means that its exterior differential $d\tilde{\Theta}$ vanishes. In other words, $\tilde{\Theta}$ is a 2-cocycle for the restriction of the de Rham cohomology of $G$ to left invariant differential forms. In the framework of a Lie group action on a symplectic manifold, the equivariance of the moment can be studied to prove that there is a unique action $a(.,.)$ of the Lie group $G$ on the dual $\mathfrak{g}^*$ of its Lie algebra for which the moment map $J$ is equivariant, which means that for each $x \in M$:
$$J(\Phi_g(x)) = a(g, J(x)) = Ad_g^*(J(x)) + \theta(g)$$
where $\Phi : G \times M \to M$ is an action of the Lie group $G$ on the differentiable manifold $M$. The fundamental field associated to an element $X$ of the Lie algebra $\mathfrak{g}$ of the group $G$ is the vector field $X_M$ on $M$:
$$X_M(x) = \left. \frac{d}{dt} \Phi_{\exp(tX)}(x) \right|_{t=0}$$
with $\Phi_{g_1}(\Phi_{g_2}(x)) = \Phi_{g_1 g_2}(x)$ and $\Phi_e(x) = x$. $\Phi$ is Hamiltonian on a symplectic manifold $M$ if $\Phi$ is symplectic and if, for all $X \in \mathfrak{g}$, the fundamental field $X_M$ is globally Hamiltonian. The cohomology class of the symplectic cocycle $\theta$ only depends on the Hamiltonian action $\Phi$, and not on $J$.
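The last statement can be made explicit with a short worked step, using only the equivariance relation above and the linearity of $Ad_g^*$. Suppose the moment map is shifted by a constant $Q \in \mathfrak{g}^*$, $J'(x) = J(x) + Q$. Then:

$$J'(\Phi_g(x)) = J(\Phi_g(x)) + Q = Ad_g^*(J(x)) + \theta(g) + Q = Ad_g^*(J'(x)) + \left[ \theta(g) + Q - Ad_g^* Q \right]$$

so $J'$ is equivariant for the cocycle $\theta'(g) = \theta(g) + Q - Ad_g^* Q$. The difference $\theta' - \theta = Q - Ad_g^* Q$ is a one-coboundary, hence $\theta$ and $\theta'$ define the same cohomology class, which is why the class depends on the action $\Phi$ and not on the choice of $J$.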
In Appendix B, we observe that Souriau Lie group thermodynamics is compatible with Balian gauge theory of thermodynamics [103], that is obtained by symplectization in dimension 2n + 2 of contact manifold in dimension 2n + 1. All elements of the Souriau geometric temperature vector are multiplied by the same gauge parameter.
We conclude this section by this Bourbakiste citation of Jean-Marie Souriau [34]:
It is obvious that one can only define average values on objects belonging to a vector (or affine) space; Therefore—so this assertion may seem Bourbakist—that we will observe and measure average values only as quantity belonging to a set having physically an affine structure. It is clear that this structure is necessarily unique—if not the average values would not be well defined. (Il est évident que l’on ne peut définir de valeurs moyennes que sur des objets appartenant à un espace vectoriel (ou affine); donc—si bourbakiste que puisse sembler cette affirmation—que l’on n’observera et ne mesurera de valeurs moyennes que sur des grandeurs appartenant à un ensemble possédant physiquement une structure affine. Il est clair que cette structure est nécessairement unique—sinon les valeurs moyennes ne seraient pas bien définies.).

4. The Souriau-Fisher Metric as Geometric Heat Capacity of Lie Group Thermodynamics

We observe that the Souriau Riemannian metric, introduced with the symplectic cocycle, is a generalization of the Fisher metric, which we call the Souriau-Fisher metric, and that it preserves the property of being defined as a Hessian of the logarithm of the partition function, $g_\beta = -\frac{\partial^2 \Phi}{\partial \beta^2} = \frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$, as in classical information geometry. We will establish the equality of two terms: the Souriau definition, based on the Lie group cocycle $\Theta$ and parameterized by the “geometric heat” $Q$ (element of the dual Lie algebra) and the “geometric temperature” $\beta$ (element of the Lie algebra), and the Hessian of the characteristic function $\Phi(\beta) = -\log \psi_\Omega(\beta)$ with respect to the variable $\beta$:
$$g_\beta([\beta, Z_1], [\beta, Z_2]) = \langle \Theta(Z_1), [\beta, Z_2] \rangle + \langle Q, [Z_1, [\beta, Z_2]] \rangle = \frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$$
If we differentiate the relation $Q(Ad_g(\beta)) = Ad_g^*(Q) + \theta(g)$ of the Souriau theorem, we obtain:
$$\frac{\partial Q}{\partial \beta}([Z_1, \beta], .) = \tilde{\Theta}(Z_1, [\beta, .]) + \langle Q, ad_{Z_1}([\beta, .]) \rangle = \tilde{\Theta}_\beta(Z_1, [\beta, .])$$
$$\frac{\partial Q}{\partial \beta}([Z_1, \beta], Z_2) = \tilde{\Theta}(Z_1, [\beta, Z_2]) + \langle Q, ad_{Z_1}([\beta, Z_2]) \rangle = \tilde{\Theta}_\beta(Z_1, [\beta, Z_2])$$
$$\Rightarrow -\frac{\partial Q}{\partial \beta} = g_\beta([\beta, Z_1], [\beta, Z_2])$$
As the entropy is defined by the Legendre transform of the characteristic function, this Souriau-Fisher metric is also equal to the inverse of the Hessian of the “geometric entropy” $s(Q)$ with respect to the variable $Q$: $\frac{\partial^2 s(Q)}{\partial Q^2}$
For the maximum entropy density (Gibbs density), the following three terms coincide: $\frac{\partial^2 \log \psi_\Omega}{\partial \beta^2}$, which describes the convexity of the log-likelihood function; $I(\beta) = -E\left[ \frac{\partial^2 \log p_\beta(\xi)}{\partial \beta^2} \right]$, the Fisher metric, which describes the covariance of the log-likelihood gradient; and $I(\beta) = E\left[ (\xi - Q)(\xi - Q)^T \right] = Var(\xi)$, which describes the covariance of the observables.
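The coincidence of these three quantities can be observed numerically on a one-parameter Gibbs density. The following is a minimal sketch under assumptions that are not in the text: a finite set of hypothetical observable values `ENERGIES`, a scalar `beta`, the counting measure as reference measure, and second derivatives approximated by central finite differences (`second_difference` is a helper written for this example).

```python
import math

# Hypothetical values of the observable U(xi) on a finite sample space.
ENERGIES = [0.0, 0.4, 1.3, 2.0]


def log_psi(beta):
    """log psi(beta) = log sum_i exp(-beta * U_i)."""
    return math.log(sum(math.exp(-beta * u) for u in ENERGIES))


def log_density(beta, u):
    """log p_beta(u) = -beta * u - log psi(beta) (Gibbs density)."""
    return -beta * u - log_psi(beta)


def second_difference(func, point, step=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (func(point + step) - 2.0 * func(point) + func(point - step)) / step ** 2


beta = 0.9
probabilities = [math.exp(log_density(beta, u)) for u in ENERGIES]

# 1) Hessian of the logarithm of the partition function.
hessian_log_psi = second_difference(log_psi, beta)

# 2) Fisher information as minus the expected Hessian of the log-density.
fisher = -sum(p * second_difference(lambda b, u=u: log_density(b, u), beta)
              for p, u in zip(probabilities, ENERGIES))

# 3) Variance of the observable under the Gibbs density.
mean = sum(p * u for p, u in zip(probabilities, ENERGIES))
variance = sum(p * (u - mean) ** 2 for p, u in zip(probabilities, ENERGIES))

print(hessian_log_psi, fisher, variance)
```

The three printed numbers agree up to the finite-difference error, which is the scalar version of the identity stated for the Gibbs density.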
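We can also observe that the Fisher metric $I(\beta) = -\frac{\partial Q}{\partial \beta}$ is exactly the Souriau metric defined through the symplectic cocycle:
$$I(\beta) = \tilde{\Theta}_\beta(Z_1, [\beta, Z_2]) = g_\beta([\beta, Z_1], [\beta, Z_2])$$
The Fisher metric $I(\beta) = -\frac{\partial^2 \Phi(\beta)}{\partial \beta^2} = -\frac{\partial Q}{\partial \beta}$ has been considered by Souriau as a generalization of “heat capacity”. Souriau called it $K$, the “geometric capacity”.
For $\beta = \frac{1}{kT}$, $K = -\frac{\partial Q}{\partial \beta} = -\frac{\partial Q}{\partial T} \left( \frac{\partial (1/kT)}{\partial T} \right)^{-1} = kT^2 \frac{\partial Q}{\partial T}$, linking the geometric capacity to the calorific capacity. The Fisher metric can then be introduced in the Fourier heat equation (see Figure 7):
$$\frac{\partial T}{\partial t} = \frac{\kappa}{C \cdot D} \Delta T \quad \text{with} \quad \frac{\partial Q}{\partial T} = C \cdot D \quad \Rightarrow \quad \frac{\partial \beta^{-1}}{\partial t} = \kappa \left[ (\beta^2 / k) \cdot I_{Fisher}(\beta) \right]^{-1} \Delta \beta^{-1}$$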
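The relation $K = kT^2 \frac{\partial Q}{\partial T}$ between geometric and calorific capacity can be checked on a two-level system. This is a sketch under stated assumptions: the energy levels `ENERGIES` and the value `K_B = 1.0` (units in which Boltzmann's constant is one) are hypothetical choices, and $\partial Q / \partial T$ is approximated by a central finite difference.

```python
import math

K_B = 1.0               # Boltzmann constant in hypothetical units
ENERGIES = [0.0, 1.0]   # hypothetical two-level system


def gibbs_weights(beta):
    """Normalized Gibbs probabilities exp(-beta * U_i) / Z."""
    weights = [math.exp(-beta * u) for u in ENERGIES]
    total = sum(weights)
    return [w / total for w in weights]


def mean_energy(beta):
    """Q(beta): mean value of the energy."""
    return sum(p * u for p, u in zip(gibbs_weights(beta), ENERGIES))


def energy_variance(beta):
    """K(beta) = var(U): the geometric capacity in this scalar case."""
    mean = mean_energy(beta)
    return sum(p * (u - mean) ** 2 for p, u in zip(gibbs_weights(beta), ENERGIES))


def heat_at_temperature(temperature):
    """Q as a function of T through beta = 1 / (k T)."""
    return mean_energy(1.0 / (K_B * temperature))


temperature = 0.8
step = 1e-5
# Calorific capacity dQ/dT by central finite difference.
calorific_capacity = (heat_at_temperature(temperature + step)
                      - heat_at_temperature(temperature - step)) / (2 * step)
geometric_capacity = energy_variance(1.0 / (K_B * temperature))

print(geometric_capacity, K_B * temperature ** 2 * calorific_capacity)
```

Both printed values agree, which illustrates in the simplest setting how the Fisher information $I(\beta) = K$ rescales the ordinary heat capacity.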
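We can also observe that $Q$ is related to the mean, and $K$ to the variance of $U$:
$$K = I(\beta) = -\frac{\partial Q}{\partial \beta} = var(U) = \int_M U(\xi)^2 \, p_\beta(\xi) \, d\omega - \left( \int_M U(\xi) \, p_\beta(\xi) \, d\omega \right)^2$$
We observe that the entropy $s$ is unchanged, and $\Phi$ is changed but with linear dependence on $\beta$, with the consequence that the Fisher-Souriau metric is invariant:
$$s[Q(Ad_g(\beta))] = s(Q(\beta)) \quad \text{and} \quad I(Ad_g(\beta)) = -\frac{\partial^2 \left( \Phi - \langle \theta(g^{-1}), \beta \rangle \right)}{\partial \beta^2} = -\frac{\partial^2 \Phi}{\partial \beta^2} = I(\beta)$$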
We have observed that the concept of “heat capacity” is important in the Souriau model because it gives a geometric meaning to its definition. The notion of “heat capacity” has been generalized by Pierre Duhem in his general equations of thermodynamics.
Souriau [34] proposed to define a thermometer (θερμός) device principle that could measure this geometric temperature using a “relative ideal gas thermometer”, based on a theory of dynamical group thermometry, and also recovered the (geometric) Laplace barometric law.

5. Euler-Poincaré Equations and Variational Principle of Souriau Lie Group Thermodynamics

When a Lie algebra acts locally transitively on the configuration space of a Lagrangian mechanical system, Henri Poincaré proved that the Euler-Lagrange equations are equivalent to a new system of differential equations defined on the product of the configuration space with the Lie algebra. Marle has written about the Euler-Poincaré equations [104], under an intrinsic form, without any reference to a particular system of local coordinates, proving that they can be conveniently expressed in terms of the Legendre and moment maps of the lift to the cotangent bundle of the Lie algebra action on the configuration space. The Lagrangian is a smooth real valued function $L$ defined on the tangent bundle $TM$. To each parameterized continuous, piecewise smooth curve $\gamma : [t_0, t_1] \to M$, defined on a closed interval $[t_0, t_1]$, with values in $M$, one associates the value at $\gamma$ of the action integral:
$$I(\gamma) = \int_{t_0}^{t_1} L\left( \frac{d\gamma(t)}{dt} \right) dt$$
The partial differential of the function $\bar{L} : M \times \mathfrak{g} \to \mathbb{R}$ with respect to its second variable, $d_2 \bar{L}$, which plays an important part in the Euler-Poincaré equation, can be expressed in terms of the moment and Legendre maps: $d_2 \bar{L} = p_{\mathfrak{g}^*} \circ \varphi^t \circ \mathbb{L} \circ \varphi$, with $J = p_{\mathfrak{g}^*} \circ \varphi^t$ ($\Rightarrow d_2 \bar{L} = J \circ \mathbb{L} \circ \varphi$) the moment map, $p_{\mathfrak{g}^*} : M \times \mathfrak{g}^* \to \mathfrak{g}^*$ the canonical projection on the second factor, $\mathbb{L} : TM \to T^*M$ the Legendre transform, with:
$$\varphi : M \times \mathfrak{g} \to TM, \ \varphi(x, X) = X_M(x) \quad \text{and} \quad \varphi^t : T^*M \to M \times \mathfrak{g}^*, \ \varphi^t(\xi) = (\pi_M(\xi), J(\xi))$$
The Euler-Poincaré equation can therefore be written under the form:
$$\left( \frac{d}{dt} - ad_{V(t)}^* \right) \left( J \circ \mathbb{L} \circ \varphi(\gamma(t), V(t)) \right) = J \circ d_1 \bar{L}(\gamma(t), V(t)) \quad \text{with} \quad \frac{d\gamma(t)}{dt} = \varphi(\gamma(t), V(t))$$
with
$$H(\xi) = \langle \xi, \mathbb{L}^{-1}(\xi) \rangle - L(\mathbb{L}^{-1}(\xi)), \quad \xi \in T^*M, \quad \mathbb{L} : TM \to T^*M, \quad H : T^*M \to \mathbb{R}.$$
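The expression of $H$ through the Legendre transform can be illustrated on the simplest mechanical example. The sketch below is an illustration only, with assumptions that are not in the text: a one-dimensional harmonic oscillator with hypothetical constants `MASS` and `STIFFNESS`, a Legendre map computed by finite difference, and a closed-form inverse valid only for this quadratic Lagrangian.

```python
MASS = 2.0        # hypothetical mass
STIFFNESS = 3.0   # hypothetical spring constant


def lagrangian(q, v):
    """L(q, v) = m v^2 / 2 - k q^2 / 2 on the tangent bundle of the line."""
    return 0.5 * MASS * v * v - 0.5 * STIFFNESS * q * q


def legendre_map(q, v, step=1e-6):
    """(q, v) -> (q, p) with p = dL/dv, here by central finite difference."""
    p = (lagrangian(q, v + step) - lagrangian(q, v - step)) / (2 * step)
    return q, p


def inverse_legendre_map(q, p):
    """Closed-form inverse for this quadratic Lagrangian: v = p / m."""
    return q, p / MASS


def hamiltonian(q, p):
    """H(xi) = <xi, L^{-1}(xi)> - L(L^{-1}(xi)) with xi = (q, p)."""
    _, v = inverse_legendre_map(q, p)
    return p * v - lagrangian(q, v)


q0, v0 = 0.5, 0.6
_, p0 = legendre_map(q0, v0)
energy = hamiltonian(q0, p0)
# Textbook form of the oscillator energy, for comparison.
expected = p0 * p0 / (2 * MASS) + 0.5 * STIFFNESS * q0 * q0

print(p0, energy, expected)
```

The same pattern (a potential, its gradient map, and the Legendre dual) is what the later analogy $L(\dot{q}) \leftrightarrow \Phi(\beta)$, $H(p) \leftrightarrow s(Q)$ relies on.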
Following the remark made by Poincaré at the end of his note [105], the most interesting case is when the map L ¯ : M × g R only depends on its second variable X g . The Euler-Poincaré equation becomes:
( d d t a d V ( t ) * ) ( d L ¯ ( V ( t ) ) ) = 0
We can use this analogy of structure when the convex Gibbs ensemble is homogeneous [106], and then apply the Euler-Poincaré equation to Lie group thermodynamics. Considering Clairaut’s equation:
$$s(Q) = \langle \beta, Q \rangle - \Phi(\beta) = \left\langle \Theta^{-1}(Q), Q \right\rangle - \Phi\left(\Theta^{-1}(Q)\right)$$
with $Q = \Theta(\beta) = \frac{\partial \Phi}{\partial \beta} \in \mathfrak{g}^*$ and $\beta = \Theta^{-1}(Q) \in \mathfrak{g}$, a Souriau-Euler-Poincaré equation can be elaborated for Souriau Lie group thermodynamics:
$$\frac{dQ}{dt} = ad^*_\beta Q$$
or
$$\frac{d}{dt}\left(Ad^*_g Q\right) = 0.$$
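The first equation, the Euler-Poincaré equation, is a reduction of the Euler-Lagrange equations using symmetries, and especially the fact that a group acts homogeneously on the symplectic manifold:
$$\frac{dQ}{dt} = ad^*_\beta Q \quad\text{and}\quad \begin{cases} s(Q) = \langle \beta, Q \rangle - \Phi(\beta) \\ \beta = \dfrac{\partial s(Q)}{\partial Q} \in \mathfrak{g},\quad Q = \dfrac{\partial \Phi(\beta)}{\partial \beta} \in \mathfrak{g}^* \end{cases}$$
Back to the Koszul model of information geometry, we can then deduce an equivalent of the Euler-Poincaré equation for statistical models:
$$\frac{dx^*}{dt} = ad^*_x x^* \quad\text{and}\quad \begin{cases} \Phi^*(x^*) = \langle x, x^* \rangle - \Phi(x) \\ x = \dfrac{\partial \Phi^*(x^*)}{\partial x^*} \in \Omega,\quad x^* = \dfrac{\partial \Phi(x)}{\partial x} \in \Omega^* \end{cases}$$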
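We can use this Euler-Poincaré equation to deduce an associated equation on entropy: $\frac{ds}{dt} = \left\langle \frac{d\beta}{dt}, Q \right\rangle + \left\langle \beta, ad^*_\beta Q \right\rangle - \frac{d\Phi}{dt}$, which reduces to
$$\frac{ds}{dt} = \left\langle \frac{d\beta}{dt}, Q \right\rangle - \frac{d\Phi}{dt}$$
due to $\langle \xi, ad_V X \rangle = -\langle ad^*_V \xi, X \rangle \Rightarrow \langle \beta, ad^*_\beta Q \rangle = -\langle Q, ad_\beta \beta \rangle = 0$, since $ad_\beta \beta = [\beta, \beta] = 0$.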
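The vanishing of $\langle \beta, ad^*_\beta Q \rangle$ and the conservation law carried by $\frac{dQ}{dt} = ad^*_\beta Q$ can be checked numerically on a small example. The sketch below is only an illustration, not part of the original derivation. It assumes the rotation algebra $\mathfrak{so}(3)$ identified with $\mathbb{R}^3$ (bracket given by the cross product), the sign convention $\langle ad^*_V \xi, X \rangle = -\langle \xi, [V, X] \rangle$, and a temperature $\beta$ held constant in time. The helper names `coad` and `integrate_heat` are hypothetical.

```python
def cross(a, b):
    """Cross product on R^3, used here as the Lie bracket of so(3)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Duality pairing between so(3)* and so(3), both identified with R^3."""
    return sum(x * y for x, y in zip(a, b))

def coad(V, xi):
    """ad*_V xi, assuming pairing(ad*_V xi, X) = -pairing(xi, [V, X])."""
    return cross(V, xi)

def integrate_heat(Q0, beta, h=0.01, steps=1000):
    """Classical RK4 integration of dQ/dt = ad*_beta Q, beta held constant."""
    Q = Q0
    for _ in range(steps):
        k1 = coad(beta, Q)
        k2 = coad(beta, tuple(q + 0.5 * h * k for q, k in zip(Q, k1)))
        k3 = coad(beta, tuple(q + 0.5 * h * k for q, k in zip(Q, k2)))
        k4 = coad(beta, tuple(q + h * k for q, k in zip(Q, k3)))
        Q = tuple(q + h * (a + 2 * b + 2 * c + d) / 6.0
                  for q, a, b, c, d in zip(Q, k1, k2, k3, k4))
    return Q

beta = (0.3, -1.2, 0.7)   # illustrative geometric temperature in so(3)
Q0 = (2.0, 0.5, -1.0)     # illustrative geometric heat in so(3)*
X = (1.0, 4.0, -2.0)      # arbitrary test element of so(3)

# Assumed duality: pairing(ad*_beta Q, X) + pairing(Q, [beta, X]) should vanish.
duality_gap = dot(coad(beta, Q0), X) + dot(Q0, cross(beta, X))
# The term that drops out of ds/dt.
pairing = dot(beta, coad(beta, Q0))
# The squared norm of Q is a Casimir: it stays constant along the flow.
Q1 = integrate_heat(Q0, beta)
norm_drift = abs(dot(Q1, Q1) - dot(Q0, Q0))
```

In this example the heat $Q$ stays on a sphere, which is a coadjoint orbit of the rotation group.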
With these new equations of thermodynamics, $\frac{dQ}{dt} = ad^*_\beta Q$ and $\frac{d}{dt}\left(Ad^*_g Q\right) = 0$, we can observe that the new important notion is that of co-adjoint orbits, which Souriau associates with a symplectic manifold through the KKS 2-form.
We will then define the Poincaré-Cartan integral invariant for Lie group thermodynamics. Classically in mechanics, the Pfaffian form $\omega = p \cdot dq - H \cdot dt$ is related to the Poincaré-Cartan integral invariant [107]. Dedecker has observed, based on the relation [108]:
$$\omega = \partial_{\dot{q}} L \cdot dq - \left(\partial_{\dot{q}} L \cdot \dot{q} - L\right) dt = L \cdot dt + \partial_{\dot{q}} L \cdot \varpi \quad\text{with}\quad \varpi = dq - \dot{q} \cdot dt$$
that the property that, among all forms $\chi \equiv L \cdot dt \mod \varpi$, the form $\omega = p \cdot dq - H \cdot dt$ is the only one satisfying $d\chi \equiv 0 \mod \varpi$, is a particular case of the more general Lepage congruence.
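The analogy between geometric mechanics and geometric Lie group thermodynamics provides the following similarities of structure:
$$\begin{cases} \dot{q} \leftrightarrow \beta \\ p \leftrightarrow Q \end{cases},\quad \begin{cases} L(\dot{q}) \leftrightarrow \Phi(\beta) \\ H(p) \leftrightarrow s(Q) \\ H = p \cdot \dot{q} - L \leftrightarrow s = \langle Q, \beta \rangle - \Phi \end{cases} \quad\text{and}\quad \begin{cases} \dot{q} = \dfrac{dq}{dt} = \dfrac{\partial H}{\partial p} \leftrightarrow \beta = \dfrac{\partial s}{\partial Q} \\ p = \dfrac{\partial L}{\partial \dot{q}} \leftrightarrow Q = \dfrac{\partial \Phi}{\partial \beta} \end{cases}$$
We can then consider a similar Poincaré-Cartan-Souriau Pfaffian form:
$$\omega = p \cdot dq - H \cdot dt \ \leftrightarrow\ \omega = \langle Q, (\beta \cdot dt) \rangle - s \cdot dt = \left(\langle Q, \beta \rangle - s\right) dt = \Phi(\beta) \cdot dt$$
This analogy provides an associated Poincaré-Cartan-Souriau integral invariant. The Poincaré-Cartan integral invariant $\int_{C_a} p \cdot dq - H \cdot dt = \int_{C_b} p \cdot dq - H \cdot dt$ is given for Souriau thermodynamics by:
$$\int_{C_a} \Phi(\beta) \cdot dt = \int_{C_b} \Phi(\beta) \cdot dt$$
We can then deduce an Euler-Poincaré-Souriau variational principle for thermodynamics: the variational principle holds on $\mathfrak{g}$, for variations $\delta\beta = \dot{\eta} + [\beta, \eta]$, where $\eta(t)$ is an arbitrary path that vanishes at the endpoints, $\eta(a) = \eta(b) = 0$:
$$\delta \int_{t_0}^{t_1} \Phi(\beta(t)) \cdot dt = 0$$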

6. Souriau Affine Representation of Lie Group and Lie Algebra and Comparison with the Koszul Affine Representation

This affine representation of a Lie group/algebra used by Souriau has been intensively studied by Marle [7,100,109,110]. Souriau called the mechanics deduced from this model “affine mechanics”. We will explain affine representations and associated notions such as cocycles, the Souriau moment map and its cocycles, the equivariance of the Souriau moment map, the action of a Lie group on a symplectic manifold, and dual spaces of finite-dimensional Lie algebras. We have observed that these tools were developed in parallel by Jean-Louis Koszul. We will establish close links and summarize the comparison of both approaches in a table.

6.1. Affine Representations and Cocycles

The Souriau model of Lie group thermodynamics is linked with the affine representation of a Lie group and of its Lie algebra. We give below the main elements of this affine representation.
Let $G$ be a Lie group and $E$ a finite-dimensional vector space. A map $A : G \to Aff(E)$ can always be written as:
$$A(g)(x) = R(g)(x) + \theta(g) \quad\text{with}\quad g \in G,\ x \in E$$
where the maps $R : G \to GL(E)$ and $\theta : G \to E$ are determined by $A$. The map $A$ is an affine representation of $G$ in $E$.
The map $\theta : G \to E$ is a one-cocycle of $G$ with values in $E$, for the linear representation $R$; it means that $\theta$ is a smooth map which satisfies, for all $g, h \in G$:
$$\theta(gh) = R(g)(\theta(h)) + \theta(g)$$
The linear representation $R$ is called the linear part of the affine representation $A$, and $\theta$ is called the one-cocycle of $G$ associated to the affine representation $A$. A one-coboundary of $G$ with values in $E$, for the linear representation $R$, is a map $\theta : G \to E$ which can be expressed as:
$$\theta(g) = R(g)(c) - c,\quad g \in G$$
where $c$ is a fixed element in $E$; there then exists an element $c \in E$ such that, for all $g \in G$ and $x \in E$:
$$A(g)(x) = R(g)(x + c) - c$$
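The cocycle identity is exactly what makes $A(gh) = A(g) \circ A(h)$ hold. A minimal sketch, assuming as an illustrative choice (not taken from the text) the group of rotations and translations of the plane, with an element stored as a pair `(angle, theta)`; all function names are hypothetical.

```python
import math

def rot(angle):
    """Linear part R(g): rotation matrix of the plane."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s), (s, c))

def mat_vec(R, v):
    return (R[0][0] * v[0] + R[0][1] * v[1],
            R[1][0] * v[0] + R[1][1] * v[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def act(g, x):
    """A(g)(x) = R(g)(x) + theta(g) for g = (angle, theta)."""
    angle, theta = g
    return add(mat_vec(rot(angle), x), theta)

def compose(g, h):
    """Product gh; its translation part is R(g)(theta(h)) + theta(g)."""
    (ag, tg), (ah, th) = g, h
    return (ag + ah, add(mat_vec(rot(ag), th), tg))

def coboundary(angle, c):
    """theta_c(g) = R(g)(c) - c for a fixed c in E."""
    return sub(mat_vec(rot(angle), c), c)

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

g = (0.7, (1.0, -2.0))
h = (-1.3, (0.5, 3.0))
x = (2.0, 1.5)
c = (0.4, -0.9)

# A(gh)(x) = A(g)(A(h)(x)): the affine maps compose like the group elements.
homomorphism_ok = close(act(compose(g, h), x), act(g, act(h, x)))
# A coboundary satisfies theta(gh) = R(g)(theta(h)) + theta(g).
cocycle_ok = close(coboundary(g[0] + h[0], c),
                   add(mat_vec(rot(g[0]), coboundary(h[0], c)),
                       coboundary(g[0], c)))
# With a coboundary as cocycle, A(g)(x) = R(g)(x + c) - c.
shifted_ok = close(act((g[0], coboundary(g[0], c)), x),
                   sub(mat_vec(rot(g[0]), add(x, c)), c))
```

A coboundary therefore corresponds to a linear action seen from an origin shifted by $c$.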
Let $\mathfrak{g}$ be a Lie algebra and $E$ a finite-dimensional vector space. A linear map $a : \mathfrak{g} \to aff(E)$ can always be written as:
$$a(X)(x) = r(X)(x) + \Theta(X) \quad\text{with}\quad X \in \mathfrak{g},\ x \in E$$
where the linear maps $r : \mathfrak{g} \to gl(E)$ and $\Theta : \mathfrak{g} \to E$ are determined by $a$. The map $a$ is an affine representation of $\mathfrak{g}$ in $E$. The linear map $\Theta : \mathfrak{g} \to E$ is a one-cocycle of $\mathfrak{g}$ with values in $E$, for the linear representation $r$; it means that $\Theta$ satisfies, for all $X, Y \in \mathfrak{g}$:
$$\Theta([X, Y]) = r(X)(\Theta(Y)) - r(Y)(\Theta(X))$$
$\Theta$ is called the one-cocycle of $\mathfrak{g}$ associated to the affine representation $a$. A one-coboundary of $\mathfrak{g}$ with values in $E$, for the linear representation $r$, is a linear map $\Theta : \mathfrak{g} \to E$ which can be expressed as $\Theta(X) = r(X)(c)$, $X \in \mathfrak{g}$, where $c$ is a fixed element in $E$; there then exists an element $c \in E$ such that, for all $X \in \mathfrak{g}$ and $x \in E$:
$$a(X)(x) = r(X)(x + c)$$
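The Lie algebra cocycle identity can be verified exactly on integer matrices. The sketch below assumes, for illustration only, that $\mathfrak{g}$ is the Lie algebra of affine maps of the plane, with an element stored as a pair `(A, b)`, $r(X) = A$ and $\Theta(X) = b$. The bracket is computed as a commutator of bordered $3 \times 3$ matrices. The helpers `embed` and `split` are hypothetical names.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(A, B):
    return mat_sub(mat_mul(A, B), mat_mul(B, A))

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_sub(u, v):
    return [a - b for a, b in zip(u, v)]

def embed(A, b):
    """Bordered matrix [[A, b], [0, 0]] representing x -> A x + b."""
    n = len(A)
    return [list(row) + [bi] for row, bi in zip(A, b)] + [[0] * (n + 1)]

def split(M):
    """Inverse of embed: return the pair (linear part, translation part)."""
    n = len(M) - 1
    return [row[:n] for row in M[:n]], [row[n] for row in M[:n]]

# Two elements X = (A1, b1) and Y = (A2, b2), with r(X) = A and Theta(X) = b.
A1, b1 = [[1, 2], [0, -1]], [3, 1]
A2, b2 = [[0, 1], [4, 2]], [-2, 5]

# Theta([X, Y]) read off from the bracket of the bordered matrices.
linear_part, theta_bracket = split(commutator(embed(A1, b1), embed(A2, b2)))
# Right-hand side of the cocycle identity: r(X)(Theta(Y)) - r(Y)(Theta(X)).
cocycle_rhs = vec_sub(mat_vec(A1, b2), mat_vec(A2, b1))

# Coboundary Theta_c(X) = r(X)(c) for a fixed vector c.
c = [2, -3]
coboundary_lhs = mat_vec(commutator(A1, A2), c)
coboundary_rhs = vec_sub(mat_vec(A1, mat_vec(A2, c)),
                         mat_vec(A2, mat_vec(A1, c)))
```

Because all entries are integers, both identities hold with exact equality, not just up to rounding.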
Let $A : G \to Aff(E)$ be an affine representation of a Lie group $G$ in a finite-dimensional vector space $E$, and $\mathfrak{g}$ be the Lie algebra of $G$. Let $R : G \to GL(E)$ and $\theta : G \to E$ be, respectively, the linear part and the associated cocycle of the affine representation $A$. Let $a : \mathfrak{g} \to aff(E)$ be the affine representation of the Lie algebra $\mathfrak{g}$ associated to the affine representation $A : G \to Aff(E)$ of the Lie group $G$. The linear part of $a$ is the linear representation $r : \mathfrak{g} \to gl(E)$ associated to the linear representation $R : G \to GL(E)$, and the associated cocycle $\Theta : \mathfrak{g} \to E$ is related to the one-cocycle $\theta : G \to E$ by:
$$\Theta(X) = T_e\theta(X(e)),\quad X \in \mathfrak{g}$$
This is deduced from:
$$\left.\frac{dA(\exp(tX))(x)}{dt}\right|_{t=0} = \left.\frac{d\left(R(\exp(tX))(x) + \theta(\exp(tX))\right)}{dt}\right|_{t=0} \ \Rightarrow\ a(X)(x) = r(X)(x) + T_e\theta(X)$$
Let $G$ be a connected and simply connected Lie group, $R : G \to GL(E)$ be a linear representation of $G$ in a finite-dimensional vector space $E$, and $r : \mathfrak{g} \to gl(E)$ be the associated linear representation of its Lie algebra $\mathfrak{g}$. For any one-cocycle $\Theta : \mathfrak{g} \to E$ of the Lie algebra $\mathfrak{g}$ for the linear representation $r$, there exists a unique one-cocycle $\theta : G \to E$ of the Lie group $G$ for the linear representation $R$ such that:
$$\Theta(X) = T_e\theta(X(e))$$
in other words, which has $\Theta$ as associated Lie algebra one-cocycle. The Lie group one-cocycle $\theta$ is a Lie group one-coboundary if and only if the Lie algebra one-cocycle $\Theta$ is a Lie algebra one-coboundary. Differentiating the cocycle identity gives:
$$\left.\frac{d\theta(g \exp(tX))}{dt}\right|_{t=0} = \left.\frac{d\left(\theta(g) + R(g)(\theta(\exp(tX)))\right)}{dt}\right|_{t=0} \ \Rightarrow\ T_g\theta(TL_g(X)) = R(g)(\Theta(X))$$
which proves that, if it exists, the Lie group one-cocycle $\theta$ such that $T_e\theta = \Theta$ is unique.

6.2. Souriau Moment Map and Cocycles

Souriau first introduced the moment map in his book. We give here the link with the cocycles of affine representations introduced above.
There exists a linear map $X \mapsto J_X$ from $\mathfrak{g}$ to the differentiable functions on $M$:
$$\mathfrak{g} \to C^\infty(M, \mathbb{R}),\quad X \mapsto J_X$$
We can then associate a differentiable map $J$, called the moment(um) map of the Hamiltonian Lie group action $\Phi$:
$$J : M \to \mathfrak{g}^*,\quad x \mapsto J(x) \quad\text{such that}\quad J_X(x) = \langle J(x), X \rangle,\ X \in \mathfrak{g}$$
Let $J$ be a moment map. To each pair $(X, Y) \in \mathfrak{g} \times \mathfrak{g}$ we associate a smooth function $\tilde{\Theta}(X, Y) : M \to \mathbb{R}$ defined by:
$$\tilde{\Theta}(X, Y) = J_{[X, Y]} - \{J_X, J_Y\} \quad\text{with}\ \{\cdot, \cdot\}\ \text{the Poisson bracket}$$
It is a Casimir of the Poisson algebra $C^\infty(M, \mathbb{R})$ that satisfies:
$$\tilde{\Theta}([X, Y], Z) + \tilde{\Theta}([Y, Z], X) + \tilde{\Theta}([Z, X], Y) = 0$$
When the Poisson manifold is a connected symplectic manifold, the function $\tilde{\Theta}(X, Y)$ is constant on $M$ and the map:
$$\tilde{\Theta}(X, Y) : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$$
is a skew-symmetric bilinear form, called the symplectic cocycle of the Lie algebra $\mathfrak{g}$ associated to the moment map $J$.
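A standard example with a non-zero symplectic cocycle is the action of the translation group of the plane on itself. The sketch below is an illustration under stated assumptions, not taken from the text. Coordinates are $(q, p)$, the bracket is $\{f, g\} = \partial_q f\, \partial_p g - \partial_p f\, \partial_q g$, and $J_X(q, p) = a p - b q$ is taken as a Hamiltonian for the translation $X = (a, b)$. Other sign conventions would change the sign of the result but not the fact that it is a non-zero constant. Function names are hypothetical.

```python
def moment_component(X):
    """J_X(q, p) = a*p - b*q for the translation X = (a, b) (assumed convention)."""
    a, b = X
    return lambda q, p: a * p - b * q

def poisson_bracket(f, g, q, p, h=1e-3):
    """{f, g} = df/dq * dg/dp - df/dp * dg/dq, by central differences."""
    dfdq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    dfdp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dgdq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dgdp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return dfdq * dgdp - dfdp * dgdq

def symplectic_cocycle(X, Y, q, p):
    """Theta~(X, Y) = J_[X,Y] - {J_X, J_Y}; translations commute, so J_[X,Y] = 0."""
    return 0.0 - poisson_bracket(moment_component(X), moment_component(Y), q, p)

X, Y = (1.0, 0.0), (0.0, 1.0)
value_at_origin = symplectic_cocycle(X, Y, 0.0, 0.0)
value_elsewhere = symplectic_cocycle(X, Y, 3.2, -1.7)
swapped = symplectic_cocycle(Y, X, 3.2, -1.7)
```

The value does not depend on the point of $M$ and changes sign when $X$ and $Y$ are exchanged, as expected of a skew-symmetric bilinear form on $\mathfrak{g}$.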
Let $\Theta : \mathfrak{g} \to \mathfrak{g}^*$ be the map such that, for all $X, Y \in \mathfrak{g}$:
$$\langle \Theta(X), Y \rangle = \tilde{\Theta}(X, Y)$$
The map $\Theta$ is therefore the one-cocycle of the Lie algebra $\mathfrak{g}$ with values in $\mathfrak{g}^*$ for the coadjoint representation $X \mapsto ad^*_X$ of $\mathfrak{g}$, associated to the affine action of $\mathfrak{g}$ on its dual:
$$a_\Theta(X)(\xi) = ad^*_X(\xi) + \Theta(X),\quad X \in \mathfrak{g},\quad \xi \in \mathfrak{g}^*$$
Let $G$ be a Lie group whose Lie algebra is $\mathfrak{g}$. The skew-symmetric bilinear form $\tilde{\Theta}$ on $\mathfrak{g} = T_eG$ can be extended into a closed differential two-form on $G$, since the identity on $\tilde{\Theta}$ means that its exterior differential $d\tilde{\Theta}$ vanishes. In other words, $\tilde{\Theta}$ is a 2-cocycle for the restriction of the de Rham cohomology of $G$ to left (or right) invariant differential forms.

6.3. Equivariance of Souriau Moment Map

There exists a unique affine action $a$ whose linear part is the coadjoint representation:
$$a : G \times \mathfrak{g}^* \to \mathfrak{g}^*,\quad a(g, \xi) = Ad^*_{g^{-1}}\xi + \theta(g)$$
with $\left\langle Ad^*_{g^{-1}}\xi, X \right\rangle = \left\langle \xi, Ad_{g^{-1}}X \right\rangle$, and which makes the moment map $J$ equivariant.

6.4. Action of Lie Group on a Symplectic Manifold

Let $\Phi : G \times M \to M$ be an action of a Lie group $G$ on a differentiable manifold $M$. The fundamental field associated to an element $X$ of the Lie algebra $\mathfrak{g}$ of the group $G$ is the vector field $X_M$ on $M$:
$$X_M(x) = \left.\frac{d}{dt}\Phi_{\exp(tX)}(x)\right|_{t=0} \quad\text{with}\quad \Phi_{g_1}\left(\Phi_{g_2}(x)\right) = \Phi_{g_1 g_2}(x) \quad\text{and}\quad \Phi_e(x) = x$$
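As a small illustration of the fundamental field, the sketch below takes the rotation group of the plane acting on $\mathbb{R}^2$ (an assumed example, not from the text), with $\exp(tX)$ the rotation of angle $t\omega$. It compares a finite-difference estimate of $X_M$ with the closed form $\omega(-x_2, x_1)$. Function names are hypothetical.

```python
import math

def phi(angle, x):
    """Phi_g(x) for g the rotation of the plane by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def fundamental_field(omega, x, h=1e-6):
    """Central-difference estimate of d/dt Phi_exp(tX)(x) at t = 0,
    where exp(tX) is the rotation by t * omega."""
    forward, backward = phi(h * omega, x), phi(-h * omega, x)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

omega = 1.5
x = (2.0, -1.0)
numeric = fundamental_field(omega, x)
exact = (-omega * x[1], omega * x[0])   # closed form for rotations

# Action property Phi_g1(Phi_g2(x)) = Phi_(g1 g2)(x).
composed = phi(0.4, phi(1.1, x))
direct = phi(0.4 + 1.1, x)
```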
$\Phi$ is Hamiltonian on a symplectic manifold $M$ if $\Phi$ is symplectic and if, for all $X \in \mathfrak{g}$, the fundamental field $X_M$ is globally Hamiltonian.
There is a unique action $a$ of the Lie group $G$ on the dual $\mathfrak{g}^*$ of its Lie algebra for which the moment map $J$ is equivariant, which means that it satisfies, for each $x \in M$:
$$J(\Phi_g(x)) = a(g, J(x)) = Ad^*_{g^{-1}}(J(x)) + \theta(g)$$
$\theta : G \to \mathfrak{g}^*$ is called the cocycle associated to $J$; the differential $T_e\theta$ of this 1-cocycle at the neutral element $e$ satisfies:
$$\langle T_e\theta(X), Y \rangle = \tilde{\Theta}(X, Y) = J_{[X, Y]} - \{J_X, J_Y\}$$
If instead of $J$ we take the moment map $J'(x) = J(x) + \mu$, $x \in M$, where $\mu \in \mathfrak{g}^*$ is constant, the symplectic cocycle $\theta$ is replaced by:
$$\theta'(g) = \theta(g) + \mu - Ad^*_g\mu$$
where $\theta' - \theta = \mu - Ad^*_g\mu$ is a one-coboundary of $G$ with values in $\mathfrak{g}^*$.
Therefore, the cohomology class of the symplectic cocycle $\theta$ depends only on the Hamiltonian action $\Phi$, not on the choice of its moment map $J$. We also have:
$$\tilde{\Theta}'(X, Y) = \tilde{\Theta}(X, Y) + \langle \mu, [X, Y] \rangle$$
This property is used by Jean-Marie Souriau [10] to offer a very nice cohomological interpretation of the total mass of a classical (nonrelativistic) isolated mechanical system. He [10] proves that the space of all possible motions of the system is a symplectic manifold on which the Galilean group acts by a Hamiltonian action. The dimension of the symplectic cohomology space of the Galilean group (the quotient of the space of symplectic one-cocycles by the space of symplectic one-coboundaries) is equal to 1. The cohomology class of the symplectic cocycle associated to a moment map of the action of the Galilean group on the space of motions of the system is interpreted as the total mass of the system.
For Hamiltonian actions of a Lie group on a connected symplectic manifold, the equivariance of the moment map with respect to an affine action of the group on the dual of its Lie algebra has been proved by Marle [110]. Marle [110] has also developed the notion of symplectic cocycle and has proved that given a Lie algebra symplectic cocycle, there exists on the associated connected and simply connected Lie group a unique corresponding Lie group symplectic cocycle. Marle [104] has also proved that there exists a two-parameter family of deformations of these actions (the Hamiltonian actions of a Lie group on its cotangent bundle obtained by lifting the actions of the group on itself by translations) into a pair of mutually symplectically orthogonal Hamiltonian actions whose moment maps are equivariant with respect to an affine action involving any given Lie group symplectic cocycle. Marle [104] has also explained why a reduction occurs for the Euler-Poincaré equation mainly when the Hamiltonian can be expressed as the moment map composed with a smooth function defined on the dual of the Lie algebra; the Euler-Poincaré equation is then equivalent to the Hamilton equation written on the dual of the Lie algebra.

6.5. Dual Spaces of Finite-Dimensional Lie Algebras

Let $\mathfrak{g}$ be a finite-dimensional Lie algebra, and $\mathfrak{g}^*$ its dual space. The Lie algebra $\mathfrak{g}$ can be considered as the dual of $\mathfrak{g}^*$, that is, as the space of linear functions on $\mathfrak{g}^*$, and the bracket of the Lie algebra $\mathfrak{g}$ is a composition law on this space of linear functions. This composition law can be extended to the space $C^\infty(\mathfrak{g}^*, \mathbb{R})$ by setting:
$$\{f, g\}(x) = \langle x, [df(x), dg(x)] \rangle,\quad f \text{ and } g \in C^\infty(\mathfrak{g}^*, \mathbb{R}),\ x \in \mathfrak{g}^*$$
If we apply this formula to Souriau Lie group thermodynamics, with entropy $s(Q)$ depending on the geometric heat $Q$, we obtain:
$$\{s_1, s_2\}(Q) = \langle Q, [ds_1(Q), ds_2(Q)] \rangle,\quad s_1 \text{ and } s_2 \in C^\infty(\mathfrak{g}^*, \mathbb{R}),\ Q \in \mathfrak{g}^*$$
This bracket on $C^\infty(\mathfrak{g}^*, \mathbb{R})$ defines a Poisson structure on $\mathfrak{g}^*$, called its canonical Poisson structure. It implicitly appears in the works of Sophus Lie, and was rediscovered by Alexander Kirillov [111], Bertram Kostant and Jean-Marie Souriau.
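The canonical bracket can be evaluated directly once gradients are known. A minimal sketch, assuming $\mathfrak{g} = \mathfrak{so}(3)$ identified with $\mathbb{R}^3$ (bracket given by the cross product) and hand-written gradients for a few test functions. All names are hypothetical and the test functions are arbitrary.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lie_poisson(grad_f, grad_g, x):
    """{f, g}(x) = pairing(x, [df(x), dg(x)]) on so(3)*, bracket = cross product."""
    return dot(x, cross(grad_f(x), grad_g(x)))

def grad_x1(x):
    return (1.0, 0.0, 0.0)          # f(x) = x1

def grad_x2(x):
    return (0.0, 1.0, 0.0)          # f(x) = x2

def grad_casimir(x):
    return x                        # C(x) = |x|^2 / 2

def grad_g(x):
    return (2.0 * x[0], 3.0 * x[2], 3.0 * x[1])   # g(x) = x1^2 + 3 x2 x3

x = (0.4, -1.1, 2.5)
bracket_x1_x2 = lie_poisson(grad_x1, grad_x2, x)      # equals the coordinate x3
bracket_casimir = lie_poisson(grad_casimir, grad_g, x)
antisymmetry_gap = lie_poisson(grad_g, grad_x1, x) + lie_poisson(grad_x1, grad_g, x)
```

The function $C(x) = |x|^2/2$ commutes with every function for this bracket, so it is constant on the symplectic leaves, which here are the spheres centered at the origin.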
The canonical Poisson structure on $\mathfrak{g}^*$ defined above can be modified by means of a symplectic cocycle $\tilde{\Theta}$ by defining the new bracket:
$$\{f, g\}_{\tilde{\Theta}}(x) = \langle x, [df(x), dg(x)] \rangle - \tilde{\Theta}(df(x), dg(x))$$
with $\tilde{\Theta}$ a symplectic cocycle of the Lie algebra $\mathfrak{g}$, that is, a skew-symmetric bilinear map $\tilde{\Theta} : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ which satisfies:
$$\tilde{\Theta}([X, Y], Z) + \tilde{\Theta}([Y, Z], X) + \tilde{\Theta}([Z, X], Y) = 0$$
This Poisson structure is called the modified canonical Poisson structure by means of the symplectic cocycle $\tilde{\Theta}$. The symplectic leaves of $\mathfrak{g}^*$ equipped with this Poisson structure are the orbits of an affine action whose linear part is the coadjoint action, with an additional term determined by $\tilde{\Theta}$.

6.6. Koszul Affine Representation of Lie Group and Lie Algebra

Previously, we have developed Souriau’s works on the affine representation of a Lie group used to elaborate the Lie group thermodynamics. We will study here another approach to the affine representation of a Lie group and Lie algebra, introduced by Jean-Louis Koszul, and consolidate the link between Koszul’s work and the Souriau model. This model uses an affine representation of a Lie group and of a Lie algebra in a finite-dimensional vector space, seen as special examples of actions.
Since the work of Henri Poincaré and Elie Cartan, the theory of differential forms has become an essential instrument of modern differential geometry [112,113,114,115], used by Jean-Marie Souriau for identifying the space of motions as a symplectic manifold. However, as said by Paulette Libermann [116], except for Henri Poincaré, who wrote shortly before his death a report on the work of Elie Cartan during his application for the Sorbonne University, the French mathematicians did not see the importance of Cartan’s breakthroughs. Souriau followed lectures of Elie Cartan in 1945. The second student of Elie Cartan was Jean-Louis Koszul. Koszul introduced the concepts of affine spaces, affine transformations and affine representations [117,118,119,120,121,122,123,124]. More especially, we are interested in Koszul’s definition of affine representations of Lie groups and Lie algebras. Koszul studied symmetric homogeneous spaces, related invariant flat affine connections to affine representations of Lie algebras, and characterized invariant Hessian metrics by affine representations of Lie algebras [117,118,119,120,121,122,123,124]. Koszul provided a correspondence between symmetric homogeneous spaces with invariant Hessian structures by using affine representations of Lie algebras, and proved that a simply connected symmetric homogeneous space with invariant Hessian structure is a direct product of a Euclidean space and a homogeneous self-dual regular convex cone [117,118,119,120,121,122,123,124]. Let G be a connected Lie group and let G/K be a homogeneous space on which G acts effectively; Koszul gave a bijective correspondence between the set of G-invariant flat connections on G/K and the set of a certain class of affine representations of the Lie algebra of G [117,118,119,120,121,122,123,124].
The main theorem of Koszul is: let G/K be a homogeneous space of a connected Lie group G and let g and k be the Lie algebras of G and K, assuming that G/K is endowed with a G-invariant flat connection, then g admits an affine representation (f,q) on the vector space E. Conversely, suppose that G is simply connected and that g is endowed with an affine representation, then G/K admits a G-invariant flat connection.
Koszul has proved the following [117,118,119,120,121,122,123,124]. Let $\Omega$ be a convex domain in $\mathbb{R}^n$ containing no complete straight lines, with associated convex cone $V(\Omega) = \{(\lambda x, \lambda) \in \mathbb{R}^n \times \mathbb{R} \ /\ x \in \Omega,\ \lambda \in \mathbb{R}^+\}$. Then there exists an affine embedding:
$$\ell : x \in \Omega \mapsto \begin{bmatrix} x \\ 1 \end{bmatrix} \in V(\Omega)$$
If we consider the group homomorphism $\eta$ of $A(n, \mathbb{R})$ into $GL(n+1, \mathbb{R})$ given by:
$$s \in A(n, \mathbb{R}) \mapsto \begin{bmatrix} f(s) & q(s) \\ 0 & 1 \end{bmatrix} \in GL(n+1, \mathbb{R})$$
and the associated affine representation of the Lie algebra:
$$\begin{bmatrix} f & q \\ 0 & 0 \end{bmatrix}$$
with $A(n, \mathbb{R})$ the group of all affine transformations of $\mathbb{R}^n$, then we have $\eta(G(\Omega)) \subset G(V(\Omega))$, and the pair $(\eta, \ell)$ of the homomorphism $\eta : G(\Omega) \to G(V(\Omega))$ and the map $\ell : \Omega \to V(\Omega)$ is equivariant.
A Hessian structure (D, g) on a homogeneous space G/K is said to be an invariant Hessian structure if both D and g are G-invariant. A homogeneous space G/K with an invariant Hessian structure (D, g) is called a homogeneous Hessian manifold and is denoted by (G/K, D, g). Another result of Koszul is that a homogeneous self-dual regular convex cone is characterized as a simply connected symmetric homogeneous space admitting an invariant Hessian structure that is defined by the positive definite second Koszul form (we have identified in a previous paper that this second Koszul form is related to the Fisher metric). In parallel, Vinberg [125,126] gave a realization of a homogeneous regular convex domain as a real Siegel domain. Koszul has observed that regular convex cones admit canonical Hessian structures, improving some results of Pyateckii-Shapiro, who studied realizations of homogeneous bounded domains by considering Siegel domains in connection with automorphic forms. Koszul defined a characteristic function $\psi_\Omega$ of a regular convex cone $\Omega$, and showed that $Dd\log\psi_\Omega$ is a Hessian metric on $\Omega$ invariant under affine automorphisms of $\Omega$. If $\Omega$ is a homogeneous self-dual cone, then the gradient mapping is a symmetry with respect to the canonical Hessian metric, and $\Omega$ is a symmetric homogeneous Riemannian manifold. More information on Koszul Hessian geometry can be found in [127,128,129,130,131,132,133,134,135,136].
We now focus on the Koszul affine representation of a Lie group/algebra. Let $G$ be a connected Lie group and $E$ a real or complex vector space of finite dimension. Koszul has introduced an affine representation of $G$ in $E$ such that [117,118,119,120,121,122,123,124]:
$$E \to E,\quad a \mapsto sa \quad \forall s \in G$$
is an affine transformation. We denote by $A(E)$ the set of all affine transformations of the vector space $E$, a Lie group called the affine transformation group of $E$. The set $GL(E)$ of all regular linear transformations of $E$ is a subgroup of $A(E)$.
We define a linear representation from $G$ to $GL(E)$:
$$f : G \to GL(E),\quad s \mapsto f(s)a = sa - so \quad \forall a \in E$$
and a map from $G$ to $E$:
$$q : G \to E,\quad s \mapsto q(s) = so \quad \forall s \in G$$
where $o$ denotes the origin of $E$. Then we have, $\forall s, t \in G$:
$$f(s)q(t) + q(s) = q(st)$$
deduced from $f(s)q(t) + q(s) = sq(t) - so + so = sq(t) = sto = q(st)$.
Conversely, if a map $q$ from $G$ to $E$ and a linear representation $f$ from $G$ to $GL(E)$ satisfy the previous equation, then we can define an affine representation of $G$ in $E$, written $(f, q)$:
$$Aff(s) : a \mapsto sa = f(s)a + q(s) \quad \forall s \in G,\ \forall a \in E$$
The condition $f(s)q(t) + q(s) = q(st)$ is equivalent to requiring the following mapping to be a homomorphism:
$$Aff : s \in G \mapsto Aff(s) \in A(E)$$
We write $f$ for the linear representation of the Lie algebra $\mathfrak{g}$ of $G$ defined by $f$, and $q$ for the restriction to $\mathfrak{g}$ of the differential of $q$ (that is, $f$ and $q$ also denote the differentials of $f$ and $q$, respectively). Koszul has proved that:
$$f(X)q(Y) - f(Y)q(X) = q([X, Y]) \quad \forall X, Y \in \mathfrak{g} \quad\text{with}\quad f : \mathfrak{g} \to gl(E) \quad\text{and}\quad q : \mathfrak{g} \to E$$
where $gl(E)$ is the set of all linear endomorphisms of $E$, the Lie algebra of $GL(E)$.
Using the computation,
$$q(Ad_s Y) = \left.\frac{dq\left(s \cdot e^{tY} \cdot s^{-1}\right)}{dt}\right|_{t=0} = f(s)f(Y)q(s^{-1}) + f(s)q(Y)$$
we can obtain:
$$q([X, Y]) = \left.\frac{dq\left(Ad_{e^{tX}} Y\right)}{dt}\right|_{t=0} = f(X)f(Y)q(e) + f(e)f(Y)(-q(X)) + f(X)q(Y)$$
where $e$ is the unit element in $G$. Since $f(e)$ is the identity mapping and $q(e) = 0$, we have the equality: $f(X)q(Y) - f(Y)q(X) = q([X, Y])$.
A pair $(f, q)$ of a linear representation $f$ of a Lie algebra $\mathfrak{g}$ on $E$ and a linear mapping $q$ from $\mathfrak{g}$ to $E$ is an affine representation of $\mathfrak{g}$ on $E$ if it satisfies $f(X)q(Y) - f(Y)q(X) = q([X, Y])$.
Conversely, if we assume that $\mathfrak{g}$ admits an affine representation $(f, q)$ on $E$, using an affine coordinate system $\{x^1, \ldots, x^n\}$ on $E$, we can express an affine mapping $v \mapsto f(X)v + q(X)$ by an $(n+1) \times (n+1)$ matrix representation:
$$aff(X) = \begin{bmatrix} f(X) & q(X) \\ 0 & 0 \end{bmatrix}$$
where $f(X)$ is an $n \times n$ matrix and $q(X)$ is a column vector with $n$ entries.
$X \mapsto aff(X)$ is an injective Lie algebra homomorphism from $\mathfrak{g}$ into the Lie algebra $gl(n+1, \mathbb{R})$ of all $(n+1) \times (n+1)$ matrices:
$$\mathfrak{g} \to gl(n+1, \mathbb{R}),\quad X \mapsto aff(X)$$
If we denote $\mathfrak{g}_{aff} = aff(\mathfrak{g})$, we write $G_{aff}$ for the linear Lie subgroup of $GL(n+1, \mathbb{R})$ generated by $\mathfrak{g}_{aff}$. An element $s \in G_{aff}$ is expressed by:
$$Aff(s) = \begin{bmatrix} f(s) & q(s) \\ 0 & 1 \end{bmatrix}$$
Let $M_{aff}$ be the orbit of $G_{aff}$ through the origin $o$; then $M_{aff} = q(G_{aff}) = G_{aff}/K_{aff}$, where $K_{aff} = \{s \in G_{aff} \ /\ q(s) = 0\} = Ker(q)$.
Example. 
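Let $\Omega$ be a convex domain in $\mathbb{R}^n$ containing no complete straight lines; we define a convex cone $V(\Omega)$ in $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$ by $V(\Omega) = \{(\lambda x, \lambda) \in \mathbb{R}^n \times \mathbb{R} \ /\ x \in \Omega,\ \lambda \in \mathbb{R}^+\}$. Then there exists an affine embedding:
$$\ell : x \in \Omega \mapsto \begin{bmatrix} x \\ 1 \end{bmatrix} \in V(\Omega)$$
If we consider the group homomorphism $\eta$ of $A(n, \mathbb{R})$ into $GL(n+1, \mathbb{R})$ given by:
$$s \in A(n, \mathbb{R}) \mapsto \begin{bmatrix} f(s) & q(s) \\ 0 & 1 \end{bmatrix} \in GL(n+1, \mathbb{R})$$
with $A(n, \mathbb{R})$ the group of all affine transformations of $\mathbb{R}^n$, then we have $\eta(G(\Omega)) \subset G(V(\Omega))$, and the pair $(\eta, \ell)$ of the homomorphism $\eta : G(\Omega) \to G(V(\Omega))$ and the map $\ell : \Omega \to V(\Omega)$ is equivariant:
$$\ell \circ s = \eta(s) \circ \ell \quad\text{and}\quad d\ell \circ s = \eta(s) \circ d\ell$$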

6.7. Comparison of Koszul and Souriau Affine Representation of Lie Group and Lie Algebra

We will compare, in the following Table 1, affine representation of Lie group and Lie algebra from Souriau and Koszul approaches:

6.8. Additional Elements on Koszul Affine Representation of Lie Group and Lie Algebra

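Let $\{x^1, x^2, \ldots, x^n\}$ be a local coordinate system on $M$. The Christoffel symbols $\Gamma^k_{ij}$ of the connection $D$ are defined by:
$$D_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j} = \sum_{k=1}^n \Gamma^k_{ij} \frac{\partial}{\partial x^k}$$
The torsion tensor $T$ of $D$ is given by:
$$T(X, Y) = D_X Y - D_Y X - [X, Y]$$
$$T\left(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right) = \sum_{k=1}^n T^k_{ij} \frac{\partial}{\partial x^k} \quad\text{with}\quad T^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}$$
The curvature tensor $R$ of $D$ is given by:
$$R(X, Y)Z = D_X D_Y Z - D_Y D_X Z - D_{[X, Y]} Z$$
$$R\left(\frac{\partial}{\partial x^k}, \frac{\partial}{\partial x^l}\right) \frac{\partial}{\partial x^j} = \sum_i R^i_{jkl} \frac{\partial}{\partial x^i} \quad\text{with}\quad R^i_{jkl} = \frac{\partial \Gamma^i_{lj}}{\partial x^k} - \frac{\partial \Gamma^i_{kj}}{\partial x^l} + \sum_m \left(\Gamma^m_{lj} \Gamma^i_{km} - \Gamma^m_{kj} \Gamma^i_{lm}\right)$$
The Ricci tensor $Ric$ of $D$ is given by:
$$Ric(Y, Z) = Tr\{X \mapsto R(X, Y)Z\}$$
$$R_{jk} = Ric\left(\frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\right) = \sum_i R^i_{kij}$$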
In the following, we will consider a homogeneous space $G/K$ endowed with a $G$-invariant flat connection $D$ (homogeneous flat manifold), written $(G/K, D)$. Koszul has proved a bijective correspondence between the set of $G$-invariant flat connections on $G/K$ and the set of affine representations of the Lie algebra of $G$. Let $(G, K)$ be the pair of a connected Lie group $G$ and its closed subgroup $K$. Let $\mathfrak{g}$ be the Lie algebra of $G$ and $\mathfrak{k}$ be the Lie subalgebra of $\mathfrak{g}$ corresponding to $K$. $X^*$ is defined as the vector field on $M = G/K$ induced by the 1-parameter group of transformations $e^{-tX}$. We denote $A_{X^*} = L_{X^*} - D_{X^*}$, with $L_{X^*}$ the Lie derivative.
Let $V$ be the tangent space of $G/K$ at $o = \{K\}$ and consider the following values at $o$:
$$f(X) = A_{X^*, o}$$
$$q(X) = X^*_o$$
where $A_{X^*} Y^* = -D_{Y^*} X^*$ (since $D$ is a locally flat linear connection, its torsion and curvature tensors vanish identically); then:
$$f([X, Y]) = [f(X), f(Y)]$$
$$f(X)q(Y) - f(Y)q(X) = q([X, Y])$$
where $\ker(q) = \mathfrak{k}$, and $(f, q)$ is an affine representation of the Lie algebra $\mathfrak{g}$:
$$\forall X \in \mathfrak{g},\quad X_a = -\sum_i \left(\sum_j f(X)^i_j x^j + q(X)^i\right) \frac{\partial}{\partial x^i}$$
The 1-parameter transformation group generated by $X_a$ is an affine transformation group of $V$, with linear parts given by $e^{-t \cdot f(X)}$ and translation vector parts:
$$\sum_{n=1}^{\infty} \frac{(-t)^n}{n!} f(X)^{n-1} q(X)$$
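The linear and translation parts can be read off from the exponential of the bordered matrix $\begin{bmatrix} f(X) & q(X) \\ 0 & 0 \end{bmatrix}$. The sketch below is illustrative only: it assumes the sign convention in which the flow parameter is $\tau = -t$, uses arbitrary numerical values for $f(X)$ and $q(X)$, and computes the matrix exponential by a truncated Taylor series. It compares the last column of that exponential with the series $\sum_{n \geq 1} \frac{\tau^n}{n!} f(X)^{n-1} q(X)$. All names are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_exp(M, terms=30):
    """Truncated Taylor series of the matrix exponential (fine for small norms)."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def translation_series(f, q, tau, terms=30):
    """sum over n >= 1 of tau^n / n! * f^(n-1) q."""
    total = [0.0] * len(q)
    vec = list(q)     # holds f^(n-1) q
    coeff = tau       # holds tau^n / n!
    for n in range(1, terms):
        total = [t + coeff * v for t, v in zip(total, vec)]
        vec = mat_vec(f, vec)
        coeff *= tau / (n + 1)
    return total

f = [[0.5, -1.0], [2.0, 0.3]]   # illustrative f(X)
q = [1.0, -0.5]                 # illustrative q(X)
t = 0.8
tau = -t                        # assumed sign convention

aff = [f[0] + [q[0]], f[1] + [q[1]], [0.0, 0.0, 0.0]]
flow = mat_exp([[tau * v for v in row] for row in aff])

linear_block = [row[:2] for row in flow[:2]]
translation_block = [row[2] for row in flow[:2]]
expected_linear = mat_exp([[tau * v for v in row] for row in f])
expected_translation = translation_series(f, q, tau)
```

The last row of `flow` stays equal to `[0, 0, 1]`, which is the bordered form of an element of the affine group.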
These relations are proved by using:
$$\begin{cases} A_{X^*} Y^* - A_{Y^*} X^* = [X^*, Y^*] \\ [A_{X^*}, A_{Y^*}] = A_{[X^*, Y^*]} \end{cases} \quad\text{with}\quad A_{X^*} Y^* = -D_{Y^*} X^*$$
based on the property that the connection $D$ is locally flat, so that there are local coordinate systems on $M$ such that $D_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j} = 0$, with vanishing torsion and curvature:
$$T(X, Y) = 0 \ \Rightarrow\ D_X Y - D_Y X = [X, Y]$$
$$R(X, Y)Z = 0 \ \Rightarrow\ D_X D_Y Z - D_Y D_X Z = D_{[X, Y]} Z$$
which follow from the fact that $D$ is a locally flat linear connection (vanishing of torsion and curvature).
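Let $\omega$ be an invariant volume element on $G/K$, written in an affine local coordinate system $\{x^1, x^2, \ldots, x^n\}$ in a neighborhood of $o$ as:
$$\omega = \Phi \cdot dx^1 \wedge \ldots \wedge dx^n$$
We can write $X^* = \sum_i \chi^i \frac{\partial}{\partial x^i}$ and develop the Lie derivative of the volume element $\omega$:
$$L_{X^*} \omega = (L_{X^*} \Phi) \cdot dx^1 \wedge \ldots \wedge dx^n + \sum_j \Phi \cdot dx^1 \wedge \ldots \wedge L_{X^*} dx^j \wedge \ldots \wedge dx^n = \left(X^* \Phi + \left(\sum_j \frac{\partial \chi^j}{\partial x^j}\right) \Phi\right) dx^1 \wedge \ldots \wedge dx^n$$
Since the volume element $\omega$ is invariant by $G$:
$$L_{X^*} \omega = 0 \ \Rightarrow\ X^* \Phi + \left(\sum_j \frac{\partial \chi^j}{\partial x^j}\right) \Phi = 0 \ \Rightarrow\ X^* \log \Phi = -\sum_j \frac{\partial \chi^j}{\partial x^j}$$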
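By using $A_{X^*} Y^* = -D_{Y^*} X^*$, we have:
$$\left(D_{\frac{\partial}{\partial x^i}} (A_{X^*})\right)\left(\frac{\partial}{\partial x^j}\right) = D_{\frac{\partial}{\partial x^i}}\left(A_{X^*}\left(\frac{\partial}{\partial x^j}\right)\right) - A_{X^*}\left(D_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j}\right) = -D_{\frac{\partial}{\partial x^i}} D_{\frac{\partial}{\partial x^j}}\left(\sum_k \chi^k \frac{\partial}{\partial x^k}\right) = -\sum_k \frac{\partial^2 \chi^k}{\partial x^i \partial x^j} \frac{\partial}{\partial x^k}$$
But as $D$ is locally flat and $X^*$ is an infinitesimal affine transformation with respect to $D$:
$$D_{\frac{\partial}{\partial x^i}} (A_{X^*}) = 0 \ \Rightarrow\ \frac{\partial^2 \chi^k}{\partial x^i \partial x^j} = 0$$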
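The Koszul form and the canonical bilinear form are given by:
$$\alpha = \sum_i \frac{\partial \log \Phi}{\partial x^i} dx^i = D \log \Phi$$
$$D\alpha = \sum_{i,j} \frac{\partial^2 \log \Phi}{\partial x^i \partial x^j} dx^i dx^j = Dd \log \Phi$$
$$L_{X^*} \alpha = L_{X^*} D \log \Phi = D L_{X^*} \log \Phi = D X^* \log \Phi = -D\left(\sum_j \frac{\partial \chi^j}{\partial x^j}\right) = -\sum_{i,j} \frac{\partial^2 \chi^j}{\partial x^i \partial x^j} dx^i = 0$$
Then, $L_{X^*} \alpha = 0 \quad \forall X \in \mathfrak{g}$.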
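By using $X^* \log \Phi = -\sum_j \frac{\partial \chi^j}{\partial x^j}$, we can obtain:
$$\alpha(X^*) = (D \log \Phi)(X^*) = X^* \log \Phi = -\sum_j \frac{\partial \chi^j}{\partial x^j}$$
By using $A_{X^*} Y^* = -D_{Y^*} X^*$, we can develop:
$$A_{X^*}\left(\frac{\partial}{\partial x^j}\right) = -D_{\frac{\partial}{\partial x^j}} X^* = -\sum_i \frac{\partial \chi^i}{\partial x^j} \frac{\partial}{\partial x^i}$$
As $f(X) = A_{X^*, o}$ and $q(X) = X^*_o$:
$$Tr(f(X)) = Tr(A_{X^*, o}) = -\sum_i \frac{\partial \chi^i}{\partial x^i}(o) = \alpha(X^*_o) = \alpha_o(q(X))$$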
If we use that $L_{X^*} \alpha = 0 \quad \forall X \in \mathfrak{g}$, then we obtain:
$$(D\alpha)(X^*, Y^*) = (D_{Y^*} \alpha)(X^*) = -(A_{Y^*} \alpha)(X^*) = -A_{Y^*}(\alpha(X^*)) + \alpha(A_{Y^*} X^*) = \alpha(A_{Y^*} X^*)$$
$$D\alpha_o(q(X), q(Y)) = \alpha_o(f(Y)q(X))$$
To summarize the result proved by Jean-Louis Koszul, if $\alpha_o$ and $D\alpha_o$ are the values of $\alpha$ and $D\alpha$ at $o$, then:
$$\alpha_o(q(X)) = Tr(f(X)) \quad \forall X \in \mathfrak{g}$$
$$D\alpha_o(q(X), q(Y)) = \langle q(X), q(Y) \rangle_o = \alpha_o(f(X)q(Y)) \quad \forall X, Y \in \mathfrak{g}$$
Jean-Louis Koszul has also proved that the inner product $\langle \cdot, \cdot \rangle$ on $V$, given by the Riemannian metric $g_{ij}$, satisfies the following condition:
$$\langle f(X)q(Y), q(Z) \rangle + \langle q(Y), f(X)q(Z) \rangle = \langle f(Y)q(X), q(Z) \rangle + \langle q(X), f(Y)q(Z) \rangle$$
To make the link with the Souriau model of thermodynamics, the first Koszul form $\alpha = D \log \Phi = Tr(f(X))$ will play the role of the geometric heat $Q$, and the second Koszul form $D\alpha = Dd \log \Phi = \langle q(X), q(Y) \rangle_o$ will be the equivalent of the Souriau-Fisher metric, which is $G$-invariant.
Koszul theory is wider and integrates “information geometry” in its corpus. Koszul [117,118,119,120,121,122,123,124] has proved general results, for example: on a complex homogeneous space, an invariant volume defines, together with the complex structure, an invariant Hermitian form. If this space is a bounded domain, then this Hermitian form is positive definite and coincides with the classical Bergman metric of this domain. During his stay at the Institute for Advanced Study in Princeton, Koszul [117,118,119,120,121,122,123,124] also demonstrated the converse for a class of complex homogeneous spaces, defined by open orbits of complex affine transformation groups. Koszul and Vey [137,138] have also developed extended results with the following theorem for connected Hessian manifolds:
Theorem 3 (Koszul-Vey Theorem).
Let M be a connected hessian manifold with hessian metric g . Suppose that M admits a closed 1-form α such that D α = g and there exists a group G of affine automorphisms of M preserving α :
  • If M / G is quasi-compact, then the universal covering manifold of M is affinely isomorphic to a convex domain Ω of an affine space not containing any full straight line.
  • If M / G is compact, then Ω is a sharp convex cone.
On this basis, Koszul has given a Lie group construction of a homogeneous cone that has been developed and applied in information geometry by Shima and Boyom in the framework of Hessian geometry. The results of Koszul are also fundamental in the framework of Souriau thermodynamics.

7. Souriau Lie Group Model and Koszul Hessian Geometry Applied in the Context of Information Geometry for Multivariate Gaussian Densities

We will shed light on the Souriau model through Koszul Hessian geometry applied in information geometry [117,118,119,120,121,122,123,124], recently studied in [3,9,139]. We have previously shown that information geometry could be founded on the notion of the Koszul-Vinberg characteristic function $\psi_\Omega(x) = \int_{\Omega^*} e^{-\langle x, \xi \rangle} d\xi$, $\forall x \in \Omega$, where $\Omega$ is a convex cone and $\Omega^*$ the dual cone with respect to the Cartan-Killing inner product $\langle x, y \rangle = -B(x, \theta(y))$, invariant by automorphisms of $\Omega$, with $B(\cdot, \cdot)$ the Killing form and $\theta(\cdot)$ the Cartan involution. We can develop the Koszul characteristic function:
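$$\psi_\Omega(x + \lambda u) = \psi_\Omega(x) - \lambda \langle x^*, u \rangle + \frac{\lambda^2}{2} \langle K(x)u, u \rangle + \ldots$$
$$\text{with}\quad x^* = \frac{d\Phi(x)}{dx},\quad \Phi(x) = -\log \psi_\Omega(x) \quad\text{and}\quad K(x) = \frac{d^2\Phi(x)}{dx^2}$$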
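This characteristic function is the cornerstone of the modern concept of information geometry, defining the Koszul density as the solution of the maximum Koszul-Shannon entropy problem [140]:
$$\underset{p}{Max}\left[-\int_{\Omega^*} p_{\hat{\xi}}(\xi) \log p_{\hat{\xi}}(\xi) \cdot d\xi\right] \quad\text{such that}\quad \int_{\Omega^*} p_{\hat{\xi}}(\xi) d\xi = 1 \quad\text{and}\quad \int_{\Omega^*} \xi \cdot p_{\hat{\xi}}(\xi) d\xi = \hat{\xi}$$
$$p_{\hat{\xi}}(\xi) = \frac{e^{-\langle \Theta^{-1}(\hat{\xi}), \xi \rangle}}{\int_{\Omega^*} e^{-\langle \Theta^{-1}(\hat{\xi}), \xi \rangle} \cdot d\xi},\quad \hat{\xi} = \Theta(\beta) = \frac{\partial \Phi(\beta)}{\partial \beta} \quad\text{where}\quad \Phi(\beta) = -\log \psi_\Omega(\beta)$$
$$\psi_\Omega(\beta) = \int_{\Omega^*} e^{-\langle \beta, \xi \rangle} d\xi,\quad S(\hat{\xi}) = -\int_{\Omega^*} p_{\hat{\xi}}(\xi) \log p_{\hat{\xi}}(\xi) \cdot d\xi \quad\text{and}\quad \beta = \Theta^{-1}(\hat{\xi}) \ \Rightarrow\ S(\hat{\xi}) = \langle \hat{\xi}, \beta \rangle - \Phi(\beta)$$
This last relation is a Legendre transform between the logarithm of the characteristic function and the entropy:
$$\begin{cases} \log p_{\hat{\xi}}(\xi) = -\langle \xi, \beta \rangle + \Phi(\beta) \\ S(\hat{\xi}) = -\int_{\Omega^*} p_{\hat{\xi}}(\xi) \cdot \log p_{\hat{\xi}}(\xi) \cdot d\xi = -E\left[\log p_{\hat{\xi}}(\xi)\right] \end{cases} \ \Rightarrow\ S(\hat{\xi}) = \langle E[\xi], \beta \rangle - \Phi(\beta) = \langle \hat{\xi}, \beta \rangle - \Phi(\beta)$$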
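The inversion $\Theta^{-1}(\hat{\xi})$ is given by the Legendre transform, based on the property that the Koszul-Shannon entropy is given by the Legendre transform of minus the logarithm of the characteristic function:
$$S(\hat{\xi}) = \langle \beta, \hat{\xi} \rangle - \Phi(\beta) \quad\text{with}\quad \Phi(\beta) = -\log \int_{\Omega^*} e^{-\langle \xi, \beta \rangle} d\xi \quad \forall \beta \in \Omega \quad\text{and}\quad \forall \xi, \hat{\xi} \in \Omega^*$$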
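The Legendre relation can be checked on the simplest cone. The sketch below is an illustration under stated assumptions, not an example from the text: it takes $\Omega = \Omega^* = (0, \infty)$ with $\langle \beta, \xi \rangle = \beta\xi$, so that $\psi_\Omega(\beta) = 1/\beta$ and the Koszul density is the exponential law of rate $\beta$. The integrals are approximated by a composite Simpson rule on a truncated interval. The value of `beta`, the cut-off `upper` and the helper `simpson` are arbitrary or hypothetical choices.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

beta = 2.5
upper = 40.0 / beta   # truncation of the half-line; the neglected tail is about exp(-40)

# Characteristic function psi(beta) and potential Phi = -log psi.
psi = simpson(lambda xi: math.exp(-beta * xi), 0.0, upper)
Phi = -math.log(psi)

def density(xi):
    """Koszul density p(xi) = exp(-beta * xi) / psi(beta)."""
    return math.exp(-beta * xi) / psi

def log_density(xi):
    """log p(xi) = -beta * xi + Phi(beta)."""
    return -beta * xi + Phi

xi_hat = simpson(lambda xi: xi * density(xi), 0.0, upper)                   # mean E[xi]
entropy = simpson(lambda xi: -density(xi) * log_density(xi), 0.0, upper)   # Shannon entropy
legendre = xi_hat * beta - Phi                                             # Legendre side
```

Here $\hat{\xi} = 1/\beta = \partial\Phi/\partial\beta$ and both sides equal $1 - \log\beta$, the entropy of the exponential law.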
We can observe the fundamental property that $E[S(\xi)] = S(E[\xi])$, $\xi \in \Omega^*$, and also, as observed by Maurice Fréchet, that “distinguished functions” (densities with estimator reaching the Fréchet-Darmois bound) are solutions of the equation introduced by Alexis Clairaut in 1734 [141], as illustrated in Figure 8:
$$S(\hat{\xi}) = \left\langle \Theta^{-1}(\hat{\xi}), \hat{\xi} \right\rangle - \Phi\left[\Theta^{-1}(\hat{\xi})\right] \quad \forall \hat{\xi} \in \{\Theta(\beta) \ /\ \beta \in \Omega\}$$
Details of Fréchet’s elaboration of this Clairaut(-Legendre) equation for “distinguished functions” are given in Appendix A, and other elements are available in Fréchet’s papers [141,142,143,144].
In this structure, the Fisher metric I ( x ) makes appear naturally a Koszul hessian geometry [145,146], if we observe that
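$$\begin{aligned}
\log p_{\hat\xi}(\xi) &= -\langle \xi, \beta \rangle + \Phi(\beta) \\
S(\hat\xi) &= -\int_{\Omega^*} p_{\hat\xi}(\xi) \log p_{\hat\xi}(\xi) \, d\xi = -E[\log p_{\hat\xi}(\xi)] \\
S(\hat\xi) &= \langle E[\xi], \beta \rangle - \Phi(\beta) = \langle \hat\xi, \beta \rangle - \Phi(\beta)
\end{aligned}$$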
Then we can recover the relation with Fisher metric:
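$$\begin{aligned}
I(\beta) &= -E\left[ \frac{\partial^2 \log p_\beta(\xi)}{\partial \beta^2} \right] = -E\left[ \frac{\partial^2 \left( -\langle \xi, \beta \rangle + \Phi(\beta) \right)}{\partial \beta^2} \right] = -\frac{\partial^2 \Phi(\beta)}{\partial \beta^2} \\
\hat\xi &= \frac{\partial \Phi(\beta)}{\partial \beta} \\
I(\beta) &= E\left[ \frac{\partial \log p_\beta(\xi)}{\partial \beta} \frac{\partial \log p_\beta(\xi)}{\partial \beta}^T \right] = E\left[ (\xi - \hat\xi)(\xi - \hat\xi)^T \right] = E[\xi^2] - E[\xi]^2 = Var(\xi)
\end{aligned}$$
with the Crouzeix relation established in 1977 [147,148], $\frac{\partial^2 \Phi}{\partial \beta^2} = \left[ \frac{\partial^2 S}{\partial \hat\xi^2} \right]^{-1}$, giving the dual metric in the dual space, where the entropy $S$ and the (minus) logarithm of the characteristic function, $\Phi$, are dual potential functions.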
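The identities above can be checked numerically on a one-dimensional toy case. The sketch below assumes the cone $\Omega = \Omega^* = (0, +\infty)$ with $\langle \beta, \xi \rangle = \beta \xi$, so that $\psi_\Omega(\beta) = 1/\beta$ and the Koszul density is the exponential law $\beta e^{-\beta \xi}$; this example and the helper names are illustrative choices, not taken from the article.

```python
import numpy as np

# Hypothetical 1-D example: Omega = Omega* = (0, inf), <beta, xi> = beta * xi.
def phi(beta):
    # Phi(beta) = -log psi(beta), with psi(beta) = int_0^inf exp(-beta*xi) dxi = 1/beta
    return np.log(beta)

def entropy(xi_hat):
    # Legendre transform S(xi_hat) = xi_hat*beta - Phi(beta), evaluated at beta = 1/xi_hat
    beta = 1.0 / xi_hat
    return xi_hat * beta - phi(beta)

def second_derivative(f, x, h=1e-4):
    # Central finite-difference approximation of f''(x)
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

beta = 2.5
xi_hat = 1.0 / beta                       # mean of the density beta*exp(-beta*xi)
variance = 1.0 / beta**2                  # variance of the same density
fisher = -second_derivative(phi, beta)    # I(beta) = -d^2 Phi / d beta^2
hess_phi = second_derivative(phi, beta)
hess_s = second_derivative(entropy, xi_hat)
```

In this toy case the Fisher information coincides with the variance, and the two Hessians are inverse to each other, as the Crouzeix relation states.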
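The first metric of information geometry [149,150], the Fisher metric, is given by the Hessian of the logarithm of the characteristic function:
$$I(\beta) = -E\left[ \frac{\partial^2 \log p_\beta(\xi)}{\partial \beta^2} \right] = -\frac{\partial^2 \Phi(\beta)}{\partial \beta^2} = \frac{\partial^2 \log \psi_\Omega(\beta)}{\partial \beta^2}$$
$$ds_g^2 = d\beta^T I(\beta) \, d\beta = \sum_{ij} g_{ij} \, d\beta_i \, d\beta_j \quad \text{with} \quad g_{ij} = [I(\beta)]_{ij}$$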
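The second metric of information geometry is given by the Hessian of the Shannon entropy:
$$\frac{\partial^2 S(\hat\xi)}{\partial \hat\xi^2} = \left[ \frac{\partial^2 \Phi(\beta)}{\partial \beta^2} \right]^{-1} \quad \text{with} \quad S(\hat\xi) = \langle \hat\xi, \beta \rangle - \Phi(\beta)$$
$$ds_h^2 = d\hat\xi^T \left[ \frac{\partial^2 S(\hat\xi)}{\partial \hat\xi^2} \right] d\hat\xi = \sum_{ij} h_{ij} \, d\hat\xi_i \, d\hat\xi_j \quad \text{with} \quad h_{ij} = \left[ \frac{\partial^2 S(\hat\xi)}{\partial \hat\xi^2} \right]_{ij}$$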
Both metrics will provide the same distance:
$$ds_g^2 = -ds_h^2$$
From the Cartan inner product, we can generate logarithm of the Koszul characteristic function, and its Legendre transform to define Koszul entropy, Koszul density and Koszul metric, as explained in the following Figure 9:
This information geometry has been intensively studied for structured matrices [151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166] and in statistics [167] and is linked to the seminal work of Siegel [168] on symmetric bounded domains.
We can apply this Koszul geometry framework to cones of symmetric positive definite matrices. Let the inner product $\langle \eta, \xi \rangle = Tr(\eta^T \xi)$, $\forall \eta, \xi \in Sym(n)$, be given by the Cartan-Killing form, and let $\Omega$ be the set of symmetric positive definite matrices, which is an open convex cone and is self-dual, $\Omega^* = \Omega$.
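$$\begin{aligned}
\langle \eta, \xi \rangle &= Tr(\eta^T \xi), \quad \forall \eta, \xi \in Sym(n) \\
\psi_\Omega(\beta) &= \int_{\Omega^*} e^{-\langle \beta, \xi \rangle} \, d\xi = \det(\beta)^{-\frac{n+1}{2}} \, \psi_\Omega(I_d) \\
\hat\xi &= \frac{\partial \Phi(\beta)}{\partial \beta} = \frac{\partial \left( -\log \psi_\Omega(\beta) \right)}{\partial \beta} = \frac{n+1}{2} \beta^{-1}
\end{aligned}$$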
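$$p_{\hat\xi}(\xi) = e^{-\langle \Theta^{-1}(\hat\xi), \xi \rangle + \Phi(\Theta^{-1}(\hat\xi))} = \psi_\Omega(I_d)^{-1} \cdot \left[ \det\left( \alpha \hat\xi^{-1} \right) \right]^{\alpha} \cdot e^{-Tr(\alpha \hat\xi^{-1} \xi)} \quad \text{with} \quad \alpha = \frac{n+1}{2}$$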
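The moment relation $\hat\xi = \frac{n+1}{2} \beta^{-1}$ on the cone of symmetric positive definite matrices can be checked with a directional derivative. This is a minimal sketch: the additive constant $\log \psi_\Omega(I_d)$ is dropped because it does not affect derivatives, and the test point `beta` and direction `U` are arbitrary choices made here.

```python
import numpy as np

def phi(beta):
    # Phi(beta) = -log psi(beta) = (n+1)/2 * log det(beta) - log psi(Id);
    # the constant log psi(Id) is omitted (it does not change derivatives).
    n = beta.shape[0]
    _, logdet = np.linalg.slogdet(beta)
    return 0.5 * (n + 1) * logdet

rng = np.random.default_rng(0)
n = 3
B = rng.standard_normal((n, n))
beta = B @ B.T + n * np.eye(n)            # a symmetric positive definite point of the cone
U = rng.standard_normal((n, n))
U = 0.5 * (U + U.T)                       # a symmetric direction

xi_hat = 0.5 * (n + 1) * np.linalg.inv(beta)   # moment given by the closed form
h = 1e-6
directional = (phi(beta + h * U) - phi(beta - h * U)) / (2 * h)
pairing = np.trace(xi_hat.T @ U)          # <xi_hat, U> = Tr(xi_hat^T U)
```

The derivative of $\Phi$ along `U` should agree with the pairing $\langle \hat\xi, U \rangle$ up to finite-difference error.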
We will in the following illustrate information geometry for multivariate Gaussian density [169]:
$$p_{\hat\xi}(\xi) = \frac{1}{(2\pi)^{n/2} \det(R)^{1/2}} \, e^{-\frac{1}{2} (z - m)^T R^{-1} (z - m)}$$
If we develop:
$$\frac{1}{2} (z - m)^T R^{-1} (z - m) = \frac{1}{2} \left[ z^T R^{-1} z - m^T R^{-1} z - z^T R^{-1} m + m^T R^{-1} m \right] = \frac{1}{2} z^T R^{-1} z - m^T R^{-1} z + \frac{1}{2} m^T R^{-1} m$$
We can write the density as a Gibbs density:
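$$p_{\hat\xi}(\xi) = \frac{1}{(2\pi)^{n/2} \det(R)^{1/2} \, e^{\frac{1}{2} m^T R^{-1} m}} \, e^{-\left[ -m^T R^{-1} z + \frac{1}{2} z^T R^{-1} z \right]} = \frac{1}{Z} e^{-\langle \xi, \beta \rangle}$$
$$\xi = \begin{bmatrix} z \\ z z^T \end{bmatrix} \quad \text{and} \quad \beta = \begin{bmatrix} -R^{-1} m \\ \frac{1}{2} R^{-1} \end{bmatrix} = \begin{bmatrix} a \\ H \end{bmatrix} \quad \text{with} \quad \langle \xi, \beta \rangle = a^T z + z^T H z = Tr\left[ z a^T + H^T z z^T \right]$$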
We can then rewrite density with canonical variables:
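$$p_{\hat\xi}(\xi) = \frac{1}{\int_{\Omega^*} e^{-\langle \xi, \beta \rangle} \, d\xi} \, e^{-\langle \xi, \beta \rangle} = \frac{1}{Z} e^{-\langle \xi, \beta \rangle} \quad \text{with} \quad \log(Z) = \frac{n}{2} \log(2\pi) + \frac{1}{2} \log \det(R) + \frac{1}{2} m^T R^{-1} m$$
$$\xi = \begin{bmatrix} z \\ z z^T \end{bmatrix}, \quad \hat\xi = \begin{bmatrix} E[z] \\ E[z z^T] \end{bmatrix} = \begin{bmatrix} m \\ R + m m^T \end{bmatrix}, \quad \beta = \begin{bmatrix} a \\ H \end{bmatrix} = \begin{bmatrix} -R^{-1} m \\ \frac{1}{2} R^{-1} \end{bmatrix} \quad \text{with} \quad \langle \xi, \beta \rangle = Tr\left[ z a^T + H^T z z^T \right]$$
$$R = E\left[ (z - m)(z - m)^T \right] = E\left[ z z^T - m z^T - z m^T + m m^T \right] = E[z z^T] - m m^T$$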
The first potential function (free energy/logarithm of characteristic function) is given by:
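$$\psi_\Omega(\beta) = \int_{\Omega^*} e^{-\langle \xi, \beta \rangle} \, d\xi \quad \text{and} \quad \Phi(\beta) = -\log \psi_\Omega(\beta) = \frac{1}{2} \left[ -\frac{1}{2} Tr\left[ H^{-1} a a^T \right] + \log\left[ 2^n \det H \right] - n \log(2\pi) \right]$$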
We verify the relation between the first potential function and moment:
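$$\frac{\partial \Phi(\beta)}{\partial \beta} = \frac{\partial \left[ -\log \psi_\Omega(\beta) \right]}{\partial \beta} = \int_{\Omega^*} \xi \, \frac{e^{-\langle \xi, \beta \rangle}}{\int_{\Omega^*} e^{-\langle \xi, \beta \rangle} \, d\xi} \, d\xi = \int_{\Omega^*} \xi \, p_{\hat\xi}(\xi) \, d\xi = \hat\xi$$
$$\frac{\partial \Phi(\beta)}{\partial \beta} = \begin{bmatrix} \frac{\partial \Phi(\beta)}{\partial a} \\ \frac{\partial \Phi(\beta)}{\partial H} \end{bmatrix} = \begin{bmatrix} m \\ R + m m^T \end{bmatrix} = \hat\xi$$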
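The relation between the first potential and the moments can be verified numerically. The sketch below evaluates $\Phi$ as $-\log Z$ of the Gaussian normalisation, written in the canonical variables $(a, H)$, and compares finite-difference derivatives with $m$ and $R + m m^T$; the numerical values of `m`, `R` and the direction `U` are arbitrary assumptions.

```python
import numpy as np

def natural_params(m, R):
    # a = -R^{-1} m and H = R^{-1} / 2
    R_inv = np.linalg.inv(R)
    return -R_inv @ m, 0.5 * R_inv

def phi(a, H):
    # Phi = -log Z with Z = int exp(-(a^T z + z^T H z)) dz (Gaussian integral)
    n = a.shape[0]
    _, logdet_H = np.linalg.slogdet(H)
    return 0.5 * (-0.5 * a @ np.linalg.solve(H, a)
                  + n * np.log(2.0) + logdet_H - n * np.log(2.0 * np.pi))

m = np.array([1.0, -0.5])
R = np.array([[2.0, 0.3], [0.3, 1.0]])
n = m.shape[0]
a, H = natural_params(m, R)

# -log Z computed directly from (m, R)
minus_log_Z = -(0.5 * n * np.log(2 * np.pi)
                + 0.5 * np.linalg.slogdet(R)[1]
                + 0.5 * m @ np.linalg.solve(R, m))

h = 1e-6
# Gradient with respect to a by central differences
grad_a = np.array([(phi(a + h * e, H) - phi(a - h * e, H)) / (2 * h)
                   for e in np.eye(n)])
# Directional derivative with respect to H along a symmetric direction U
U = np.array([[0.4, 0.1], [0.1, -0.2]])
dir_H = (phi(a, H + h * U) - phi(a, H - h * U)) / (2 * h)
second_moment = R + np.outer(m, m)
```

Here `grad_a` should reproduce $m$, and `dir_H` should match $Tr[(R + m m^T) U]$, the pairing of the second-order moment with the direction `U`.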
The second potential function (Shannon entropy) is given as a Legendre transform of the first one:
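$$S(\hat\xi) = \langle \hat\xi, \beta \rangle - \Phi(\beta) \quad \text{with} \quad \frac{\partial \Phi(\beta)}{\partial \beta} = \hat\xi \quad \text{and} \quad \frac{\partial S(\hat\xi)}{\partial \hat\xi} = \beta$$
$$S(\hat\xi) = -\int_{\Omega^*} \frac{e^{-\langle \xi, \beta \rangle}}{\int_{\Omega^*} e^{-\langle \xi, \beta \rangle} \, d\xi} \log \frac{e^{-\langle \xi, \beta \rangle}}{\int_{\Omega^*} e^{-\langle \xi, \beta \rangle} \, d\xi} \, d\xi = -\int_{\Omega^*} p_{\hat\xi}(\xi) \log p_{\hat\xi}(\xi) \, d\xi$$
$$S(\hat\xi) = -\int_{\Omega^*} p_{\hat\xi}(\xi) \log p_{\hat\xi}(\xi) \, d\xi = \frac{1}{2} \left[ \log\left( 2^{-n} \det\left[ H^{-1} \right] \right) + n \log(2\pi \cdot e) \right] = \frac{1}{2} \left[ \log \det[R] + n \log(2\pi \cdot e) \right]$$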
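As a quick numerical illustration, the Legendre transform $\langle \hat\xi, \beta \rangle - \Phi(\beta)$ can be compared with the closed-form Gaussian entropy. The values of `m` and `R` below are arbitrary, and $\Phi$ is evaluated as $-\log Z$ from the Gaussian normalisation.

```python
import numpy as np

m = np.array([0.7, -1.2, 0.3])
R = np.array([[1.5, 0.2, 0.0],
              [0.2, 1.0, 0.1],
              [0.0, 0.1, 0.8]])
n = m.shape[0]
R_inv = np.linalg.inv(R)

# Canonical variables beta = (a, H) and moments xi_hat = (m, R + m m^T)
a, H = -R_inv @ m, 0.5 * R_inv
xi_hat_1, xi_hat_2 = m, R + np.outer(m, m)

# Pairing <xi_hat, beta> = a^T m + Tr(H^T (R + m m^T))
pairing = a @ xi_hat_1 + np.trace(H.T @ xi_hat_2)

# Phi(beta) = -log Z for the Gaussian density
log_det_R = np.linalg.slogdet(R)[1]
phi = -(0.5 * n * np.log(2 * np.pi) + 0.5 * log_det_R + 0.5 * m @ R_inv @ m)

entropy_legendre = pairing - phi
entropy_closed_form = 0.5 * (log_det_R + n * np.log(2 * np.pi * np.e))
```

The terms in $m$ cancel in the Legendre transform, which is why the entropy depends on $R$ only.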
This remark was made by Jean-Marie Souriau in his book [10] as early as 1969. He observed, as illustrated in Figure 10, that if we take a vector with tensor components $\xi = \begin{pmatrix} z \\ z \otimes z \end{pmatrix}$, the components of $\hat\xi$ provide the first- and second-order moments of the probability density $p_{\hat\xi}(\xi)$. He used the change of variable $z' = H^{1/2} z + H^{-1/2} a$ to compute the logarithm of the characteristic function $\Phi(\beta)$:
We can finally compute the metric from the matrix g i j :
$$ds^2 = \sum_{ij} g_{ij} \, d\theta_i \, d\theta_j = dm^T R^{-1} dm + \frac{1}{2} Tr\left[ \left( R^{-1} dR \right)^2 \right]$$
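One way to sanity-check this line element is to compare it with the second-order behaviour of the Kullback-Leibler divergence between two nearby Gaussian densities, which is known to be $\frac{1}{2} ds^2$ to leading order. The sketch below uses the standard closed form of the Gaussian Kullback-Leibler divergence; the function names, the base point and the perturbation `(dm, dR)` are illustrative assumptions.

```python
import numpy as np

def fisher_ds2(R, dm, dR):
    # ds^2 = dm^T R^{-1} dm + 1/2 Tr[(R^{-1} dR)^2]
    R_inv = np.linalg.inv(R)
    A = R_inv @ dR
    return dm @ R_inv @ dm + 0.5 * np.trace(A @ A)

def kl_gauss(m0, R0, m1, R1):
    # Kullback-Leibler divergence KL(N(m0, R0) || N(m1, R1))
    n = m0.shape[0]
    R1_inv = np.linalg.inv(R1)
    d = m1 - m0
    return 0.5 * (np.trace(R1_inv @ R0) + d @ R1_inv @ d - n
                  + np.linalg.slogdet(R1)[1] - np.linalg.slogdet(R0)[1])

m = np.array([0.5, -1.0])
R = np.array([[1.0, 0.4], [0.4, 2.0]])
dm = np.array([0.3, -0.2])
dR = np.array([[0.2, 0.1], [0.1, -0.3]])   # symmetric perturbation of R

eps = 1e-3
kl = kl_gauss(m, R, m + eps * dm, R + eps * dR)
# For small eps, KL is close to (eps^2 / 2) * ds^2, so this ratio is close to 1
ratio = 2.0 * kl / (eps**2 * fisher_ds2(R, dm, dR))
```

The ratio differs from 1 only by terms of order `eps`, so shrinking `eps` tightens the agreement until rounding error dominates.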
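and from the classical expression of the Euler-Lagrange equation:
$$\sum_{i=1}^{n} g_{ik} \ddot\theta_i + \sum_{i,j=1}^{n} \Gamma_{ijk} \dot\theta_i \dot\theta_j = 0, \quad k = 1, \ldots, n \quad \text{with} \quad \Gamma_{ijk} = \frac{1}{2} \left[ \frac{\partial g_{jk}}{\partial \theta_i} + \frac{\partial g_{ik}}{\partial \theta_j} - \frac{\partial g_{ij}}{\partial \theta_k} \right]$$
which is explicitly given by [170]:
$$\begin{cases} \ddot{R} + \dot{m} \dot{m}^T - \dot{R} R^{-1} \dot{R} = 0 \\ \ddot{m} - \dot{R} R^{-1} \dot{m} = 0 \end{cases}$$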
This Euler-Lagrange equation cannot be integrated directly. We will see that Lie group theory provides a new reduced equation, the Euler-Poincaré equation, using Souriau's theorem.
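Even without a closed-form solution, the geodesic system can be explored numerically. The sketch below integrates it with a classical fourth-order Runge-Kutta scheme from assumed initial conditions and monitors the squared speed $\dot{m}^T R^{-1} \dot{m} + \frac{1}{2} Tr[(R^{-1} \dot{R})^2]$, which is constant along a geodesic. The integrator, step size and initial data are choices made here for illustration, not the method of the article.

```python
import numpy as np

def acceleration(m, R, dm, dR):
    # Geodesic system solved for the second derivatives:
    #   m'' = R' R^{-1} m'   and   R'' = R' R^{-1} R' - m' m'^T
    R_inv = np.linalg.inv(R)
    return dR @ R_inv @ dm, dR @ R_inv @ dR - np.outer(dm, dm)

def speed2(R, dm, dR):
    # Squared speed in the Fisher metric
    R_inv = np.linalg.inv(R)
    A = R_inv @ dR
    return dm @ R_inv @ dm + 0.5 * np.trace(A @ A)

def rk4_step(state, h):
    # state = (m, R, dm, dR); one Runge-Kutta step of size h
    def f(s):
        m, R, dm, dR = s
        ddm, ddR = acceleration(m, R, dm, dR)
        return (dm, dR, ddm, ddR)
    def add(s, k, c):
        return tuple(x + c * y for x, y in zip(s, k))
    k1 = f(state)
    k2 = f(add(state, k1, h / 2))
    k3 = f(add(state, k2, h / 2))
    k4 = f(add(state, k3, h))
    return tuple(x + h / 6 * (p + 2 * q + 2 * r + w)
                 for x, p, q, r, w in zip(state, k1, k2, k3, k4))

# Assumed initial point (m, R) = (0, I) and initial velocity (dm, dR)
state = (np.zeros(2), np.eye(2),
         np.array([0.5, -0.2]), np.array([[0.3, 0.1], [0.1, -0.2]]))
v0 = speed2(state[1], state[2], state[3])
for _ in range(1000):
    state = rk4_step(state, 1e-3)
v1 = speed2(state[1], state[2], state[3])
```

Conservation of the squared speed is a convenient consistency check on both the equations and the integrator.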
We refer to the book by Deza, which gives a survey of distances and metric spaces [171].
The case of natural exponential families that are invariant under an affine group has been studied by Casalis (in a 1999 paper and in her Ph.D. thesis) [172,173,174,175,176,177,178] and by Letac [179,180,181]. We give the details of Casalis’ development in Appendix C. Barndorff-Nielsen has also studied transformation models for exponential families [182,183,184,185,186]. In this section, we will only consider the case of multivariate Gaussian densities.

8. Affine Group Action for Multivariate Gaussian Densities and Souriau’s Moment Map: Computation of Geodesics by Geodesic Shooting

To more deeply understand Koszul and Souriau Lie group models of information geometry, we will illustrate their tools for multivariate Gaussian densities.
Consider the general linear group $GL(n)$ consisting of the invertible $n \times n$ matrices, which is a topological group acting linearly on $\mathbb{R}^n$ by:
$$GL(n) \times \mathbb{R}^n \to \mathbb{R}^n, \quad (A, x) \mapsto Ax$$
The group $GL(n)$ is a Lie group and a subgroup of the general affine group $GA(n)$, composed of all pairs $(A, \upsilon)$ where $A \in GL(n)$ and $\upsilon \in \mathbb{R}^n$, with the group operation given by:
$$(A_1, \upsilon_1)(A_2, \upsilon_2) = (A_1 A_2, A_1 \upsilon_2 + \upsilon_1)$$
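The group operation above is the composition of affine maps, and it coincides with the product of $(n+1) \times (n+1)$ homogeneous matrices. The following sketch checks this on two arbitrary elements; the helper names `compose` and `to_matrix` are hypothetical.

```python
import numpy as np

def compose(g1, g2):
    # (A1, v1)(A2, v2) = (A1 A2, A1 v2 + v1)
    (A1, v1), (A2, v2) = g1, g2
    return A1 @ A2, A1 @ v2 + v1

def to_matrix(g):
    # Homogeneous representation [[A, v], [0, 1]]
    A, v = g
    n = v.shape[0]
    M = np.eye(n + 1)
    M[:n, :n] = A
    M[:n, n] = v
    return M

g1 = (np.array([[2.0, 1.0], [0.0, 1.0]]), np.array([1.0, -1.0]))
g2 = (np.array([[1.0, 0.0], [3.0, 1.0]]), np.array([0.5, 2.0]))

lhs = to_matrix(compose(g1, g2))          # group law, then embedding
rhs = to_matrix(g1) @ to_matrix(g2)       # embedding, then matrix product
reversed_product = to_matrix(compose(g2, g1))
```

The two sides agree, while swapping the factors gives a different element, which illustrates that the group is not commutative.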
$GL(n)$ is an open subset of $\mathbb{R}^{n^2}$, and may be considered as an $n^2$-dimensional differentiable manifold with the same differentiable structure as $\mathbb{R}^{n^2}$. Multiplication and inversion are infinitely differentiable mappings. Consider the vector space $gl(n)$ of real $n \times n$ matrices and the commutator product:
$$gl(n) \times gl(n) \to gl(n), \quad (A, B) \mapsto AB - BA = [A, B]$$
This is a Lie product making $gl(n)$ into a Lie algebra. The exponential map is then the mapping defined by:
$$\exp : gl(n) \to GL(n), \quad A \mapsto \exp(A) = \sum_{n=0}^{\infty} \frac{A^n}{n!}$$
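A small numerical sketch of the commutator and of the exponential series follows. The series is simply truncated after a fixed number of terms, which is adequate for small matrices of moderate norm but is not a robust general-purpose algorithm; the test matrices are arbitrary.

```python
import numpy as np

def commutator(A, B):
    # Lie product [A, B] = AB - BA on gl(n)
    return A @ B - B @ A

def expm_series(A, terms=30):
    # Truncated power series sum_{k < terms} A^k / k!
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

theta = 0.7
J = np.array([[0.0, -theta], [theta, 0.0]])        # skew-symmetric element of gl(2)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
E = expm_series(J)                                  # should be a rotation by theta

A = np.array([[1.0, 2.0], [0.0, -1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
C = commutator(A, B)
```

For the skew-symmetric generator `J`, the exponential is the rotation of angle `theta`, an invertible matrix as expected of an element of $GL(2)$.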
Restricting $A$ to have positive determinant, one obtains the positive general affine group $GA^+(n)$, which acts transitively on $\mathbb{R}^n$ by:
$$((A, \upsilon), x) \mapsto Ax + \upsilon$$
In the case of symmetric positive definite matrices $Sym^+(n)$, we can use the Cholesky decomposition:
$$R = L L^T$$
where $L$ is a lower triangular matrix with real and positive diagonal entries, and $L^T$ denotes the transpose of $L$, to define the square root of $R$.
Given a positive semidefinite matrix $R$, according to the spectral theorem, the continuous functional calculus can be applied to obtain a matrix $R^{1/2}$ such that $R^{1/2}$ is itself positive and $R^{1/2} R^{1/2} = R$. The operator $R^{1/2}$ is the unique non-negative square root of $R$.
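The two notions of square root can be compared with NumPy. The sketch below computes the Cholesky factor with `numpy.linalg.cholesky` and the symmetric square root from the spectral decomposition returned by `numpy.linalg.eigh`; the matrix `R` is an arbitrary positive definite example.

```python
import numpy as np

R = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])

# Cholesky factor: lower triangular with positive diagonal, R = L L^T
L = np.linalg.cholesky(R)

# Symmetric square root from the spectral decomposition R = V diag(w) V^T
eigvals, eigvecs = np.linalg.eigh(R)
R_half = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T
```

Both factors reproduce $R$, but they are different matrices: `L` is triangular, whereas `R_half` is symmetric positive definite.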
The class $N_n = \{ (\mu, \Sigma) \, / \, \mu \in \mathbb{R}^n, \Sigma \in Sym^+(n) \}$ of regular multivariate normal distributions, where $\mu$ is the mean vector and $\Sigma$ is the (symmetric positive definite) covariance matrix, is invariant under the transitive action of $GA(n)$. The induced action of $GA(n)$ on $\mathbb{R}^n \times Sym^+(n)$ is then given by:
$$GA(n) \times (\mathbb{R}^n \times Sym^+(n)) \to \mathbb{R}^n \times Sym^+(n), \quad ((A, \upsilon), (\mu, \Sigma)) \mapsto (A\mu + \upsilon, A \Sigma A^T)$$
and
$$GA(n) \times \mathbb{R}^n \to \mathbb{R}^n, \quad ((A, \upsilon), x) \mapsto Ax + \upsilon$$
As the isotropy group of $(0, I_n)$ is equal to $O(n)$, we can observe that:
$$N_n = GA(n) / O(n)$$
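The properties of the induced action can be illustrated numerically: compatibility with the group law, transitivity (any $(\mu, \Sigma)$ is reached from $(0, I_n)$), and the fact that orthogonal matrices fix $(0, I_n)$. The helper names and numerical values below are assumptions made for this sketch.

```python
import numpy as np

def compose(g1, g2):
    # Group law of GA(n): (A1, v1)(A2, v2) = (A1 A2, A1 v2 + v1)
    (A1, v1), (A2, v2) = g1, g2
    return A1 @ A2, A1 @ v2 + v1

def act(g, p):
    # Induced action on (mu, Sigma): (A mu + v, A Sigma A^T)
    A, v = g
    mu, Sigma = p
    return A @ mu + v, A @ Sigma @ A.T

g1 = (np.array([[2.0, 1.0], [0.0, 1.0]]), np.array([1.0, -1.0]))
g2 = (np.array([[1.0, 0.0], [3.0, 1.0]]), np.array([0.5, 2.0]))
mu = np.array([0.3, -0.7])
Sigma = np.array([[1.0, 0.4], [0.4, 2.0]])

# Compatibility with the group law: g1.(g2.p) = (g1 g2).p
left = act(g1, act(g2, (mu, Sigma)))
right = act(compose(g1, g2), (mu, Sigma))

# Transitivity: (Sigma^{1/2}, mu) sends (0, I) to (mu, Sigma)
w, V = np.linalg.eigh(Sigma)
Sigma_half = V @ np.diag(np.sqrt(w)) @ V.T
image = act((Sigma_half, mu), (np.zeros(2), np.eye(2)))

# Isotropy: an orthogonal matrix with zero translation fixes (0, I)
t = 0.9
Q = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
fixed = act((Q, np.zeros(2)), (np.zeros(2), np.eye(2)))
```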
$N_n$ is an open subset of the vector space $T_n = \{ (\eta, \Omega) \, / \, \eta \in \mathbb{R}^n, \Omega \in Sym(n) \}$ and is a differentiable manifold, where the tangent space at any point may be identified with $T_n$.
The Fisher information defines a metric giving $N_n$ a Riemannian manifold structure. The inner product of two tangent vectors $(\eta_1, \Omega_1) \in T_n$, $(\eta_2, \Omega_2) \in T_n$ at the point $(\mu, \Sigma) \in N_n$ is given by:
$$g_{(\mu, \Sigma)}\left( (\eta_1, \Omega_1), (\eta_2, \Omega_2) \right) = \eta_1^T \Sigma^{-1} \eta_2 + \frac{1}{2} Tr\left( \Sigma^{-1} \Omega_1 \Sigma^{-1} \Omega_2 \right)$$
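A well-known property of this inner product is its invariance under the affine action: since the action is affine in $(\mu, \Sigma)$, tangent vectors transform as $(\eta, \Omega) \mapsto (A\eta, A \Omega A^T)$, and the value of $g$ is unchanged. The sketch below checks this on arbitrary data; the function names and values are illustrative.

```python
import numpy as np

def fisher_inner(Sigma, t1, t2):
    # g((eta1, Om1), (eta2, Om2)) = eta1^T S^{-1} eta2 + 1/2 Tr(S^{-1} Om1 S^{-1} Om2)
    (eta1, Om1), (eta2, Om2) = t1, t2
    S_inv = np.linalg.inv(Sigma)
    return eta1 @ S_inv @ eta2 + 0.5 * np.trace(S_inv @ Om1 @ S_inv @ Om2)

def push(A, t):
    # Differential of the action (mu, Sigma) -> (A mu + v, A Sigma A^T)
    eta, Om = t
    return A @ eta, A @ Om @ A.T

A = np.array([[2.0, 1.0], [-1.0, 1.5]])             # invertible linear part
Sigma = np.array([[1.0, 0.4], [0.4, 2.0]])
t1 = (np.array([0.3, -0.2]), np.array([[0.2, 0.1], [0.1, -0.3]]))
t2 = (np.array([-0.5, 0.4]), np.array([[0.0, 0.3], [0.3, 0.6]]))

before = fisher_inner(Sigma, t1, t2)
after = fisher_inner(A @ Sigma @ A.T, push(A, t1), push(A, t2))
```

The translation part $\upsilon$ does not appear because the inner product does not depend on $\mu$.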
Niels Christian Bang Jespersen has proved that the transformation models on $\mathbb{R}^n$ with parameter set $\mathbb{R}^n \times Sym^+(n)$ are exactly those of the form $p_{\mu, \Sigma} = f_{\mu, \Sigma} \lambda$, where $\lambda$ is the Lebesgue measure, $f_{\mu, \Sigma}(x) = h\left( (x - \mu)^T \Sigma^{-1} (x - \mu) \right) / \det(\Sigma)^{1/2}$, and $h : [0, +\infty[ \to \mathbb{R}^+$ is a continuous function with $\int_0^{+\infty} h(s) \, s^{\frac{n}{2} - 1} \, ds < +\infty$. Distributions with densities of this form are called elliptic distributions.
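To better understand these tools, we will consider $GA(n)$ as a sub-group of the affine group, which can be defined by a matrix Lie group $G_{aff}$ acting on multivariate Gaussian laws, as illustrated in Figure 11:
$$\begin{bmatrix} Y \\ 1 \end{bmatrix} = \begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ 1 \end{bmatrix} = \begin{bmatrix} R^{1/2} X + m \\ 1 \end{bmatrix}, \quad \begin{cases} (m, R) \in \mathbb{R}^n \times Sym^+(n) \\ M = \begin{bmatrix} R^{1/2} & m \\ 0 & 1 \end{bmatrix} \in G_{aff} \end{cases}$$
$$X \approx \mathcal{N}(0, I) \to Y \approx \mathcal{N}(m, R)$$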
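We can verify that the matrices $M$ form a Lie group with the classical properties: the product preserves the structure and is associative and non-commutative, and there is a neutral element:
$$\left. \begin{aligned}
M_1 \cdot M_2 &= \begin{bmatrix} R_1^{1/2} & m_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R_2^{1/2} & m_2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R_1^{1/2} R_2^{1/2} & R_1^{1/2} m_2 + m_1 \\ 0 & 1 \end{bmatrix} \\
M_2 \cdot M_1 &= \begin{bmatrix} R_2^{1/2} & m_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R_1^{1/2} & m_1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R_2^{1/2} R_1^{1/2} & R_2^{1/2} m_1 + m_2 \\ 0 & 1 \end{bmatrix}
\end{aligned} \right\} \Rightarrow \begin{cases}
M_1 \cdot M_2 \in G_{aff} \\
M_2 \cdot M_1 \in G_{aff} \\
M_1 \cdot M_2 \neq M_2 \cdot M_1 \\
M_1 \cdot (M_2 \cdot M_3) = (M_1 \cdot M_2) \cdot M_3 \\
M_1 \cdot I = M_1
\end{cases}$$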
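These properties are easy to check on concrete matrices. The sketch below builds elements $M$ from pairs $(m, R)$ using the symmetric square root of $R$; the helper names `sqrtm_spd` and `gaff` and the sample values are assumptions. The structure test only checks the block shape, an invertible upper-left block and a last row $(0, \ldots, 0, 1)$.

```python
import numpy as np

def sqrtm_spd(R):
    # Symmetric square root of a symmetric positive definite matrix
    w, V = np.linalg.eigh(R)
    return V @ np.diag(np.sqrt(w)) @ V.T

def gaff(m, R):
    # M = [[R^{1/2}, m], [0, 1]]
    n = m.shape[0]
    M = np.eye(n + 1)
    M[:n, :n] = sqrtm_spd(R)
    M[:n, n] = m
    return M

M1 = gaff(np.array([1.0, -1.0]), np.array([[2.0, 0.5], [0.5, 1.0]]))
M2 = gaff(np.array([0.5, 2.0]), np.array([[1.0, -0.3], [-0.3, 3.0]]))
M3 = gaff(np.array([-2.0, 0.1]), np.array([[1.5, 0.2], [0.2, 0.7]]))
I = np.eye(3)

P12 = M1 @ M2
P21 = M2 @ M1

# Image of the homogeneous vector [X, 1] with X = 0: the mean m1 is recovered
origin_image = M1 @ np.array([0.0, 0.0, 1.0])
```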
We can also observe that the inverse preserves the structure:
M = [ R 1 / 2 m 0 1 ] M R 1 = M L 1 = M 1 = [ R