Lie Group Statistics and Lie Group Machine Learning Based on Souriau Lie Groups Thermodynamics & Koszul-Souriau-Fisher Metric: New Entropy Definition as Generalized Casimir Invariant Function in Coadjoint Representation

Barbaresco, Frédéric

doi:10.3390/e22060642


Key Technology Domain PCC (Processing, Control & Cognition) Representative, Thales Land & Air Systems, Voie Pierre-Gilles de Gennes, F91470 Limours, France

Entropy 2020, 22(6), 642; https://doi.org/10.3390/e22060642

Submission received: 4 March 2020 / Revised: 31 May 2020 / Accepted: 2 June 2020 / Published: 9 June 2020

(This article belongs to the Special Issue Lie Group Machine Learning and Lie Group Structure Preserving Integrators)


Abstract

In 1969, Jean-Marie Souriau introduced a “Lie Groups Thermodynamics” in Statistical Mechanics in the framework of Geometric Mechanics. Souriau’s model considers the statistical mechanics of dynamic systems in their “space of evolution” associated to a homogeneous symplectic manifold by a Lagrange 2-form, and defines, in the case of non-null cohomology (non-equivariance of the coadjoint action on the moment map, with the appearance of an additional cocycle), a Gibbs density (of maximum entropy) that is covariant under the action of dynamic groups of physics (e.g., Galileo’s group in classical physics). Souriau Lie Group Thermodynamics was also addressed 30 years after Souriau by R.F. Streater in the framework of Quantum Physics by Information Geometry for some Lie algebras, but only in the case of null cohomology. Souriau’s method can then be applied to Lie groups to define a covariant maximum entropy density by Kirillov representation theory. We illustrate this method for homogeneous Siegel domains, and more especially for the Poincaré unit disk, by considering the SU(1,1) group coadjoint orbit and by using its Souriau moment map. For this case, the coadjoint action on the moment map is equivariant. For non-null cohomology, we give the case of the Lie group SE(2). Finally, we propose a new geometric definition of Entropy that can be built as a generalized Casimir invariant function in coadjoint representation, and of the Massieu characteristic function, dual of Entropy by Legendre transform, as a generalized Casimir invariant function in adjoint representation, where the Souriau cocycle is a measure of the lack of equivariance of the moment mapping.

Keywords:

Lie groups thermodynamics; Lie group machine learning; Kirillov representation theory; coadjoint orbits; moment map; covariant Gibbs density; maximum entropy density; Souriau-Fisher metric; generalized Casimir invariant function

Kirillov’s thesis, published in 1962, immediately attracted great interest… Moreover, many natural notions concerning representations can be interpreted geometrically in terms of coadjoint orbits: restriction to a subgroup, unitary induction, tensor product, Plancherel measure, the topology of the set of irreducible unitary representations… Kirillov quickly became convinced, and he convinced the mathematical community, that this “orbit method” should be applicable to groups far more general than nilpotent groups. He did not hesitate to tackle the case of arbitrary connected Lie groups. Of course, considerable difficulties arose immediately. Nevertheless, Kirillov indicated a way forward, which has since been widely used.
- Jacques Dixmier, Brèves remarques sur l’œuvre de A.A. Kirillov

One thus understands how Lagrange was able to develop the laws of the Mechanics of systems formed of solids without concerning himself with the temperature variations of these bodies, and Fourier to treat the temperature variations of these same solid bodies without concerning himself with their motion; how one can study the motion of the Earth, treated as a rigid solid, without worrying about the temperature of this body, and study the cooling of the terrestrial globe without worrying about its motion. Such independence between the problems belonging to Mechanics and the problems belonging to the Theory of heat no longer exists when the systems one deals with are no longer classical systems; if, for example, instead of regarding the Earth as a rigid solid of invariable state, one takes into account the changes of volume, shape, and physical and chemical state that accompany its cooling, one can no longer separate the problem of the Earth’s motion from the problem of terrestrial cooling. … It is known that this form of supplementary relations had been introduced by Newton and the geometers of the eighteenth century in the theory of sound. These considerations show that the questions belonging to Thermodynamics must have drawn the attention of physicists as soon as one wished to approach the study of systems other than classical systems; and, in fact, it was the theory of the propagation of sound in air that prompted Laplace to create Thermodynamics.
- P. Duhem, L’intégrale des forces vives en thermodynamique, JMPA 4:5-19, 1898 [1,2,3,4]

Under this aspiration, physics, which was at first a science of “agents”, must become a science of “media”. It is by addressing new media that one can hope to push the diversification and analysis of phenomena far enough to bring about their fine and complex, truly intrinsic, geometrization… No doubt, reality has not yet delivered all its models to us, but we already know that it cannot possess a greater number of them than that assigned to it by the mathematical theory of groups
- Gaston Bachelard, Etude sur l’Evolution d’un problème de Physique –La propagation thermique dans les solides, 1928

1. Introduction

The preceding quotes (originally in French) by the mathematician Jacques Dixmier, the physicist Pierre Duhem, and the philosopher Gaston Bachelard introduce the epistemological context of the models developed in this paper. Jacques Dixmier refers to Alexander Kirillov’s seminal idea of the coadjoint orbit method for modeling Lie group representations. Pierre Duhem comments on the origin of the gap between the theory of heat and the theory of Mechanics. Finally, Gaston Bachelard predicts that new foundations of Thermodynamics will be given by groups. In this paper, we try to show that these ideas can be reconciled by the Souriau model of Lie groups Thermodynamics through the mathematical structure of Lie algebra cohomology.

After reviewing the state of the art and trends in Machine Learning based on Information Geometry, we present, in this introduction, the main objective of this paper: to jointly apply models from geometric statistical mechanics and tools from Information Geometry to solve the problem of defining a “Gauss density” for statistics on Lie groups and homogeneous manifolds. We also present use cases motivating Lie group machine learning: Doppler statistics analysis with SU(1,1) statistics, and kinematics data analysis with SE(2) statistics.

1.1. State of the Art and Trends in Machine Learning Based on Information Geometry

The classical simple gradient descent used in Deep Learning has two drawbacks: it uses the same non-adaptive learning rate for all parameter components, and it is not invariant with respect to parameter re-encoding, which induces different learning rates. As the parameter space of multilayer networks forms a Riemannian space equipped with the Fisher information metric, the natural gradient or Riemannian gradient method, which takes account of the geometric structure of this space, is more effective for learning than the usual gradient descent. The natural gradient restores this invariance and is insensitive to the characteristic scale of each parameter direction. The Fisher metric defines a Riemannian metric as the Hessian of two dual potential functions (the Entropy and the log-partition function). Yann Ollivier and Gaétan Marceau-Caron provided in 2016 [5] the first experimental results on non-synthetic data sets (MNIST, SVHN, and FACE) for the quasi-diagonal Riemannian natural gradient descents for neural networks introduced previously by Yann Ollivier in [6]. The quasi-diagonal Riemannian algorithms consistently beat simple stochastic gradient descent by a varying margin. Their computational overhead with respect to simple backpropagation is around a factor of 2, and they reach their final performance quickly, thus requiring fewer training epochs and a smaller total computation time. The main goal of the natural gradient is to obtain invariance properties, such as, for a neural network, insensitivity of the training algorithm to whether a logistic or tanh activation function is used, or insensitivity to simple changes of variables in the parameters, such as scaling some parameters. In 2017, the same authors introduced the resulting natural Langevin dynamics [7], which combines the advantages of natural gradient descent and Fisher-preconditioned Langevin dynamics for large neural networks, validated on MNIST with Fisher matrix preconditioning. 
With all the invariance properties of the natural gradient, this Langevin dynamics acts as a regularization method that avoids overfitting, and replaces classical methods based on adding a controlled amount of noise to stochastic gradient descent, which ensures convergence to the Bayesian posterior on model parameters. The theoretically optimal covariance of the noise is the inverse Fisher metric, and Y. Ollivier and G. Marceau-Caron have shown how to implement this in practice with neural networks using efficient Fisher metric approximations. In 2017, Yann Ollivier also introduced the TANGO algorithm (True Asymptotic Natural Gradient Optimization) [8], which converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation; in large dimension, small learning rates are required to approximate the natural gradient well. Y. Ollivier has also shown that it is possible to get arbitrarily close to exact natural gradient descent with a lightweight algorithm. For the natural gradient in Deep Learning, we refer to [9,10]. This year, Shun-ichi Amari [11] gave an elementary geometrical proof that any target function is realized in a sufficiently small neighborhood of any randomly connected deep network, provided the width (the number of neurons in a layer) is sufficiently large.
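The natural gradient update described above can be sketched on a toy model. The example below is only an illustration under our own assumptions: it fits a univariate Gaussian N(mu, sigma^2) rather than a neural network, uses the exact Fisher matrix diag(1/sigma^2, 2/sigma^2) in the coordinates (mu, sigma), and the function names `nll_grad`, `fisher` and `natural_gradient_fit` are hypothetical. It is not the quasi-diagonal algorithm of [5,6].

```python
import numpy as np

def nll_grad(theta, x):
    """Euclidean gradient of the mean negative log-likelihood of N(mu, sigma^2)."""
    mu, sigma = theta
    d = x - mu
    return np.array([-np.mean(d) / sigma**2,
                     1.0 / sigma - np.mean(d**2) / sigma**3])

def fisher(theta):
    """Fisher information matrix of N(mu, sigma^2) in the coordinates (mu, sigma)."""
    sigma = theta[1]
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def natural_gradient_fit(x, theta0=(0.0, 1.0), lr=0.5, steps=200):
    """Iterate theta <- theta - lr * F(theta)^{-1} grad(theta)."""
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * np.linalg.solve(fisher(theta), nll_grad(theta, x))
    return theta

# Toy data (illustrative values only).
x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
mu_hat, sigma_hat = natural_gradient_fit(x)
```

Multiplying the gradient by the inverse Fisher matrix cancels the factor 1/sigma^2 in the mu direction, so the step on mu no longer depends on the scale of the data. This is the kind of insensitivity to parameter scaling that the paragraph above attributes to the natural gradient.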

In this paper, we show how to extend these approaches to data that are elements of Lie groups, or data lying on a homogeneous manifold on which a Lie group acts transitively. This extension is considered in the framework of, and through the interconnection between, Souriau “Lie groups Thermodynamics”, Information Geometry and Kirillov representation theory [12], in order to define probability densities for Lie groups as Souriau covariant Gibbs densities (densities of Maximum Entropy). We develop this case for the matrix Lie group SU(1,1) (case with null cohomology) through the computation of Souriau’s moment map and Kirillov’s orbit method. We also develop the method for the SE(2) Lie group (case with non-null cohomology), where a Souriau cocycle must be taken into account because of the defect of equivariance of the coadjoint action on the moment map.

Supervised learning approaches are based on neural networks whose parameters are estimated by natural gradient algorithms. Unsupervised algorithms are based on clustering techniques such as “k-means” or “mean-shift”, which use a distance between elements of the dataset. In both cases, to extend these approaches to Lie group datasets, we have to extend the notions of Gaussian density and of distance between elements. We propose to use a Geometric Statistical Model coming from Geometric Statistical Mechanics to introduce a “Gauss density” of Lie group elements. Jointly, we can associate a natural distance between these Lie group elements on the symplectic manifold by means of the KKS 2-form, which introduces a natural Riemannian metric associated to the Fisher metric from Information Geometry. The objective of this paper is to explain how to use Geometric Statistical Mechanics tools in this context.

1.2. Objectives of this Paper

This article has several purposes. The work of Professor Jean-Marie Souriau is well known in the field of “Geometric Mechanics”, of which he is one of the founders with his book “Structure of dynamic systems”, published in 1969, in which he introduced the foundations of Symplectic Geometry. In this book, chapter IV, which deals with the extension of Geometric Mechanics to Statistical Mechanics, has been little read or misunderstood by this community. We have discovered that this model is part of, and generalizes, another discipline called Information Geometry. We have demonstrated in previous articles that one can generalize the Fisher metric (the invariant metric used in Information Geometry) to Lie groups with this model. The first goal is therefore to rehabilitate the work of Jean-Marie Souriau in a broader framework, which concerns statistics and machine learning extended to objects considered as elements of a Lie group or a homogeneous manifold.

The second goal is to use these new tools to solve problems that were still open in statistics and machine learning. These open problems concern the definition and calculation of the expression of probability densities, playing the role of the Gaussian density, for elements of a Lie group or of a homogeneous manifold. In this article, we completely solve the problem for two Lie groups that are very useful in machine learning but also in physics: SU(1,1) and SE(2). The calculation is not a simple application of the Souriau model, because it is necessary to establish the “moment map” associated with these groups and to define a Laplace transform on their coadjoint orbits (orbits of the action of the group on the dual space of its Lie algebra). In a second step, we must use Information Geometry to write these covariant Gibbs densities in the correct parametrization, which parametrizes the generalized Gaussian law by statistical moments on the homogeneous symplectic manifold associated with the coadjoint orbits. In the case of SU(1,1), which corresponds to a case of null cohomology (equivariance of the coadjoint operator on the moment map), the homogeneous symplectic manifold associated with the coadjoint orbit is the Poincaré unit disk, so we jointly solve an open problem: defining mathematically the notion of Gaussian density in this disk in hyperbolic geometry. By construction, this density is invariant under the action of the group SU(1,1), which is the sine qua non condition for preserving the symmetries and the invariance of the associated Fisher metric. We show that this model achieves a breakthrough in machine learning, because it provides a Gibbs density and a Fisher metric that are invariant under change of parametrization and under the action of symmetries. 
The Gibbs density allows us to extend the classical supervised statistical machine learning algorithms, and the Fisher metric allows us to address unsupervised learning problems such as k-means in a metric space. The model opens the way to machine learning for Lie groups, with multiple applications in robotics, sensor signal processing, and image processing.

In the last part of the article, based on this model, we give a new “geometric” definition of Entropy by showing that Entropy is an invariant Casimir function in coadjoint representation. Casimir functions have been widely studied within the framework of Poisson structures and manifolds [13,14,15,16]. This characterization of Entropy is new, because Entropy was previously defined axiomatically. Using this Casimir function property, we show that fully geometric approaches can be used to construct the Entropy function from the structure coefficients of the Lie group associated with the symmetries involved alone. We show that we can also introduce an Euler-Poincaré equation and its stochastic variant to study other open problems in statistics and thermodynamics. The applications of this Casimir characterization, which is demonstrated in this article, are developed in a twin article written with François Gay-Balmaz and published in the same special issue [17].

1.3. Motivation of Lie Group Machine Learning with Use-Cases

Machine learning is a field of study of artificial intelligence, based on statistical approaches, that gives computers the ability to “learn” from data, that is, to classify data from observations in a supervised or unsupervised way. Machine learning generally has two phases. The first consists in estimating a model from data, called observations. This so-called “training” phase is generally carried out before the practical use of the model. The second phase corresponds to production use: once the model is determined, new data can be classified. Learning is qualified in different ways according to the information available during the learning phase. If the data is labeled (that is, the task response is known for that data), learning is supervised. We speak of classification if the labels are discrete, and of regression if they are continuous. In the most general case, without labels, we seek to determine the underlying structure of the data (which can be a probability density), and learning is then unsupervised. Machine learning can be applied to different types of data, such as graphs, trees, curves, or more simply feature vectors, which can be continuous or discrete. We propose to extend the approach to datasets whose elements belong to matrix Lie groups.

Learning algorithms can be categorized according to their learning mode. In supervised learning, the classes are predetermined and the examples are known; an expert must label the examples beforehand, and the system then learns to classify according to a classification model. The process takes place in two phases. In unsupervised learning, or clustering, the system or operator has only examples, without labels, and the number of classes and their nature have not been predetermined. No expert is required. The algorithm must discover for itself the more or less hidden structure of the data, by data partitioning and data clustering: the system groups the data according to their available attributes into homogeneous groups of examples. Similarity is generally calculated according to a distance function between pairs of examples.

We will illustrate two problems of Machine Learning on Lie groups coming from the Radar industry. Target recognition on Radar micro-Doppler data can be modeled as a classification problem on a dataset whose elements belong to the SU(1,1) Lie group (see Figure 1). Complex Radar time series of micro-Doppler observations are classically processed on a sliding time window to estimate their associated covariance matrices, which have a Toeplitz Hermitian Positive Definite structure. Using the well-known Verblunsky/Trench theorem, we can parametrize all Toeplitz Hermitian Positive Definite covariance matrices of stationary Radar time series in a product space with a real positive axis (for signal power) and a Poincaré polydisk (for Doppler spectrum shape). If we consider the Poincaré unit disk as a homogeneous space on which the SU(1,1) Lie group acts transitively, each datum in a Poincaré unit disk of this polydisk can then be coded by an SU(1,1) matrix Lie group element. We have thus transformed the problem into a statistical learning challenge on data in the SU(1,1) matrix Lie group. Another example considers the recognition of flying objects from their kinematics coded in the SE(2) or SE(3) Lie groups. 3D (or 2D) trajectories can be coded by SE(3) (or SE(2)) Lie group time series provided by an Invariant Extended Kalman Filter (IEKF) Radar tracker, which locally estimates the displacement of the Frenet-Serret frame. Object kinematics are then coded by time series of SE(3) (or SE(2)) matrix Lie group elements characterizing the local rotation/translation of the Frenet frame along the 3D (or 2D) trajectory of the drone. Statistics of these SE(3) (or SE(2)) Lie group elements characterize the flight mechanics of different kinds of objects (birds, drones, …).

SU(1,1) and SE(2) are also fundamental tools in Image Processing (sub-Riemannian geometry of vision with SE(2)), in robotics (rigid body statistical analysis with SE(2)), in Natural Language Processing (methods of graph embedding in the Poincaré disk with SU(1,1)), …. For instance, the SU(1,1) Lie group, which acts on the Poincaré unit disk, is widely studied for embedding a graph isometrically in a hyperbolic space. It is used by GAFAM (Google, Facebook, …) for Natural Language Processing, reducing graph analysis to a Machine Learning problem in the hyperbolic Poincaré unit disk. Hyperbolic neural networks [18] have been developed in this framework. The SU(1,1) Lie group is also fundamental in Quantum Physics, for instance to describe coherent states of an electron in a magnetic field [19] and coherent states in Quantum Optics [20] (where some statistical photon-counting aspects of SU(1,1) coherent states are emphasized). The SE(2) Lie group is especially fundamental for the Geometry of Vision, in the sub-Riemannian approaches of the Citti-Petitot-Sarti model [21], but also in neuroimaging [22].

1.3.1. SU(1,1) Lie Group Machine Learning for Doppler Data Statistics Analysis

A Lie group structure appears naturally on Doppler data if we consider time series of a locally stationary signal and their associated covariance matrix, which is Toeplitz Hermitian Positive Definite (THPD). Based on a theorem due to Verblunsky [23,24] and Trench [25], we can parametrize any THPD matrix in a product space involving the Poincaré unit polydisk:

$$\begin{array}{l} \varphi : \mathrm{THPD}(n) \to \mathbb{R}_{+}^{*} \times D^{n-1} \\ R_{n} \mapsto (P_{0}, \mu_{1}, \dots, \mu_{n-1}) \end{array} \tag{1}$$

where $D$ is the Poincaré unit disk:

$$D = \{ z = x + i y \in \mathbb{C} \; / \; |z| < 1 \} \tag{2}$$

The Poincaré unit disk is a homogeneous bounded domain on which the Lie group SU(1,1) acts transitively. This matrix group is given by:

$$SU(1,1) = \left\{ \begin{bmatrix} a & b \\ b^{*} & a^{*} \end{bmatrix} \; / \; |a|^{2} - |b|^{2} = 1, \; a, b \in \mathbb{C} \right\} \tag{3}$$

where SU(1,1) acts on the Poincaré unit disk by:

$$g \in SU(1,1) \Rightarrow g.z = \frac{a z + b}{b^{*} z + a^{*}} \tag{4}$$

with the Cartan decomposition of SU(1,1):

$$\begin{array}{l} \begin{pmatrix} a & b \\ b^{*} & a^{*} \end{pmatrix} = |a| \begin{pmatrix} 1 & z \\ z^{*} & 1 \end{pmatrix} \begin{pmatrix} a / |a| & 0 \\ 0 & a^{*} / |a| \end{pmatrix} \\ \text{with } z = b (a^{*})^{-1}, \; |a| = (1 - |z|^{2})^{-1/2} \end{array} \tag{5}$$
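The action of SU(1,1) on the disk can be checked numerically. The sketch below is a minimal illustration, assuming the matrix form of Equation (3) and the Möbius action $g.z = (a z + b)/(b^{*} z + a^{*})$; the helper names `su11` and `act` and the numerical values are our own and do not come from the paper.

```python
import numpy as np

def su11(a, b):
    """Build the SU(1,1) matrix [[a, b], [b*, a*]] after checking |a|^2 - |b|^2 = 1."""
    a, b = complex(a), complex(b)
    if not np.isclose(abs(a) ** 2 - abs(b) ** 2, 1.0):
        raise ValueError("need |a|^2 - |b|^2 = 1")
    return np.array([[a, b], [np.conj(b), np.conj(a)]])

def act(g, z):
    """Moebius action g.z = (a z + b) / (b* z + a*) on the unit disk."""
    return (g[0, 0] * z + g[0, 1]) / (g[1, 0] * z + g[1, 1])

# Two arbitrary group elements and one point of the disk (illustrative values).
g1 = su11(np.cosh(0.7) * np.exp(0.3j), np.sinh(0.7) * np.exp(-1.1j))
g2 = su11(np.cosh(0.2) * np.exp(-0.5j), np.sinh(0.2) * np.exp(2.0j))
z = 0.4 - 0.5j
w = act(g1, z)          # image of z, expected to stay inside the disk
origin_image = act(g1, 0.0)   # expected to equal b / a*
```

With these definitions, composing two actions should agree with acting by the matrix product, and the image of the origin should be $b (a^{*})^{-1}$, the quantity that appears in the Cartan decomposition.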

We can observe that $z = b (a^{*})^{-1}$ can be considered as the action of $g \in SU(1,1)$ on the centre of the unit disk: $z = g.0 = b (a^{*})^{-1}$. The principal idea is that we can code any point $z = b (a^{*})^{-1}$ of the unit disk by an element of the Lie group SU(1,1). The main advantage is that the position of the point is no longer coded by coordinates but intrinsically, by the transformation that takes the origin 0 to this point. Finally, a covariance matrix of a stationary signal can be coded by (n−1) SU(1,1) matrix Lie group elements:

$$\begin{array}{l} \mathrm{THPD}(n) \to \mathbb{R}_{+}^{*} \times D^{n-1} \to \mathbb{R}_{+}^{*} \times SU(1,1)^{n-1} \\ R_{n} \mapsto (P_{0}, \mu_{1}, \dots, \mu_{n-1}) \mapsto \left( P_{0}, \begin{bmatrix} a_{1} & b_{1} \\ b_{1}^{*} & a_{1}^{*} \end{bmatrix}, \dots, \begin{bmatrix} a_{n-1} & b_{n-1} \\ b_{n-1}^{*} & a_{n-1}^{*} \end{bmatrix} \right) \end{array} \tag{6}$$
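The coding chain of Equation (6) can be sketched as follows. This is a hedged illustration, not the author's implementation: we assume the coefficients $\mu_k$ are the reflection coefficients produced by a standard Levinson-Durbin recursion (sign conventions vary between references), and we pick for each $\mu_k$ the SU(1,1) element with real positive $a$, that is, the one with trivial rotation part in the Cartan decomposition. The function names are hypothetical.

```python
import numpy as np

def reflection_coefficients(r):
    """Levinson-Durbin recursion on autocorrelations r[0..n-1], with r[-k] = conj(r[k]).
    Returns (P0, mu): the power P0 = r[0] and the n-1 reflection coefficients."""
    r = np.asarray(r, dtype=complex)
    a = np.array([1.0 + 0j])            # prediction polynomial, a[0] = 1
    err = r[0].real                     # current prediction error power
    mu = []
    for m in range(1, len(r)):
        acc = np.dot(a, r[m:0:-1])      # sum_i a[i] * r[m - i], i = 0..m-1
        k = -acc / err
        a_ext = np.append(a, 0.0)
        a = a_ext + k * np.conj(a_ext[::-1])
        err *= 1.0 - abs(k) ** 2
        mu.append(k)
    return r[0].real, np.array(mu)

def disk_to_su11(z):
    """Code a point z of the unit disk by the SU(1,1) element g with g.0 = z."""
    a = 1.0 / np.sqrt(1.0 - abs(z) ** 2)    # real and positive by our convention
    b = z * a
    return np.array([[a, b], [np.conj(b), a]], dtype=complex)

# Autocorrelation of a first-order autoregressive signal (illustrative values).
c = 0.5 * np.exp(0.4j)
r = c ** np.arange(4)
P0, mu = reflection_coefficients(r)
elements = [disk_to_su11(m) for m in mu]
```

For this first-order autoregressive example only the first coefficient is non-zero, so the remaining points sit at the centre of the disk and are coded by the identity matrix.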

1.3.2. SE(2) and SE(3) Lie Groups Machine Learning for Kinematics Data Statistics Analysis

When we consider a 3D trajectory of a mobile target, we can describe this curve by a time evolution of the local Frenet-Serret frame (local frame with tangent vector, normal vector and binormal vector) as illustrated in Figure 2. This frame evolution is described by the Frenet-Serret formula that gives the kinematic properties of the target moving along the continuous, differentiable curve in 3D Euclidean space ℝ³. More specifically, the formulas describe the derivatives of the so-called tangent, normal, and binormal unit vectors in terms of each other.

$$\frac{d}{dt} \begin{pmatrix} \vec{t} \\ \vec{n} \\ \vec{b} \end{pmatrix} = \begin{bmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \gamma \\ 0 & -\gamma & 0 \end{bmatrix} \begin{pmatrix} \vec{t} \\ \vec{n} \\ \vec{b} \end{pmatrix} \quad \text{with } \begin{cases} \kappa : \text{curvature} \\ \gamma : \text{torsion} \end{cases} \tag{7}$$

We will consider motions determined by exponentials of paths in the Lie algebra. Such a motion is determined by a unit speed space-curve $\tau(t)$. In a Frenet-Serret motion, a point in the moving body moves along the curve and the coordinate frame in the moving body remains aligned with the tangent $\vec{t}$, normal $\vec{n}$, and binormal $\vec{b}$ of the curve. Using the 4-dimensional representation of the Lie group SE(3), the motion can be specified as:

$$G(t) = \begin{pmatrix} R(t) & \tau(t) \\ 0 & 1 \end{pmatrix} \in SE(3) \tag{8}$$

where $\tau(t)$ is the curve and the rotation matrix $R(t)$ has the unit vectors $\vec{t}$, $\vec{n}$, and $\vec{b}$ as columns:

$$R(t) = \begin{pmatrix} \vec{t} & \vec{n} & \vec{b} \end{pmatrix} \in SO(3) \tag{9}$$

If we introduce the Darboux vector $\vec{\omega} = \gamma \vec{t} + \kappa \vec{b}$, the Frenet-Serret formulas can be rewritten as:

$$\frac{d \vec{t}}{dt} = \vec{\omega} \times \vec{t}, \quad \frac{d \vec{n}}{dt} = \vec{\omega} \times \vec{n}, \quad \frac{d \vec{b}}{dt} = \vec{\omega} \times \vec{b} \tag{10}$$

Then, with $\Omega$ the 3 × 3 anti-symmetric matrix corresponding to $\vec{\omega}$, we can write:

$$\frac{d R}{dt} = \Omega R \tag{11}$$

We note that $\frac{d \tau(t)}{dt} = \vec{t}$ and $\frac{d \vec{\omega}}{dt} = \frac{d \gamma}{dt} \vec{t} + \frac{d \kappa}{dt} \vec{b}$.

The instantaneous twist of the motion $G(t)$ is given by:

$$S_{d} = \frac{d G(t)}{dt} G^{-1}(t) = \begin{pmatrix} \Omega & \upsilon \\ 0 & 0 \end{pmatrix} \tag{12}$$

where, from (8) and (11), $\upsilon = \vec{t} - \Omega \tau$.

This is the Lie algebra element corresponding to the tangent vector to the curve $G(t)$. It is well known that elements of the Lie algebra $\mathfrak{se}(3)$ can be described as lines with a pitch. The fixed axode of a motion $G(t) \in SE(3)$ is given by the axis of $S_{d}$ as $t$ varies. The instantaneous twist in the moving reference frame is given by $S_{b} = G^{-1}(t) S_{d} G(t)$, that is, by the adjoint action on the twist in the fixed frame. The instantaneous twist $S_{b}$ can also be found from the relation:

$$S_{b} = G^{-1}(t) \frac{d G(t)}{dt} \tag{13}$$

$$S_{b} = G^{-1} \frac{d G}{dt} = \begin{pmatrix} R^{T} & -R^{T} \tau \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \Omega R & \vec{t} \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} R^{T} \Omega R & R^{T} \vec{t} \\ 0 & 0 \end{pmatrix} \tag{14}$$
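The twists of Equations (12) to (14) can be verified numerically on a curve whose Frenet-Serret frame is known in closed form. The sketch below assumes a unit speed circular helix of radius `r` and pitch parameter `h` (our choice of example, not the paper's), for which the Darboux vector is $(0, 0, 1/c)$ with $c = \sqrt{r^2 + h^2}$, and it approximates $dG/dt$ by a central finite difference. All function names are hypothetical.

```python
import numpy as np

def hat(w):
    """3 x 3 anti-symmetric matrix Omega such that Omega @ v = w x v."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def helix_frame(t, r=2.0, h=1.0):
    """SE(3) element G(t) = [[R, tau], [0, 1]] for a unit speed helix."""
    c = np.hypot(r, h)
    s, co = np.sin(t / c), np.cos(t / c)
    tau = np.array([r * co, r * s, h * t / c])     # position on the curve
    tan = np.array([-r * s, r * co, h]) / c        # unit tangent
    nor = np.array([-co, -s, 0.0])                 # unit normal
    bin_ = np.cross(tan, nor)                      # unit binormal
    G = np.eye(4)
    G[:3, :3] = np.column_stack([tan, nor, bin_])
    G[:3, 3] = tau
    return G

def twists(t, eps=1e-5):
    """Return (S_d, S_b) = (dG/dt G^-1, G^-1 dG/dt) by central differences."""
    G = helix_frame(t)
    dG = (helix_frame(t + eps) - helix_frame(t - eps)) / (2 * eps)
    G_inv = np.linalg.inv(G)
    return dG @ G_inv, G_inv @ dG

S_d, S_b = twists(0.8)
```

In this example the translational part of `S_b` should be close to $R^{T} \vec{t} = (1, 0, 0)^{T}$, since $\vec{t}$ is the first column of $R$, and its rotational part should be the anti-symmetric matrix of $(\gamma, 0, \kappa)$, the Darboux vector expressed in the moving frame.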

We can observe that a 3D trajectory can be described by a time series of SE(3) Lie group elements:

$$SE(3) = \left\{ \begin{bmatrix} R & \tau \\ 0 & 1 \end{bmatrix} \; / \; R \in SO(3), \; \tau \in \mathbb{R}^{3} \right\} \tag{15}$$

with

$$SO(3) = \{ R \; / \; R^{T} R = R R^{T} = I, \; \det R = 1 \} \tag{16}$$

Then, the trajectory will be given by the following time series of SE(3) elements:

$$\left\{ \begin{bmatrix} R_{1} & \tau_{1} \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} R_{2} & \tau_{2} \\ 0 & 1 \end{bmatrix}, \dots, \begin{bmatrix} R_{n} & \tau_{n} \\ 0 & 1 \end{bmatrix} \right\} \in SE(3)^{n} \tag{17}$$
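As a rough sketch of how such a time series might be built from sampled positions, the code below estimates the Frenet-Serret frame by finite differences and stacks the resulting SE(3) matrices. This is only an illustration under our own assumptions: the paper obtains the frames from an IEKF Radar tracker, not from finite differences, the function name `frenet_se3_series` is hypothetical, and the construction fails where the curvature vanishes (the normal is then undefined).

```python
import numpy as np

def frenet_se3_series(points, dt):
    """Map sampled positions (n, 3) to an array (n, 4, 4) of SE(3) elements
    whose rotation columns are the estimated tangent, normal and binormal."""
    p = np.asarray(points, dtype=float)
    v = np.gradient(p, dt, axis=0)                               # velocity estimate
    tan = v / np.linalg.norm(v, axis=1, keepdims=True)
    dtan = np.gradient(tan, dt, axis=0)
    dtan -= np.sum(dtan * tan, axis=1, keepdims=True) * tan      # remove tangent part
    nor = dtan / np.linalg.norm(dtan, axis=1, keepdims=True)
    bin_ = np.cross(tan, nor)
    G = np.tile(np.eye(4), (len(p), 1, 1))
    G[:, :3, 0], G[:, :3, 1], G[:, :3, 2] = tan, nor, bin_
    G[:, :3, 3] = p
    return G

# Sampled helix as a stand-in for a tracked trajectory (illustrative values).
t = np.linspace(0.0, 5.0, 200)
pts = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t), 0.5 * t])
series = frenet_se3_series(pts, t[1] - t[0])
```

The explicit orthogonalization step keeps each rotation block in SO(3) up to rounding even though the derivatives are only approximate, which matters if statistics are later computed on the group elements themselves.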

2. New Results Introduced in the Paper

The paper is structured in two parts:

-: 1st part on “Gauss density on Lie groups”: This part is totally new in Machine Learning, with an extension of “Gauss densities” (defined as Maximum Entropy models) to Lie groups that couples the Souriau model (introduced in the statistical physics domain) with Information Geometry from the Geometric Machine Learning domain. We illustrate it with two use cases, SU(1,1) and SE(2), which are the most useful Lie groups in Image Processing (sub-Riemannian geometry of vision with SE(2)), in robotics (rigid body statistical analysis with SE(2)), in Natural Language Processing (SU(1,1) with methods of graph embedding in the Poincaré disk), …. Some attempts have been made to define noise on Lie groups by adding Gaussian components on elements of the Lie algebra [26,27,28,29], but these models are not mathematically correct because they preserve neither the symmetries nor the moment map associated to these symmetries by the Noether theorem.
-: 2nd part on “Entropy definition extension as Casimir function”: This part gives a new geometric definition of Entropy as an invariant Casimir function in coadjoint representation, explaining the invariance of entropy under the affine coadjoint action on the moment map in the dual space of the Lie algebra. This definition was not in Souriau’s paper. With this new definition, we can compute Entropy from the structure constraints given by the Lie group alone. It opens the door to new generalizations of the Maximum Entropy method and, first of all, to the computation of “Gaussian densities” for any Lie group. Applications of this new property are not developed in this paper but in a twin paper in the same special issue [17]. We refer to M. Gromov’s papers for more geometric structures of Entropy [30,31].

The main new results of this paper are the introduction of “Gauss density” for Lie groups or data on homogeneous space where a Lie groups acts transitively, and the full computation for SU(1,1) Lie group. This group acts transitively on the Poincaré unit disk, and so we have also solved an open problem related to Gauss density on this homogeneous space. For this purpose, the main approach has considered an extended definition of classical “Gauss density”, as introduced by Jaynes, in term of density of Maximum Entropy. In this way, the initial problem was transfert to a new one related to the good definition of Entropy for Lie groups. To address this problem, first, we have recalled the classical Euclidean case, where the Entropy

S (η)

could be defined as the Legendre transform of minus the log-partition function

Φ (θ) = - \log \int_{R} e^{- 〈 θ, y 〉} d y

(defined by Laplace transform) by the following equation

S (η) = 〈 θ, η 〉 - Φ (θ) with η_{i} = \frac{\partial Φ (θ)}{\partial θ_{i}} and θ_{i} = \frac{\partial S (η)}{\partial η_{i}}

. The next step was to explain how to extend the log-partition function for Lie groups. We have then considered the Laplace transform in the framework of Lie group representation theory as introduced by Alexander Kirillov and Geometric Statical Mechanics as modeled by Jean-Marie Souriau. We have preserved the same Legendre structure, and have defined the Entropy

S (Q)

, parametrized on the dual space of the Lie algebra

Q \in g^{*}

(called geometric heat), as Legendre transform of minus of the log-partition function

Φ (β) = - \log \int_{M} e^{- 〈 J (ξ), β 〉} d λ_{ω}

, parametrized on the Lie algebra by

β \in g

(called geometric Planck Temperature), from a Laplace transform defined on the homogeneous symplectic manifold (associated to the Lie group by the Kirrilov-Kostant-Souriau 2-form called KKS 2-form in the litterature). By introducing the moment map

J : M \to g^{*}

, fundamental tool of representation theory introduced by Souriau, we were able to define the log-partition function on the coadjoint orbit of the Lie group,

Φ (β) = - \log \int_{g^{*}} e^{- 〈 J (ξ), β 〉} d λ_{ω}

. The entropy is then given by the Legendre transform

S (Q) = 〈 Q, β 〉 - Φ (β) with Q = \frac{\partial Φ (β)}{\partial β} \in g^{*} and β = \frac{\partial S (Q)}{\partial Q} \in g

. We have then defined the Gauss density for Lie groups as the density that maximizes this Entropy

S (Q)

under the constraint of its associated first moment

Q = \frac{\partial Φ (β)}{\partial β} = \int_{M} J (ξ) p_{G i b b s} (ξ) d λ_{ω}

. The Gauss density is then established by analogy with thermodynamics as the Gibbs density

p_{G i b b s} (ξ) = e^{Φ (β) - 〈 J (ξ), β 〉} = \frac{e^{- 〈 J (ξ), β 〉}}{\int_{M} e^{- 〈 J (ξ), β 〉} d λ_{ω}}

. But this is not enough, because this density is not expressed in the proper parametrization. We have proposed to express the Gibbs density with respect to the first statistical moment

Q

(statistical mean of moment map) by inverting the relation

Q = \frac{\partial Φ (β)}{\partial β} = Θ (β)

. The Gibbs density

p_{G i b b s, Q} (ξ) = e^{Φ (β) - 〈 J (ξ), Θ^{- 1} (Q) 〉}

with

β = Θ^{- 1} (Q)

will provide the extended definition of the Gauss density in its final parametrization.

Until now, no “Gaussian density” had been defined on the Poincaré unit disk with the mandatory property of being covariant under the action of the SU(1,1) Lie group, which acts transitively on this homogeneous bounded domain. We applied the previous model via computation of the moment map and developed the full computation of this extended Gauss density for the SU(1,1) Lie group,

S U (1, 1) = {(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) / a, b \in C, {| a |}^{2} - {| b |}^{2} = 1}

and then deduced as a consequence the Gauss density for the Poincaré unit disk, considered as the homogeneous symplectic manifold associated to the coadjoint orbit of the SU(1,1) Lie group via the KKS 2-form. Considering the Lie algebra

s u (1, 1) = {(\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) / r \in R, η \in C}

and the dual space of the Lie algebra

s u {(1, 1)}^{*} = {(\begin{matrix} z & x + i y \\ - x + i y & - z \end{matrix}) / x, y, z \in R}

, we have computed the moment map

J : D \to s u^{*} (1, 1)

defined by

J (z) . u_{i} = J_{i} (z, z^{*})

, that maps

D

the Poincaré unit disk into a coadjoint orbit in

s u^{*} (1, 1)

,

J (z) = J_{1} (z, z^{*}) u_{1}^{*} + J_{2} (z, z^{*}) u_{2}^{*} + J_{3} (z, z^{*}) u_{3}^{*} = ρ (\begin{matrix} \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} & - 2 \frac{z^{*}}{1 - {| z |}^{2}} \\ 2 \frac{z}{1 - {| z |}^{2}} & - \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{matrix}) \in g^{*}

The moment map

J

is a diffeomorphism of

D

onto one sheet of the two-sheeted hyperboloid in

s u^{*} (1, 1)

, determined by the following equation

J_{1}^{2} - J_{2}^{2} - J_{3}^{2} = ρ^{2}, J_{1} \geq ρ with

J_{1} u_{1}^{*} + J_{2} u_{2}^{*} + J_{3} u_{3}^{*} \in s u^{*} (1, 1)

. But the full SU(1,1) Lie group is not related to any equilibrium Gibbs state (the open subset of the Lie algebra, associated to this Gibbs state is empty). We have then considered one-parameter subgroups of the Lie group

S U (1, 1)

such that the open subset

Λ_{β} = {β \in g / \int_{D} e^{- 〈 J (z), β 〉} d λ (z) < + \infty}

is not empty. In the neighborhood of the identity element, the elements of

g \in S U (1, 1)

can be written as the exponential of an element

β

of its Lie algebra. If we make the remark that we have the following relation

β^{2} = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) = ({| η |}^{2} - r^{2}) I

, we can develop the exponential map by a Taylor expansion of the exponential function, which gives the following relation

g = \exp (ε β) = \sum_{k = 0}^{\infty} \frac{{(ε β)}^{k}}{k!} = (\begin{matrix} \cosh (ε R) + i r \frac{\sinh (ε R)}{R} & η \frac{\sinh (ε R)}{R} \\ η^{*} \frac{\sinh (ε R)}{R} & \cosh (ε R) - i r \frac{\sinh (ε R)}{R} \end{matrix}) with R^{2} = {| η |}^{2} - r^{2}

.
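The closed form above can be checked numerically. The following is a minimal sketch, assuming NumPy and illustrative values of r, η and ε chosen so that |η|² − r² > 0; the helper names (`su11_element`, `exp_series`, `exp_closed_form`) are hypothetical and only serve this check. It compares a truncated Taylor series with the cosh/sinh expression and verifies that the result has the form of an SU(1,1) element.

```python
import numpy as np

def su11_element(r, eta):
    """Element of su(1,1): [[i r, eta], [conj(eta), -i r]]."""
    return np.array([[1j * r, eta], [np.conj(eta), -1j * r]], dtype=complex)

def exp_series(A, terms=40):
    """Truncated Taylor series sum_k A^k / k! (enough terms for small matrices)."""
    result = np.eye(2, dtype=complex)
    term = np.eye(2, dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def exp_closed_form(r, eta, eps):
    """Closed form of exp(eps * beta), valid when R^2 = |eta|^2 - r^2 > 0."""
    R = np.sqrt(abs(eta) ** 2 - r ** 2)
    c, s = np.cosh(eps * R), np.sinh(eps * R) / R
    return np.array([[c + 1j * r * s, eta * s],
                     [np.conj(eta) * s, c - 1j * r * s]], dtype=complex)

# Illustrative values (assumptions): |eta|^2 - r^2 = 0.8 > 0.
r, eta, eps = 0.3, 0.8 + 0.5j, 0.7
beta = su11_element(r, eta)
g_series = exp_series(eps * beta)
g_closed = exp_closed_form(r, eta, eps)
a, b = g_closed[0, 0], g_closed[0, 1]

print(np.allclose(beta @ beta, (abs(eta) ** 2 - r ** 2) * np.eye(2)))
print(np.allclose(g_series, g_closed))
print(abs(a) ** 2 - abs(b) ** 2)  # should be close to 1
```

The check relies only on the relation β² = (|η|² − r²) I, which splits the series into its even and odd parts.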

We can observe that one condition is that

{| η |}^{2} - r^{2} > 0

, so the subset to consider is

Λ_{β} = {β = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}), r \in R, η \in C / {| η |}^{2} - r^{2} > 0}

such that

\int_{D} e^{- 〈 J (z), β 〉} d λ (z) < + \infty

. Finally, we have computed the covariant Gibbs density in the unit disk given by

β \in Λ_{β}

and by the moment map of the Lie group

S U (1, 1)

, that could be expressed in the following equation:

p_{G i b b s} (z) = \frac{e^{- 〈 J (z), β 〉}}{\int_{D} e^{- 〈 J (z), β 〉} d λ (z)} = \frac{e^{- 〈 ρ (\begin{matrix} \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} & \frac{- 2 z^{*}}{(1 - {| z |}^{2})} \\ \frac{2 z}{(1 - {| z |}^{2})} & - \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} \end{matrix}), (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) 〉}}{\int_{D} e^{- 〈 J (z), β 〉} d λ (z)} with d λ (z) = 2 i ρ \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

. To write the final Gibbs density with respect to its statistical moment, we rewrite the density with

Q = E [J (z)]

, by

β = Θ^{- 1} (Q) \in g

where

Q = \frac{\partial Φ (β)}{\partial β} = Θ (β) \in g^{*}

and

Q = E [J (z)] = E [ρ (\begin{matrix} \frac{1 + {| w |}^{2}}{(1 - {| w |}^{2})} & \frac{- 2 w^{*}}{(1 - {| w |}^{2})} \\ \frac{2 w}{(1 - {| w |}^{2})} & - \frac{1 + {| w |}^{2}}{(1 - {| w |}^{2})} \end{matrix})]

.
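As a sanity check of the moment map written above, the following sketch (assuming NumPy) samples points of the unit disk and verifies that their images lie on the sheet J₁² − J₂² − J₃² = ρ², J₁ ≥ ρ. Reading J₂ and J₃ as the real and imaginary parts of the lower-left matrix entry is an assumed choice of basis, and `moment_map` and `hyperboloid_coordinates` are hypothetical helper names.

```python
import numpy as np

def moment_map(z, rho=1.0):
    """Matrix form of J(z) as written in the text, for |z| < 1."""
    d = 1.0 - abs(z) ** 2
    return rho * np.array([[(1 + abs(z) ** 2) / d, -2 * np.conj(z) / d],
                           [2 * z / d, -(1 + abs(z) ** 2) / d]], dtype=complex)

def hyperboloid_coordinates(z, rho=1.0):
    """(J1, J2, J3) read off the matrix; the split of the off-diagonal
    entry into real and imaginary parts is an assumed basis choice."""
    J = moment_map(z, rho)
    return J[0, 0].real, J[1, 0].real, J[1, 0].imag

# Sample points inside the disk (radius capped at 0.95 to stay away from the boundary).
rng = np.random.default_rng(0)
radii = 0.95 * np.sqrt(rng.uniform(size=100))
angles = rng.uniform(0.0, 2 * np.pi, size=100)
rho = 2.0

residuals, first_coordinates, determinants = [], [], []
for radius, angle in zip(radii, angles):
    z = radius * np.exp(1j * angle)
    J1, J2, J3 = hyperboloid_coordinates(z, rho)
    residuals.append(abs(J1 ** 2 - J2 ** 2 - J3 ** 2 - rho ** 2))
    first_coordinates.append(J1)
    determinants.append(np.linalg.det(moment_map(z, rho)))

print(max(residuals))          # should be close to 0
print(min(first_coordinates))  # should be at least rho
```

The determinant of the matrix is constant (equal to −ρ²) over the disk, which is another way of seeing that the image stays on a single orbit.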

To extend this approach to a covariant Gibbs density on the Siegel Unit Disk

S D = {Z \in M_{p q} (C) / I_{p} - Z Z^{+} > 0}

, which is a classical matrix extension of the Poincaré unit disk, we have proposed to consider the

G = S U (p, q)

unitary group and the homogeneous space

G / K = S U (p, q) / S (U (p) \times U (q)) with K = S (U (p) \times U (q)) = {(\begin{matrix} A & 0 \\ 0 & D \end{matrix}) / A \in U (p), D \in U (q), \det (A) \det (D) = 1}

and the moment map given by

J (Z) = i λ (\begin{matrix} {(I_{p} - Z Z^{+})}^{- 1} (- p Z Z^{+} - q I_{p}) & (p + q) Z {(I_{q} - Z^{+} Z)}^{- 1} \\ - (p + q) {(I_{q} - Z^{+} Z)}^{- 1} Z^{+} & (p I_{q} + q Z^{+} Z) {(I_{q} - Z^{+} Z)}^{- 1} \end{matrix})

.

After

S U (1, 1)

Lie group (case with null cohomology), we have considered the same model for

S E (2)

Lie group, which has non-null cohomology and requires a symplectic one-cocycle to handle the lack of equivariance. We have considered the special Euclidean group

S E (2) = {[\begin{matrix} R_{φ} & τ \\ 0 & 1 \end{matrix}] / R_{φ} \in S O (2), τ \in R^{2}}

with

S O (2) = {R_{φ} = [\begin{matrix} \cos φ & - \sin φ \\ \sin φ & \cos φ \end{matrix}] / φ \in R}

, and the Lie algebra

s e (2)

of

S E (2)

(ξ, u) \in s e (2) = R \times R^{2} \Rightarrow [\begin{matrix} - ξ ℑ & u \\ 0 & 0 \end{matrix}] \in s e (2)

with

ℑ = [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}]

, to define the moment map

J_{(ξ, u)} (x) : R^{2} \to s e^{*} (2)

that is given by the expression

J_{(ξ, u)} (x) = J (x) . (ξ, u)

with

J (x) = - 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x), x \in R^{2}

. Then, the Gibbs density is deduced for generalized temperature

β \in Ω = {(b, Β) \in s e (2) / b < 0, Β \in R^{2}}

by

p_{G i b b s} (x) = \frac{e^{- 〈 J (x), β 〉}}{\int_{R^{2}} e^{- 〈 J (x), β 〉} d λ (x)} = \frac{e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x}}{\int_{R^{2}} e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x} d λ (x)}

, with the log-partition function given by the following expression

Φ (β) = - \log \int_{R^{2}} e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x} d λ (x) = - \log (- \frac{2 π}{b} e^{- \frac{1}{2 b} {‖ Β ‖}^{2}})

with

Q = \frac{\partial Φ (β)}{\partial β} = (\frac{1}{b} - \frac{{‖ Β ‖}^{2}}{2 b^{2}}, \frac{1}{b} Β) = Θ (β)

and where

Q \in Ω^{*} = {(m, M) \in s e^{*} (2) / m + \frac{{‖ M ‖}^{2}}{2} < 0}

. To obtain the proper parametrization in terms of statistical moments, we have inverted the relation

β = Θ^{- 1} (Q) = ({(m + \frac{1}{2} {‖ M ‖}^{2})}^{- 1}, {(m + \frac{1}{2} {‖ M ‖}^{2})}^{- 1} M)

, to provide the covariant Gibbs density parametrized by

(m, M) = E (J (x)) = E [- 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x)] = [- E ({‖ x ‖}^{2}), 2 ℑ E (x)]

. The final Gauss density for SE(2) is then

p_{G i b b s} (x) = \frac{e^{\frac{\frac{1}{2} {‖ x ‖}^{2} - M . ℑ x}{(m + \frac{1}{2} {‖ M ‖}^{2})}}}{\int_{R^{2}} e^{\frac{\frac{1}{2} {‖ x ‖}^{2} - M . ℑ x}{(m + \frac{1}{2} {‖ M ‖}^{2})}} d λ (x)}

.
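The SE(2) formulas above lend themselves to a quick numerical check. The sketch below assumes NumPy, the convention Φ(β) = −log ∫ e^{−〈J(x), β〉} dλ(x) used in the rest of the text, and the exponent ½ b‖x‖² − Β·ℑx exactly as written in the density. The function names and the sample values of (b, Β) are illustrative assumptions. It compares the closed-form log-partition function with a grid approximation of the integral and verifies that Θ⁻¹ inverts Θ.

```python
import numpy as np

def log_partition(b, B):
    """Phi(beta) = -log( -(2 pi / b) * exp(-|B|^2 / (2 b)) ), for b < 0."""
    return -np.log(-2 * np.pi / b) + np.dot(B, B) / (2 * b)

def theta(b, B):
    """Q = dPhi/dbeta = (1/b - |B|^2 / (2 b^2), B / b)."""
    return 1 / b - np.dot(B, B) / (2 * b ** 2), B / b

def theta_inverse(m, M):
    """beta = Theta^{-1}(Q): b = (m + |M|^2 / 2)^{-1} and B = b * M."""
    b = 1.0 / (m + 0.5 * np.dot(M, M))
    return b, b * M

def log_partition_numeric(b, B, half_width=12.0, n=601):
    """Grid approximation of -log of the integral over R^2 (truncated domain)."""
    grid = np.linspace(-half_width, half_width, n)
    X, Y = np.meshgrid(grid, grid, indexing="ij")
    # With the matrix [[0, 1], [-1, 0]], the image of x is (x2, -x1),
    # so the pairing B . (image of x) equals B1 * x2 - B2 * x1.
    exponent = 0.5 * b * (X ** 2 + Y ** 2) - (B[0] * Y - B[1] * X)
    h = grid[1] - grid[0]
    return -np.log(np.sum(np.exp(exponent)) * h * h)

# Illustrative generalized temperature (assumption): b < 0.
b, B = -1.5, np.array([0.4, -0.7])
phi_closed = log_partition(b, B)
phi_numeric = log_partition_numeric(b, B)
m, M = theta(b, B)
b_back, B_back = theta_inverse(m, M)

print(phi_closed, phi_numeric)
print(m + 0.5 * np.dot(M, M))  # negative, so Q lies in the stated domain
print(b_back, B_back)
```

Since m + ½‖M‖² = 1/b, the domain condition on Q is equivalent to b < 0, which is why the round trip is well defined on the whole set of generalized temperatures.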

We conclude the paper by a deeper study of Souriau model structure. We observe that Souriau Entropy

S (Q)

defined on coadjoint orbit of the group has a property of invariance

S (A d_{g}^{#} (Q)) = S (Q)

with respect to Souriau affine definition of coadjoint action

A d_{g}^{#} (Q) = A d_{g}^{*} (Q) + θ (g)

where

θ (g)

is called the Souriau cocycle. In the framework of Souriau Lie groups Thermodynamics, we can then characterize the Entropy as a generalized Casimir invariant function in coadjoint representation, and the Massieu characteristic function (or log-partition function), dual of the Entropy by Legendre transform, as a generalized Casimir function in adjoint representation. When M is a Poisson manifold, a function on M is a Casimir function if and only if it is constant on each symplectic leaf (the non-empty open subsets of the symplectic leaves are the smallest embedded manifolds of M which are Poisson submanifolds) [15]. Classically, the Entropy is defined axiomatically, as the Shannon or von Neumann Entropy, without any geometric structure constraints. In this paper, the Entropy is also presented as a solution of the Casimir equation

{(a d_{\frac{\partial S}{\partial Q}}^{*} Q)}_{j} + Θ {(\frac{\partial S}{\partial Q})}_{j} = C_{i j}^{k} a d_{{(\frac{\partial S}{\partial Q})}^{i}}^{*} Q_{k} + Θ_{j} = 0

with

\tilde{Θ} (X, Y) = 〈 Θ (X), Y 〉 = J_{[X, Y]} - {J_{X}, J_{Y}} = - 〈 d θ (X), Y 〉

, X, Y \in g

, where

Θ (X) = T_{e} θ (X (e))

appears in case of non-null cohomology (non-equivariance of coadjoint operator on the moment map), with

θ (g)

the Souriau Symplectic cocycle. The dual space of the Lie algebra foliates into coadjoint orbits that are also the level sets of the entropy. The KKS (Kostant-Kirillov-Souriau) 2-form and the Souriau-Koszul-Fisher metric turn each orbit into a homogeneous symplectic manifold. The information manifold foliates into level sets of the entropy, which can be interpreted in the framework of Thermodynamics: motion remaining on these complex surfaces is non-dissipative, whereas motion transversal to these surfaces is dissipative, where the dynamics is given by

\frac{d Q}{d t} = {Q, H}_{\tilde{Θ}} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})

with stable equilibrium when

H = S \Rightarrow \frac{d Q}{d t} = {Q, S}_{\tilde{Θ}} = a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

. We have finally also observed that

d S = {\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) d t

where

{\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) = \tilde{Θ} (\frac{\partial H}{\partial Q}, β) + 〈 Q, [\frac{\partial H}{\partial Q}, β] 〉

, showing that Entropy production is linked with Souriau tensor related to Fisher metric.

The Casimir equations that we have introduced in the non-zero cohomology case are consequences of the constancy of the entropy on adjoint orbits of the Lie algebra and of the equivariance of the map between the set of generalized temperatures and the dual space of the Lie algebra, as introduced by Jean-Marie Souriau in his 1974 paper. We explained this fact in the paper by deriving the Casimir equations from the Souriau equation. The Casimir equations are then presented, in this context, as a fully equivalent form written in a new way, especially in the framework of Souriau Lie groups Thermodynamics. Souriau did not observe that the Entropy is an invariant Casimir function in coadjoint representation, but we can assume that he was fully aware of this invariant structure.

From Souriau equation

〈 Q, [β, Z] 〉 + \tilde{Θ} (β, Z) = 0

published in 1974, we have rewritten this equation, as a direct consequence, in a Casimir form

a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

. This equation preserves the geometric structures included in the Souriau equation but allows us to consider the Entropy from the point of view of a Casimir invariant function. Until now, the concept of Entropy and the concept of Casimir function were two disjoint concepts that had been developed independently. There is a large literature on Casimir functions, especially the Russian one, which has characterized their properties. We refer to Igor V. Shirokov, who has proposed a method for constructing invariants of the coadjoint representation of Lie groups of arbitrary dimension and structure, based on local symplectic coordinates on the coadjoint orbits. With Oleg L. Kurnyavko, Igor V. Shirokov has also proposed a general method for constructing invariant Casimir functions. The second reference is to A.T. Fomenko and V.V. Trofimov, who have also deeply studied Casimir functions (but in the case of null cohomology) and have developed the following equation, which we can write for the Entropy in the null cohomology case

S (A d_{e^{t ξ}}^{*} Q) = S (Q) + \sum_{n = 1}^{\infty} \frac{{(- ϕ (ξ))}^{n} S}{n!} (Q) . t^{n}

with

ϕ : g \to V e c (Γ)

a representation of Lie algebras defined on basis

(e_{1}, e_{2}, \dots, e_{n})

in

g

. We refer to a twin paper [17] developing consequences of this new definition of Entropy as an invariant Casimir function. In this twin paper, we study the associated Euler-Poincaré equation

\frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})

and the stochastic extension based on a new Stratonovich differential equation for the stochastic process, given by the following relation by means of Souriau’s symplectic cocycle

d Q + [a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})] d t + \sum_{i = 1}^{N} [a d_{\frac{\partial H_{i}}{\partial Q}}^{*} Q + Θ (\frac{\partial H_{i}}{\partial Q})] \circ d W_{i} (t) = 0

. This kind of stochastic equation has also been studied by Alexis Arnaudon and Darryl Holm, but only in the restricted case of null cohomology [32].

We give references ranging from classical textbooks (such as Souriau’s book and papers) to preprints, because different approaches have been developed in parallel to address Lie groups statistics, as early as the middle of the last century, but without bridges between these disciplines, each of which has developed specific tools to address this problem. We have limited these references to the main documents, which are seminal works or tutorials in their domains. We have preserved references in French, because some works, such as the Souriau Lie groups Thermodynamics model, have not yet been widely disseminated across the different communities.

3. Learning Inference Lie Groups Thermodynamics and Covariant Gibbs Density

We identify the Riemannian metric introduced by Souriau based on cohomology, in the framework of “Lie groups thermodynamics”, as an extension of the classical Fisher metric introduced in information geometry. We have observed that the Souriau metric preserves the Fisher metric structure as the Hessian of minus the logarithm of a partition function, where the partition function is defined as a generalized Laplace transform on a sharp convex cone. Souriau’s definition of the Fisher metric extends the classical one to the case of Lie groups or homogeneous manifolds. Souriau developed this “Lie groups thermodynamics” theory in the framework of homogeneous symplectic manifolds in geometric statistical mechanics for dynamical systems but, as he observed, the model equations are no longer linked to the symplectic manifold: they depend only on the Lie group and the associated cocycle [33,34]. This analogy with the Fisher metric opens potential applications in machine learning, where the Fisher metric is used in the framework of information geometry to define the “natural gradient”, a tool that reduces the sensitivity of ordinary stochastic gradient descent to rescaling or changes of variable in parameter space. In machine learning revised by the natural gradient of information geometry, the ordinary gradient is corrected by the Fisher matrix. Amari has theoretically proved the asymptotic optimality of the natural gradient compared to the classical gradient. With the Souriau approach, the Fisher metric can be extended, by the Souriau-Fisher metric, to design natural gradients for data on homogeneous manifolds. Information geometry has been derived from the invariant geometrical structure involved in statistical inference. The Fisher metric defines a Riemannian metric as the Hessian of two dual potential functions, linked to dually coupled affine connections in a manifold of probability distributions. With the Souriau model, this structure is extended, preserving the Legendre transform between two dual potential functions parametrized in the Lie algebra of the group acting transitively on the homogeneous manifold.

3.1. Inference by Natural Gradient and Legendre Structure

Classically, the parameter

θ

of a probabilistic model is optimized, based on a sequence of observations

y_{t}

, by online gradient descent:

θ_{t} \leftarrow θ_{t - 1} - η_{t} \frac{\partial l_{t} {(y_{t})}^{T}}{\partial θ}

(18)

with learning rate

η_{t}

, and the loss function

l_{t} = - \log p (y_{t} / {\hat{y}}_{t})

. This simple gradient descent has a first drawback of using the same non-adaptive learning rate for all parameter components, and a second drawback of non-invariance with respect to parameter re-encoding, which induces different learning rates. Amari has introduced the natural gradient to preserve this invariance and to be insensitive to the characteristic scale of each parameter direction. The gradient descent can be corrected by

I {(θ)}^{- 1}

where

I

is the Fisher information matrix with respect to parameter

θ

, given by:

I (θ) = [g_{i j}] with g_{i j} = {[- E_{y \sim p (y / θ)} [\frac{\partial^{2} \log p (y / θ)}{\partial θ_{i} \partial θ_{j}}]]}_{i j}

(19)

with natural gradient:

θ_{t} \leftarrow θ_{t - 1} - η_{t} I {(θ)}^{- 1} \frac{\partial l_{t} {(y_{t})}^{T}}{\partial θ}

(20)
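The invariance property of update (20) can be illustrated on a toy model. The sketch below is an assumption-laden example, not taken from the original text: a Gaussian with unknown mean μ = cν and known variance σ², where the scale c plays the role of a parameter re-encoding, and for which the Fisher information with respect to ν is c²/σ². The helper names are hypothetical. The natural-gradient trajectory of the mean does not depend on c, whereas the ordinary gradient trajectory does.

```python
import numpy as np

def gradient_step(nu, y, lr, c, sigma=1.0):
    """Ordinary gradient step on l = (y - c*nu)^2 / (2 sigma^2), with mean mu = c*nu."""
    grad = -c * (y - c * nu) / sigma ** 2
    return nu - lr * grad

def natural_gradient_step(nu, y, lr, c, sigma=1.0):
    """Same step, preconditioned by the inverse Fisher information I(nu) = c^2 / sigma^2."""
    grad = -c * (y - c * nu) / sigma ** 2
    fisher = c ** 2 / sigma ** 2
    return nu - lr * grad / fisher

def run(step, c, observations, lr=0.1, mu0=0.0):
    """Run the online update and return the trajectory of the mean mu = c * nu."""
    nu = mu0 / c
    means = []
    for y in observations:
        nu = step(nu, y, lr, c)
        means.append(c * nu)
    return np.array(means)

# Synthetic observations (assumption): unit-variance Gaussian with mean 2.
rng = np.random.default_rng(0)
observations = rng.normal(loc=2.0, scale=1.0, size=200)

natural_a = run(natural_gradient_step, 1.0, observations)
natural_b = run(natural_gradient_step, 3.0, observations)
ordinary_a = run(gradient_step, 1.0, observations)
ordinary_b = run(gradient_step, 3.0, observations)

print(np.allclose(natural_a, natural_b))    # same trajectory for both encodings
print(np.allclose(ordinary_a, ordinary_b))  # trajectories differ
```

In this one-dimensional case the Fisher correction simply cancels the factor c² introduced by the re-encoding, which is the simplest instance of the invariance discussed above.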

Amari has proved that the Riemannian metric in an exponential family is the Fisher information matrix defined by:

g_{i j} = - {[\frac{\partial^{2} Φ}{\partial θ_{i} \partial θ_{j}}]}_{i j} with Φ (θ) = - \log \int_{R} e^{- 〈 θ, y 〉} d y

(21)

and the dual potential, the Shannon entropy, is given by the Legendre transform:

S (η) = 〈 θ, η 〉 - Φ (θ) with η_{i} = \frac{\partial Φ (θ)}{\partial θ_{i}} and θ_{i} = \frac{\partial S (η)}{\partial η_{i}}

(22)

We can observe that

Φ (θ) = - \log \int_{R} e^{- 〈 θ, y 〉} d y = - \log ψ (θ)

is linked with the cumulant generating function.

J.L. Koszul and E. Vinberg have introduced an affinely invariant Hessian metric on a sharp convex cone through its characteristic function:

\begin{array}{l} Φ_{Ω} (θ) = - \log \int_{Ω^{*}} e^{- 〈 θ, y 〉} d y = - \log ψ_{Ω} (θ) with θ \in Ω sharp convex cone \\ ψ_{Ω} (θ) = \int_{Ω^{*}} e^{- 〈 θ, y 〉} d y with Koszul-Vinberg Characteristic function \end{array}

(23)

Jean-Louis Koszul has introduced the following forms

1st Koszul form:

α = d Φ_{Ω} (θ) = - d \log ψ_{Ω} (θ)

(24)

2nd Koszul form:

γ = D α = D d \log ψ_{Ω} (θ)

(25)

with the following property of positive definiteness:

\begin{array}{l} (D d \log ψ_{Ω} (x)) (u) = \frac{1}{ψ_{Ω} {(x)}^{2}} [\int_{Ω^{*}} F {(ξ)}^{2} d ξ . \int_{Ω^{*}} G {(ξ)}^{2} d ξ - {(\int_{Ω^{*}} F (ξ) . G (ξ) d ξ)}^{2}] > 0 \\ with F (ξ) = e^{- \frac{1}{2} 〈 x, ξ 〉} and G (ξ) = e^{- \frac{1}{2} 〈 x, ξ 〉} 〈 u, ξ 〉 \end{array}

(26)

Koszul has defined the following Diffeomorphism:

η = α = - d \log ψ_{Ω} (θ) = \int_{Ω^{*}} ξ p_{θ} (ξ) d ξ with p_{θ} (ξ) = \frac{e^{- 〈 ξ, θ 〉}}{\int_{Ω^{*}} e^{- 〈 ξ, θ 〉} d ξ}

(27)

with preservation of Legendre transform:

S_{Ω} (η) = 〈 θ, η 〉 - Φ_{Ω} (θ) with η = d Φ_{Ω} (θ) and θ = d S_{Ω} (η)

(28)
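As a worked illustration of relations (23), (27) and (28), consider the simplest sharp convex cone, the half-line Ω = Ω* = (0, +∞). This one-dimensional example is an assumption chosen for its closed forms and is not taken from the original text.

```latex
\begin{aligned}
\psi_{\Omega}(\theta) &= \int_{0}^{\infty} e^{-\theta y}\,dy = \frac{1}{\theta}, \qquad
\Phi_{\Omega}(\theta) = -\log \psi_{\Omega}(\theta) = \log \theta, \\
\eta &= \frac{d\Phi_{\Omega}}{d\theta} = \frac{1}{\theta}
      = \int_{0}^{\infty} \xi\, p_{\theta}(\xi)\,d\xi
      \quad\text{with}\quad p_{\theta}(\xi) = \theta e^{-\theta \xi}, \\
-\frac{d^{2}\Phi_{\Omega}}{d\theta^{2}} &= \frac{1}{\theta^{2}}
      = \operatorname{Var}_{p_{\theta}}(\xi) > 0, \\
S_{\Omega}(\eta) &= \theta \eta - \Phi_{\Omega}(\theta) = 1 + \log \eta, \qquad
\frac{dS_{\Omega}}{d\eta} = \frac{1}{\eta} = \theta .
\end{aligned}
```

Here 1 + log η = 1 − log θ is the Shannon differential entropy of the exponential density p_θ, so the Legendre dual of Φ_Ω coincides with the entropy, as stated in (22) and (28).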

3.2. Souriau Lie Groups Thermodynamics and Souriau-Koszul-Fisher Metric

These relations have been extended by Jean-Marie Souriau in geometric statistical mechanics, where he developed a “Lie groups thermodynamics” of dynamical systems in which the (maximum entropy) Gibbs density is covariant with respect to the action of the Lie group. In the Souriau model, the previous structures of information geometry are preserved:

I (β) = - \frac{\partial^{2} Φ}{\partial β^{2}} with Φ (β) = - \log \int_{M} e^{- 〈 U (ξ), β 〉} d λ_{ω} and U : M \to g^{*}

(29)

S (Q) = 〈 Q, β 〉 - Φ (β) with Q = \frac{\partial Φ (β)}{\partial β} \in g^{*} and β = \frac{\partial S (Q)}{\partial Q} \in g

(30)

In the Souriau Lie groups thermodynamics model,

β

is a “geometric” (Planck) temperature, element of Lie algebra

g

of the group, and

Q

is a “geometric” heat, element of the dual space of the Lie algebra

g^{*}

of the group. Souriau has proposed a Riemannian metric that we have identified as a generalization of the Fisher metric:

I (β) = [g_{β}] with g_{β} ([β, Z_{1}], [β, Z_{2}]) = {\tilde{Θ}}_{β} (Z_{1}, [β, Z_{2}])

(31)

with {\tilde{Θ}}_{β} (Z_{1}, Z_{2}) = \tilde{Θ} (Z_{1}, Z_{2}) + 〈 Q, a d_{Z_{1}} (Z_{2}) 〉 where a d_{Z_{1}} (Z_{2}) = [Z_{1}, Z_{2}]

(32)

Souriau has proved that every coadjoint orbit of a Lie group, given by

O_{F} = {A d_{g}^{*} F = g^{- 1} F g, g \in G} subset of g^{*}, F \in g^{*}

carries a natural homogeneous symplectic structure by a closed G-invariant 2-form. If we define

K = A d_{g}^{*} = {(A d_{g^{- 1}})}^{*}

and

K_{*} (X) = - {(a d_{X})}^{*}

with

〈 A d_{g}^{*} F, Y 〉 = 〈 F, A d_{g^{- 1}} Y 〉, \forall g \in G, Y \in g, F \in g^{*}

where if

X \in g

,

A d_{g} (X) = g X g^{- 1} \in g

, the G-invariant 2-form is given by the following expression

σ_{Ω} (a d_{X} F, a d_{Y} F) = B_{F} (X, Y) = 〈 F, [X, Y] 〉, X, Y \in g

. Souriau’s Fundamental Theorem is that « Every symplectic manifold on which a Lie group acts transitively by a Hamiltonian action is a covering space of a coadjoint orbit ». We can observe that, in the Souriau model, the Fisher metric is an extension of this 2-form to the non-equivariant case

g_{β} ([β, Z_{1}], [β, Z_{2}]) = \tilde{Θ} (Z_{1}, [β, Z_{2}]) + 〈 Q, [Z_{1}, [β, Z_{2}]] 〉

.

The Souriau additional term

\tilde{Θ} (Z_{1}, [β, Z_{2}])

is generated by non-equivariance through Symplectic cocycle. The tensor

\tilde{Θ}

used to define this extended Fisher metric is defined by the moment map

J (x)

, application from

M

(homogeneous symplectic manifold) to the dual space of the Lie algebra

g^{*}

, given by:

\tilde{Θ} (X, Y) = J_{[X, Y]} - {J_{X}, J_{Y}}

(33)

with J (x) : M \to g^{*} such that J_{X} (x) = 〈 J (x), X 〉, X \in g

(34)

This tensor

\tilde{Θ}

is also defined in tangent space of the cocycle

θ (g) \in g^{*}

(this cocycle appears due to the non-equivariance of the coadjoint operator

A d_{g}^{*}

, the action of the group on the dual space of the Lie algebra; this action is modified with a cocycle so that the moment map becomes equivariant relative to the new affine action):

Q (A d_{g} (β)) = A d_{g}^{*} (Q) + θ (g)

(35)

θ (g) \in g^{*}

is called nonequivariance one-cocycle, and it is a measure of the lack of equivariance of the moment map.

\begin{array}{l} \tilde{Θ} (X, Y) : g \times g \to ℜ with Θ (X) = T_{e} θ (X (e)) \\ X, Y \mapsto 〈 Θ (X), Y 〉 \end{array}

(36)

The cocycle satisfies the following identity:

\begin{array}{l} θ (s t) = J ((s t) . x) - A d_{s t}^{*} J (x) \\ θ (s t) = [J (s . (t . x)) - A d_{s}^{*} J (t . x)] + [A d_{s}^{*} J (t . x) - A d_{s}^{*} A d_{t}^{*} J (x)] \\ θ (s t) = θ (s) + A d_{s}^{*} [J (t . x) - A d_{t}^{*} J (x)] \\ θ (s t) = θ (s) + A d_{s}^{*} θ (t) \end{array}

(37)

We can also compute the tangent of the one-cocycle

θ

at the neutral element, to obtain the 2-cocycle

Θ

:

\begin{array}{l} ζ \in g, θ_{ζ} (s) = 〈 θ (s), ζ 〉 = 〈 J (s . x), ζ 〉 - 〈 A d_{s}^{*} J (x), ζ 〉 = 〈 J (s . x), ζ 〉 - 〈 J (x), A d_{s^{- 1}} ζ 〉 \\ T_{e} θ_{ζ} (ξ) = 〈 T_{x} J . ξ_{p} (x), ζ 〉 + 〈 J (x), a d_{ξ} ζ 〉 with ξ_{p} = X_{〈 J, ξ 〉} \\ T_{e} θ_{ζ} (ξ) = X_{〈 J (x), ξ 〉} [〈 J (x), ζ 〉] + 〈 J (x), [ξ, ζ] 〉 \\ T_{e} θ_{ζ} (ξ) = - {〈 J, ξ 〉, 〈 J, ζ 〉} + 〈 J (x), [ξ, ζ] 〉 = Θ (ξ) \end{array}

(38)

We can also write:

T_{x} J (ξ_{p} (x)) = - a d_{ξ}^{*} J (x) + Θ (ξ, .)

By differentiating the equation on affine action, we have:

d J (X x) = a d_{X} J (x) + d θ (X), x \in M, X \in g

(39)

\begin{array}{l} 〈 d J (X x), Y 〉 = 〈 a d_{X} J (x), Y 〉 + 〈 d θ (X), Y 〉, x \in M, X, Y \in g \\ 〈 d J (X x), Y 〉 = 〈 J (x), [X, Y] 〉 + 〈 d θ (X), Y 〉 = {〈 J, X 〉, 〈 J, Y 〉} (x) \\ 〈 J (x), [X, Y] 〉 - {〈 J, X 〉, 〈 J, Y 〉} (x) = - 〈 d θ (X), Y 〉 \end{array}

(40)

It can be then deduced that the tensor could be also written:

\tilde{Θ} (X, Y) = J_{[X, Y]} - {J_{X}, J_{Y}} = - 〈 d θ (X), Y 〉, X, Y \in g

(41)

with the cocycle property:

\tilde{Θ} ([X, Y], Z) + \tilde{Θ} ([Y, Z], X) + \tilde{Θ} ([Z, X], Y) = 0, X, Y, Z \in g

(42)
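The cyclic identity (42) can be tested numerically on se(2). The sketch below assumes NumPy and uses, purely for illustration, the antisymmetric bilinear form (X, Y) ↦ u_Xᵀ ℑ u_Y on the translation parts. This form is an assumption chosen because it is simple to write down, and it is not claimed to be the cocycle computed in the paper. Lie algebra elements are represented by the 3×3 matrices given earlier, and the bracket is the matrix commutator.

```python
import numpy as np

ROT = np.array([[0.0, 1.0], [-1.0, 0.0]])  # the 2x2 matrix written as a fraktur I in the text

def se2_matrix(xi, u):
    """3x3 matrix of (xi, u) in se(2): [[-xi * ROT, u], [0, 0]]."""
    A = np.zeros((3, 3))
    A[:2, :2] = -xi * ROT
    A[:2, 2] = u
    return A

def bracket(A, B):
    """Lie bracket as the matrix commutator."""
    return A @ B - B @ A

def cocycle(A, B):
    """Illustrative antisymmetric bilinear form u_A^T ROT u_B (an assumption)."""
    return A[:2, 2] @ ROT @ B[:2, 2]

def cyclic_sum(X, Y, Z):
    """Left-hand side of the 2-cocycle identity."""
    return (cocycle(bracket(X, Y), Z)
            + cocycle(bracket(Y, Z), X)
            + cocycle(bracket(Z, X), Y))

# Random Lie algebra elements (fixed seed for reproducibility).
rng = np.random.default_rng(1)
X, Y, Z = (se2_matrix(rng.normal(), rng.normal(size=2)) for _ in range(3))
residual = cyclic_sum(X, Y, Z)
antisymmetry = cocycle(X, Y) + cocycle(Y, X)

print(residual, antisymmetry)  # both should be close to 0
```

With all three terms equal to the first one, as a naive reading would suggest, the sum would not vanish in general; the cyclic permutation of the arguments is what makes the identity hold.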

By noting the action of the group on the dual space of the Lie algebra:

G \times g^{*} \to g^{*}, (s, ξ) \mapsto s ξ = A d_{s}^{*} ξ + θ (s)

(43)

Associativity is also derived:

\begin{array}{l} (s_{1} s_{2}) ξ = A d_{s_{1} s_{2}}^{*} ξ + θ (s_{1} s_{2}) = A d_{s_{1}}^{*} A d_{s_{2}}^{*} ξ + θ (s_{1}) + A d_{s_{1}}^{*} θ (s_{2}) \\ (s_{1} s_{2}) ξ = A d_{s_{1}}^{*} (A d_{s_{2}}^{*} ξ + θ (s_{2})) + θ (s_{1}) = s_{1} (s_{2} ξ), \forall s_{1}, s_{2} \in G, ξ \in g^{*} \end{array}

(44)
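The cocycle identity (37) and the associativity (44) can be checked on a small example. The sketch below assumes NumPy and the group SO(3), whose coadjoint action on so(3)* ≅ R³ is the rotation itself, together with a coboundary θ(s) = μ₀ − Ad*ₛ μ₀ built from an arbitrary fixed μ₀. Such a cocycle is cohomologically trivial, so this is only an illustration of the algebra and not of a genuinely non-equivariant case; all names are hypothetical.

```python
import numpy as np

def rotation(axis, angle):
    """Rodrigues formula for a rotation matrix in SO(3)."""
    axis = np.asarray(axis, dtype=float)
    k = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

MU0 = np.array([0.3, -1.2, 0.5])  # arbitrary fixed element of the dual (assumption)

def theta(R):
    """Coboundary cocycle theta(s) = mu0 - Ad*_s mu0, with Ad*_s acting as R."""
    return MU0 - R @ MU0

def affine_action(R, xi):
    """Affine action s . xi = Ad*_s xi + theta(s)."""
    return R @ xi + theta(R)

R1 = rotation([1.0, 2.0, 3.0], 0.7)
R2 = rotation([-1.0, 0.0, 2.0], -1.3)
xi = np.array([0.5, 0.1, -2.0])

# Cocycle identity: theta(s1 s2) = theta(s1) + Ad*_{s1} theta(s2).
cocycle_ok = np.allclose(theta(R1 @ R2), theta(R1) + R1 @ theta(R2))
# Associativity: (s1 s2) . xi = s1 . (s2 . xi).
action_ok = np.allclose(affine_action(R1 @ R2, xi),
                        affine_action(R1, affine_action(R2, xi)))
print(cocycle_ok, action_ok)
```

The second check is a direct consequence of the first, which mirrors the two lines of derivation (44).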

This study of the moment map

J

equivariance, and the existence of an affine action of G on

g^{*}

, whose linear part is the coadjoint action, for which the moment

J

is equivariant, is the cornerstone of the Souriau theory of geometric mechanics and Lie groups thermodynamics.

3.3. Souriau Entropy and Souriau-Fisher-Koszul Metric Invariance under the Action of the Group and Covariant Souriau Gibbs Density

In Souriau’s Lie groups thermodynamics, the invariance by re-parameterization in information geometry has been replaced by invariance with respect to the action of the group. An element of the group

g

acts on an element

β \in g

of the Lie algebra through the adjoint operator

A d_{g}

. Under the action of the group

A d_{g} (β)

, the entropy

S (Q)

and the Fisher metric

I (β)

are invariant:

β \in g \to A d_{g} (β) \Rightarrow \begin{cases} S [Q (A d_{g} (β))] = S (Q) \\ I [A d_{g} (β)] = I (β) \end{cases}

(45)

In the framework of Lie group action on a symplectic manifold, equivariance of moment map could be studied to prove that there is a unique action a(.,.) of the Lie group

G

on the dual

g^{*}

of its Lie algebra for which the moment map

J

is equivariant, that means for each

x \in M

:

J (Φ_{g} (x)) = a (g, J (x)) = A d_{g}^{*} (J (x)) + θ (g)

(46)

When coadjoint action is not equivariant, the symmetry is broken, and new “cohomological” relations should be verified in Lie algebra of the group. A natural equilibrium state will thus be characterized by an element of the Lie algebra of the Lie group, determining the equilibrium temperature

β

. The entropy

S (Q)

, parametrized by

Q

the geometric heat (mean of energy

U

, element of the dual space of the Lie algebra) is defined by the Legendre transform of the Massieu potential

Φ (β)

parametrized by

β

(

Φ (β)

is the minus logarithm of the partition function

ψ_{Ω} (β)

).

A Gibbs state, in the usual sense, is a statistical state at which the entropy is stationary with respect to all infinitesimal variations of the statistical state for which the mean value of the energy remains constant. In the sense of Souriau, a generalized Gibbs state is a statistical state at which the entropy is stationary with respect to all infinitesimal variations of the statistical state for which the mean value of the moment map remains constant. This generalization is very natural, since the energy can be considered as the moment map of the Hamiltonian action of the one-dimensional Lie group of time translations. Furthermore, each generalized Gibbs state is associated to an element of the Lie algebra of the group, called by Souriau a generalized temperature. The set of possible generalized temperatures is not, in general, the whole Lie algebra, but an open convex subset of the Lie algebra, which may be empty, on which the integrals encountered in the expression of the generalized Gibbs state are normally convergent. So, for some Lie groups, generalized Gibbs states do not exist, and there is no Souriau Lie groups thermodynamics.

Souriau has then defined a Gibbs density that is covariant under the action of the group:

\begin{array}{l} p_{G i b b s} (ξ) = e^{Φ (β) - 〈 U (ξ), β 〉} = \frac{e^{- 〈 U (ξ), β 〉}}{\int_{M} e^{- 〈 U (ξ), β 〉} d λ_{ω}}, with Φ (β) = - \log \int_{M} e^{- 〈 U (ξ), β 〉} d λ_{ω} \\ Q = \frac{\partial Φ (β)}{\partial β} = \frac{\int_{M} U (ξ) e^{- 〈 U (ξ), β 〉} d λ_{ω}}{\int_{M} e^{- 〈 U (ξ), β 〉} d λ_{ω}} = \int_{M} U (ξ) p (ξ) d λ_{ω} \end{array}

(47)

We can express the Gibbs density with respect to

Q

by inverting the relation

Q = \frac{\partial Φ (β)}{\partial β} = Θ (β)

. Then

p_{G i b b s, Q} (ξ) = e^{Φ (β) - 〈 U (ξ), Θ^{- 1} (Q) 〉}

with

β = Θ^{- 1} (Q)

. All Souriau equations of Lie groups Thermodynamics are illustrated in Figure 3 and Figure 4.

Souriau completed his “geometric heat theory” by introducing a 2-form in the Lie algebra, that is a Riemannian metric tensor in the values of adjoint orbit of

β

,

[β, Z]

with

Z

an element of the Lie algebra. This metric is given for

(β, Q)

:

g_{β} ([β, Z_{1}], [β, Z_{2}]) = 〈 Θ (Z_{1}), [β, Z_{2}] 〉 + 〈 Q, [Z_{1}, [β, Z_{2}]] 〉

(48)

where

Θ

is a cocycle of the Lie algebra, defined by

Θ = T_{e} θ

with

θ

a cocycle of the Lie group defined by

θ (M) = Q (A d_{M} (β)) - A d_{M}^{*} Q

.

We observe that Souriau Riemannian metric, introduced with symplectic cocycle, is a generalization of the Fisher metric, that we call the Souriau-Fisher metric, that preserves the property to be defined as a Hessian of the partition function logarithm

g_{β} = - \frac{\partial^{2} Φ}{\partial β^{2}} = \frac{\partial^{2} \log ψ_{Ω}}{\partial β^{2}}

as in classical information geometry. We will establish the equality of two terms, between Souriau definition based on Lie group cocycle

Θ

and parameterized by “geometric heat” Q (element of the dual space of the Lie algebra) and “geometric temperature” β (element of Lie algebra) and hessian of characteristic function

Φ (β) = - \log ψ_{Ω} (β)

with respect to the variable β (as illustrated in Figure 5):

g_{β} ([β, Z_{1}], [β, Z_{2}]) = 〈 Θ (Z_{1}), [β, Z_{2}] 〉 + 〈 Q, [Z_{1}, [β, Z_{2}]] 〉 = \frac{\partial^{2} \log ψ_{Ω}}{\partial β^{2}}

(49)

If we differentiate this relation of Souriau theorem

Q (A d_{g} (β)) = A d_{g}^{*} (Q) + θ (g)

, this relation occurs:

\frac{\partial Q}{\partial β} (- [Z_{1}, β], .) = \tilde{Θ} (Z_{1}, [β, .]) + 〈 Q, a d_{Z_{1}} ([β, .]) 〉 = {\tilde{Θ}}_{β} (Z_{1}, [β, .])

(50)

- \frac{\partial Q}{\partial β} ([Z_{1}, β], Z_{2}) = \tilde{Θ} (Z_{1}, [β, Z_{2}]) + 〈 Q, a d_{Z_{1}} ([β, Z_{2}]) 〉 = {\tilde{Θ}}_{β} (Z_{1}, [β, Z_{2}])

(51)

\Rightarrow - \frac{\partial Q}{\partial β} = g_{β} ([β, Z_{1}], [β, Z_{2}])

(52)

As the entropy is defined by the Legendre transform of the characteristic function, a dual metric of the Fisher metric is also given by the hessian of “geometric entropy”

S (Q)

with respect to the dual variable given by Q:

\frac{\partial^{2} S (Q)}{\partial Q^{2}}

.

For the maximum entropy density (Gibbs density), the following three terms coincide:

\frac{\partial^{2} \log ψ_{Ω}}{\partial β^{2}}

that describes the convexity of the log-likelihood function,

I (β) = - E [\frac{\partial^{2} \log p_{β} (ξ)}{\partial β^{2}}]

the Fisher metric that describes the covariance of the log-likelihood gradient, whereas

I (β) = E [(ξ - Q) {(ξ - Q)}^{T}] = V a r (ξ)

that describes the covariance of the observables. We can also observe that the Fisher metric

I (β) = - \frac{\partial Q}{\partial β}

is exactly the Souriau metric defined through symplectic cocycle:

I (β) = {\tilde{Θ}}_{β} (Z_{1}, [β, Z_{2}]) = g_{β} ([β, Z_{1}], [β, Z_{2}])

(53)

The Fisher metric

I (β) = - \frac{\partial^{2} Φ (β)}{\partial β^{2}} = - \frac{\partial Q}{\partial β}

has been considered by Souriau as a generalization of “heat capacity”. Souriau called it

K

the “geometric capacity”.
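The coincidence between the Hessian of log ψ and the covariance of the observables can be checked numerically. The sketch below assumes NumPy and replaces the integral over the manifold by a finite sum over a handful of points of R², so that every quantity is exact up to the finite-difference step. This finite sample space, the points themselves and the helper names are illustrative assumptions. Since log p_θ(ξ) = −〈θ, ξ〉 − log ψ(θ), the Fisher matrix −E[∂² log p_θ/∂θ²] is the same Hessian, so a single comparison covers the three terms.

```python
import numpy as np

# Finite "sample space" standing in for the integration domain (assumption).
POINTS = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 2.0], [2.0, 1.0]])

def log_psi(theta):
    """log psi(theta) with psi(theta) = sum_k exp(-<theta, xi_k>)."""
    return np.log(np.sum(np.exp(-POINTS @ theta)))

def gibbs_weights(theta):
    """Gibbs probabilities p_theta(xi_k) = exp(-<theta, xi_k>) / psi(theta)."""
    w = np.exp(-POINTS @ theta)
    return w / w.sum()

def mean(theta):
    """Q = E[xi] under the Gibbs weights."""
    return gibbs_weights(theta) @ POINTS

def covariance(theta):
    """Var(xi) = E[(xi - Q)(xi - Q)^T] under the Gibbs weights."""
    p = gibbs_weights(theta)
    centered = POINTS - mean(theta)
    return (centered * p[:, None]).T @ centered

def hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of a scalar function."""
    n = len(theta)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

theta0 = np.array([0.4, -0.3])
hess_log_psi = hessian(log_psi, theta0)
cov = covariance(theta0)
print(hess_log_psi)
print(cov)
```

The same construction gives Q = ∂Φ/∂θ = E[ξ], so differentiating once more yields −∂Q/∂θ = Var(ξ), the finite analogue of the “geometric capacity”.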

3.4. Covariant Souriau Gibbs Density and Information Manifold Foliation

In 1999, R.F. Streater studied Information Geometry for some Lie algebras: for certain unitary representations of a Lie algebra, he defined the statistical manifold of states as the convex cone on which the partition function is finite, making reference to the Bogoliubov-Kubo-Mori metric. However, Streater only developed the case of null cohomology, for the so (3) and sl (2,R) Lie algebras. Nevertheless, as observed by R.F. Streater in his paper “Information Geometry for some Lie algebras” [35], referring to Kirillov’s work and Roger Balian’s paper, “We can expect further natural structures to arise in this case. Indeed, it is known (*) that the dual to the Lie algebra, which parametrizes the state-space in this case, foliates into coadjoint orbits; there are also the level sets on the entropy; Kirillov form, and the BKM (Bogoliubov-Kubo-Mori) metric, together make each orbit into kähler space, along the lines proposed by Kostant. Motion along these holomorphic directions is nondissipative. The transversal to the orbits is a real half-line, which represents the dissipative direction…We study the case of sl (2,R) in the discrete series of representations. We show the information manifold foliates into level sets of the entropy, each being isometric to H, the Poincaré upper half-plane… The states of constant entropy are the hyperboloids and

β

is the dissipative coordinate… For an integrable system described by a Lie algebra in a traceable representation, we find that the information manifold foliates into complex spaces; the level sets of entropy can be given a complex structure by the method of Kostant. Motion remaining on the complex surfaces is nondissipative, whereas motion transversal to these surfaces is dissipative. In information geometry, the state is parametrized by the canonical coordinates. Which function of them is measured by a thermometer? In our models, it is reasonable to designate

1 / β

to be the temperature; it is a dissipative coordinate, and it increases with time, showing that the system is thermalizing”.

4. Mathematical Definition of Souriau Moment Map

Previously, we have introduced the concept of Souriau’s moment map. In this section, we give a mathematical definition of this tool, as defined in Souriau’s book [36], with modern notations [37,38,39,40,41]. Further details on the moment map are also given in Jean-Louis Koszul’s book [42].

4.1. Operations on Vector Fields

Consider a map

F : X \subset R^{M} \to Y \subset R^{N}

,

y = F (x)

, the derivative of

F

at

x \in X

,

D F : X \to R^{N \times M}

is given by:

(\begin{matrix} δ y^{1} \\ ⋮ \\ δ y^{N} \end{matrix}) = (\begin{matrix} \frac{\partial y^{1}}{\partial x^{1}} & \dots & \frac{\partial y^{1}}{\partial x^{M}} \\ ⋮ & ⋱ & ⋮ \\ \frac{\partial y^{N}}{\partial x^{1}} & \dots & \frac{\partial y^{N}}{\partial x^{M}} \end{matrix}) (\begin{matrix} δ x^{1} \\ ⋮ \\ δ x^{M} \end{matrix}) = D F (x) (δ x) = \underset{t \to 0}{L i m} \frac{F (x + t δ x) - F (x)}{t}

(54)

The second derivative is given by the linear map

D^{2} F : X \to R^{N \times M \times M}

:

δ [\frac{\partial y}{\partial x}] = \frac{\partial^{2} y}{\partial x^{2}} (δ x) = D^{2} F (x) (δ x)

(55)

Consider a vector field

V

on

X \subset R^{M}

defined by:

V : X \subset R^{M} \to R^{M}

. Operations on vector fields are given by the adjoint action and the Lie bracket:

A d_{F} V (y) = {\frac{d}{d t} [F \circ e^{t V} \circ F^{- 1}] (y) |}_{t = 0} = D F (x) (V (x)) with x = F^{- 1} (y)

(56)

[U, V] (x) = {\frac{d}{d s} A d_{e^{s U}} V (x) |}_{s = 0} = D U (x) (V (x)) - D V (x) (U (x))

(57)
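
As an illustration of Equations (56) and (57), the following minimal sketch checks the bracket formula on linear vector fields U(x) = Ax and V(x) = Bx, for which DU = A and DV = B, so that [U, V](x) = (AB - BA)x with the sign convention used here. The matrices `A`, `B`, the test point and all helper names are hypothetical choices made for this example; the derivative with respect to s is approximated by a central finite difference.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def expm(a, terms=30):
    """Matrix exponential by truncated power series (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def adjoint_field(A, B, s, y):
    """Ad_{e^{sU}} V (y) = e^{sA} B e^{-sA} y for U(x) = A x and V(x) = B x."""
    plus = [[s * x for x in row] for row in A]
    minus = [[-s * x for x in row] for row in A]
    return matvec(matmul(matmul(expm(plus), B), expm(minus)), y)

def bracket(A, B, x):
    """[U, V](x) = DU(x) V(x) - DV(x) U(x) = (A B - B A) x."""
    return [p - q for p, q in zip(matvec(matmul(A, B), x), matvec(matmul(B, A), x))]

A = [[0.0, 1.0], [0.0, 0.0]]   # hypothetical generator of U
B = [[1.0, 0.0], [0.0, -1.0]]  # hypothetical generator of V
x = [0.3, -1.2]
h = 1e-5
finite_diff = [(p - q) / (2 * h)
               for p, q in zip(adjoint_field(A, B, h, x), adjoint_field(A, B, -h, x))]
print(finite_diff, bracket(A, B, x))
```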

A 0-form is a scalar function, and 1-forms are row vectors

ω = (ω_{1} \dots ω_{M})

in the dual space. 2-forms can be regarded as antisymmetric matrices

(ω_{i j})

with

ω (u, v) = u^{t} (\begin{matrix} ω_{11} & \dots & ω_{1 M} \\ ⋮ & ⋱ & ⋮ \\ ω_{M 1} & \dots & ω_{M M} \end{matrix}) v

. M-forms are all scalar multiples of the standard volume form Vol, defined by

V o l (v_{1}, \dots, v_{M}) = \det (matrix with columns v_{1}, \dots, v_{M})

.

4.2. Derivative Rules by Sophus Lie, Elie Cartan and Henri Cartan

With the following classical definitions:

Pull back: $F^{*} ω$ is a p-form on $X$

$F^{*} ω (v_{1}, \dots, v_{p}) = ω_{F (x)} (D F (x) (v_{1}), \dots, D F (x) (v_{p}))$

(58)
Interior product: $i_{V} ω$ is the (p−1)-form on $X$ obtained by inserting $V (x)$ as the first argument of $ω$

$i_{V} ω (v_{2}, \dots v_{p}) = ω (V (x), v_{2}, \dots, v_{p})$

(59)
Exterior product: $θ \land ω$ is the (p + 1)-form on $X$ where $ω$ is a p-form and $θ$ is a 1-form on $X$ (where the hat indicates a term to be omitted):

$θ \land ω (v_{0}, \dots, v_{p}) = \sum_{i = 0}^{p} {(- 1)}^{i} θ (v_{i}) ω (v_{0}, \dots, {\hat{v}}_{i}, \dots, v_{p})$

(60)
Lie derivative: $L_{V} ω$ is a p-form on $X$ , and $L_{V} ω = 0$ if the flow of $V$ consists of symmetries of $ω$ :

$L_{V} ω (v_{1}, \dots, v_{p}) = {\frac{d}{d t} e^{t V *} ω (v_{1}, \dots, v_{p}) |}_{t = 0}$

(61)

Exterior derivative: $d ω$ is the (p+1)-form on $X$ defined by taking the ordinary derivative of $ω$ and then antisymmetrizing:

$d ω (v_{0}, \dots, v_{p}) = \sum_{i = 0}^{p} {(- 1)}^{i} \frac{\partial ω}{\partial x} (v_{i}) (v_{0}, \dots, {\hat{v}}_{i}, \dots, v_{p})$

(62)

$p = 0, {[d ω]}_{i} = \partial_{i} ω; p = 1, {[d ω]}_{i j} = \partial_{i} ω_{j} - \partial_{j} ω_{i}; p = 2, {[d ω]}_{i j k} = \partial_{i} ω_{j k} + \partial_{j} ω_{k i} + \partial_{k} ω_{i j}$

(63)

From these definitions, the properties of the exterior and Lie Derivative were established by Sophus Lie, Elie Cartan, and Henri Cartan:

L_{V} ω = d i_{V} ω + i_{V} d ω

(64)

(Elie Cartan equation)

i_{[U, V]} ω = i_{V} L_{U} ω - L_{U} i_{V} ω

(65)

(Henri Cartan equation)

L_{[U, V]} ω = L_{V} L_{U} ω - L_{U} L_{V} ω

(66)

(Sophus Lie equation)
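
A short worked example may help to see Elie Cartan’s equation (64) at work. The choice of the 1-form ω = x dy and of the vector field V = ∂/∂x on the plane is a hypothetical one made only for this illustration; the exterior derivative follows the convention of Equation (63) and the Lie derivative is computed from the flow as in Equation (61).

```latex
\begin{aligned}
ω &= x\, dy, \qquad V = \frac{\partial}{\partial x}, \qquad e^{tV}(x, y) = (x + t, y), \\
L_{V} ω &= \left.\frac{d}{dt} (x + t)\, dy \right|_{t = 0} = dy, \\
i_{V} ω &= ω(V) = 0 \;\Rightarrow\; d\, i_{V} ω = 0, \\
d ω &= dx \land dy \;\Rightarrow\; i_{V} d ω = dy, \\
d\, i_{V} ω + i_{V} d ω &= 0 + dy = L_{V} ω .
\end{aligned}
```

Both sides give dy, as Equation (64) predicts.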

4.3. Souriau Moment Map

Considering manifolds and Lie groups, we define the tangent bundle

T X

of

X

as the disjoint union of the

T_{x} X

, or the set of all pairs

(\begin{matrix} δ x \\ x \end{matrix})

with

x \in X

and

δ x \in T_{x} X

. If

F : X \to Y

is a smooth map between manifolds, its tangent map is the map:

F_{*} (\begin{matrix} δ x \\ x \end{matrix}) = (\begin{matrix} D F (x) (δ x) \\ F (x) \end{matrix})

(67)

A Lie group is a group

G

with a manifold structure such that the product

(g, h) \mapsto g h

and the inversion

g \mapsto g^{- 1}

are smooth maps from

G \times G

(resp. G) to

G

. Its Lie algebra is the tangent space

g = T_{e} G

at the identity element. A smooth action of

G

on a manifold

X

is a group morphism:

\begin{array}{l} Φ : G \to D i f f (X) \\ g \mapsto (x \mapsto g . x) \end{array}

(68)

The orbit of

x \in X

is

G (x) = {g . x : g \in G}

.

The tangent space to an orbit at

x

:

T_{x} G (x) = {Z (x) : Z \in g} = g / g_{x}

with

Z (x) = {\frac{d}{d t} e^{t Z} (x) |}_{t = 0}

and where

g_{x} = {Z \in g : Z (x) = 0}

(69)

Let

(M, σ)

be a connected symplectic manifold. A vector field

η

on

M

is called symplectic if its flow preserves the 2-form:

L_{η} σ = 0

. If we use Elie Cartan’s formula, we can deduce that

L_{η} σ = d i_{η} σ + i_{η} d σ = 0

but as

d σ = 0

then

d i_{η} σ = 0

. We observe that the 1-form

i_{η} σ

is closed. When this 1-form is exact, there is a smooth function

x \mapsto H

on

M

with:

i_{η} σ = - d H

(70)

This vector field

η

is called Hamiltonian and could be defined as symplectic gradient

η = \nabla_{S y m p} H

.

Consider a Lie group

G

that acts on

M

and preserves

σ

. A moment map exists if the infinitesimal generators of this action are Hamiltonian, so that a map

J : M \to g^{*}

exists with:

i_{Z_{X}} σ = - d H_{Z}

where

H_{Z} = 〈 J (x), Z 〉

(71)

The Poisson bracket of two functions

H

,

H^{'}

is defined by:

\begin{array}{l} {H, H^{'}} = σ (η, η^{'}) = σ (\nabla_{S y m p} H^{'}, \nabla_{S y m p} H) with i_{η} σ = - d H and \\ i_{η^{'}} σ = - d H^{'} \end{array}

(72)

If

G

is connected, then the moment map is G-equivariant if and only if it satisfies

{H_{Z}, H_{Z^{'}}} = H_{[Z, Z^{'}]}

.

Souriau has proved that every coadjoint orbit of a Lie group is a homogeneous symplectic manifold when endowed with the KKS 2-form

σ (Z (x), Z^{'} (x)) = 〈 x, [Z^{'}, Z] 〉

, and conversely, every homogeneous symplectic manifold of a connected Lie group G is, up to a possible covering, a coadjoint orbit of some central extension of G.

σ

is G-invariant.

5. Poincaré Unit Disk, SU(1,1) Lie Group and Souriau Moment Map

We will introduce the Souriau moment map for the SU(1,1)/K group that acts transitively on the Poincaré unit disk. More details on the computation of the moment map for the SU(1,1)/K Lie group are given in Appendix A of this document.

5.1. Poincaré Unit Disk and SU(1,1) Lie Group

The group of complex unimodular pseudo-unitary matrices

S U (1, 1)

, is the set of elements

u

such that [43,44,45,46,47,48,49,50,51,52]:

u M u^{+} = M with M = (\begin{matrix} + 1 & 0 \\ 0 & - 1 \end{matrix})

(73)

We can show that the most general matrix

u

belongs to the Lie group given by:

G = S U (1, 1) = {(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) / {| a |}^{2} - {| b |}^{2} = 1, a, b \in C}

(74)

Its Cartan decomposition is given by:

(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) = | a | (\begin{matrix} 1 & z \\ z^{*} & 1 \end{matrix}) (\begin{matrix} a / | a | & 0 \\ 0 & a^{*} / | a | \end{matrix}) with z = b {(a^{*})}^{- 1}, | a | = {(1 - {| z |}^{2})}^{- 1 / 2}

(75)

(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) (\begin{matrix} 1 & z \\ z^{*} & 1 \end{matrix}) = | a^{'} | (\begin{matrix} 1 & z^{'} \\ {z^{'}}^{*} & 1 \end{matrix}) (\begin{matrix} a^{'} / | a^{'} | & 0 \\ 0 & {a^{'}}^{*} / | a^{'} | \end{matrix}) with {\begin{cases} a^{'} = b z^{*} + a \\ z^{'} = \frac{a z + b}{b^{*} z + a^{*}} \end{cases}

(76)

S U (1, 1)

is associated to group of holomorphic automorphisms of the Poincaré unit disk

D = {z = x + i y \in C / | z | < 1}

in the complex plane, by considering its action on the disk as

g (z) = (a z + b) / (b^{*} z + a^{*})

. The following measure on Unit disk:

d μ_{0} (z, z^{*}) = \frac{1}{2 π i} \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

(77)

is invariant under the action of

S U (1, 1)

captured by the fractional holomorphic transformation:

\frac{d z^{'} \land d {z^{'}}^{*}}{{(1 - {| z^{'} |}^{2})}^{2}} = \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

(78)
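
The invariance stated in Equation (78) can be checked numerically. The sketch below is an illustration under stated assumptions: the parametrization a = cosh(τ)e^{iθ}, b = sinh(τ)e^{iφ} of SU(1,1) and all function names are hypothetical choices made for this example. It uses the fact that, since |a|² - |b|² = 1, the complex derivative of z ↦ g.z is 1/(b^{*}z + a^{*})², so the real Jacobian is its squared modulus.

```python
import cmath
import math

def su11_element(theta, tau, phi):
    """Return (a, b) with |a|^2 - |b|^2 = 1 (hypothetical angle parametrization)."""
    a = math.cosh(tau) * cmath.exp(1j * theta)
    b = math.sinh(tau) * cmath.exp(1j * phi)
    return a, b

def act(g, z):
    """Fractional linear action g.z = (a z + b) / (b* z + a*)."""
    a, b = g
    return (a * z + b) / (b.conjugate() * z + a.conjugate())

def density(z):
    """Density 1 / (1 - |z|^2)^2 of the measure in Equation (77), up to a constant."""
    return 1.0 / (1.0 - abs(z) ** 2) ** 2

def jacobian(g, z):
    """Real Jacobian |d(g.z)/dz|^2 of the holomorphic map z -> g.z."""
    a, b = g
    return 1.0 / abs(b.conjugate() * z + a.conjugate()) ** 4

g = su11_element(0.4, 0.9, -1.1)
z = 0.3 - 0.5j
w = act(g, z)
# Invariance: density at the image point times the Jacobian equals the density at z.
print(abs(w) < 1.0, density(w) * jacobian(g, z), density(z))
```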

The complex unit disk admits a Kähler structure determined by potential function:

Φ (z^{'}, z^{*}) = - \log (1 - z^{'} z^{*})

(79)

The invariant 2-form is:

Ω = \frac{1}{i} \frac{\partial^{2} Φ (z, z^{*})}{\partial z \partial z^{*}} d z \land d z^{*} = \frac{1}{i} \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

(80)

which is closed

d Ω = 0

. This group

S U (1, 1)

is isomorphic to the group

S L (2, R)

as a real Lie group, and the Lie algebra

g = 𝖘 u (1, 1)

is given by:

g = {(\begin{matrix} - i r & η \\ η^{*} & i r \end{matrix}) / r \in R, η \in C}

(81)

with the basis

(u_{1}, u_{2}, u_{3}) \in g

:

u_{1} = \frac{1}{2} (\begin{matrix} 0 & - i \\ i & 0 \end{matrix}), u_{2} = \frac{1}{2} (\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}), u_{3} = \frac{1}{2} (\begin{matrix} - i & 0 \\ 0 & i \end{matrix})

with the commutation relation:

[u_{3}, u_{2}] = u_{1}, [u_{3}, u_{1}] = - u_{2}, [u_{2}, u_{1}] = - u_{3}

(82)
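
The commutation relations (82) can be verified directly on the 2 × 2 matrices. The following minimal sketch uses plain Python tuples for the matrices; the helper names `mul`, `comm`, `scale` and `close` are hypothetical and exist only for this check.

```python
def mul(x, y):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def comm(x, y):
    """Matrix commutator [x, y] = x y - y x."""
    xy, yx = mul(x, y), mul(y, x)
    return tuple(tuple(xy[i][j] - yx[i][j] for j in range(2)) for i in range(2))

def scale(c, x):
    return tuple(tuple(c * e for e in row) for row in x)

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

# Basis (u1, u2, u3) as given in the text just before Equation (82).
u1 = scale(0.5, ((0, -1j), (1j, 0)))
u2 = scale(0.5, ((0, 1), (1, 0)))
u3 = scale(0.5, ((-1j, 0), (0, 1j)))

print(close(comm(u3, u2), u1),
      close(comm(u3, u1), scale(-1, u2)),
      close(comm(u2, u1), scale(-1, u3)))
```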

The dual basis of the dual space of the Lie algebra is denoted

(u_{1}^{*}, u_{2}^{*}, u_{3}^{*}) \in g^{*}

. The dual vector space

g^{*} = 𝖘 u^{*} (1, 1)

can be identified with the subspace of

𝖘 𝖑 (2, C)

of the form:

g^{*} = {(\begin{matrix} z & x + i y \\ - x + i y & - z \end{matrix}) = x (\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}) + y (\begin{matrix} 0 & i \\ i & 0 \end{matrix}) + z (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}) / x, y, z \in R}

(83)

Coadjoint action of

g \in G

on dual space of the Lie algebra

ξ \in g^{*}

is written

g . ξ

.

5.2. Coadjoint Orbit of SU(1,1) and Souriau Moment Map

We will use results of C. Cishahayo and S. de Bièvre [53] and B. Cahen [54,55] for computation of moment map of

S U (1, 1)

. Let

r \in R^{* +}

, orbit

O (r u_{3}^{*})

of

r u_{3}^{*}

for the coadjoint action of

g \in G

could be identified with the upper half sheet

x_{3} > 0

of

{ξ = x_{1} u_{1}^{*} + x_{2} u_{2}^{*} + x_{3} u_{3}^{*} / - x_{1}^{2} - x_{2}^{2} + x_{3}^{2} = r^{2}}

, the two-sheet hyperboloid. The stabilizer of

r u_{3}^{*}

for the coadjoint action of

G

is the torus

K = {(\begin{matrix} e^{i θ} & 0 \\ 0 & e^{- i θ} \end{matrix}), θ \in R}

. K induces rotations of the unit disk, and leaves 0 invariant. The stabilizer for the origin 0 of unit disk is maximal compact subgroup K of SU(1,1). We can observe [54] that

O (r u_{3}^{*}) = G / K

. On the other hand

O (r u_{3}^{*}) = G / K

is diffeomorphic to the unit disk

D = {z \in C / | z | < 1}

, then by composition, the Souriau moment map is given by:

\begin{array}{l} J : D \to O (r u_{3}^{*}) \\ z \mapsto J (z) = r (\frac{z + z^{*}}{(1 - {| z |}^{2})} u_{1}^{*} + \frac{z - z^{*}}{i (1 - {| z |}^{2})} u_{2}^{*} + \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} u_{3}^{*}) \end{array}

(84)
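
As a quick sanity check of Equation (84), the sketch below evaluates the three coordinates of J(z) in the dual basis and verifies that they satisfy -x_1² - x_2² + x_3² = r² with x_3 > 0. The function names and the sample values of `z` and `r` are hypothetical choices for this illustration.

```python
def moment_coords(z, r):
    """Coordinates (x1, x2, x3) of J(z) in the dual basis, following Equation (84)."""
    d = 1.0 - abs(z) ** 2
    x1 = r * (z + z.conjugate()).real / d
    x2 = r * ((z - z.conjugate()) / 1j).real / d
    x3 = r * (1.0 + abs(z) ** 2) / d
    return x1, x2, x3

def hyperboloid_form(x):
    """Quadratic form -x1^2 - x2^2 + x3^2 that defines the two-sheeted hyperboloid."""
    x1, x2, x3 = x
    return -x1 ** 2 - x2 ** 2 + x3 ** 2

z = 0.2 + 0.6j
r = 1.5
x = moment_coords(z, r)
print(x, hyperboloid_form(x), r ** 2)
```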

J

is linked to the natural action of

G

on

D

(by fractional linear transforms) but also the coadjoint action of

G

on

O (r u_{3}^{*}) = G / K

.

J^{- 1}

could be interpreted as the stereographic projection from the two-sphere

S^{2}

onto

C \cup \infty

[56]. In case

r = \frac{n}{2}

where

n \in N^{+}, n \geq 2

then the coadjoint orbit is given by

O_{n} = O (ξ_{n})

with

ξ_{n} = \frac{n}{2} u_{3}^{*} \in g^{*}

, with stabilizer of

ξ_{n}

for coadjoint action the torus

K = {(\begin{matrix} e^{i θ} & 0 \\ 0 & e^{- i θ} \end{matrix}), θ \in R}

with Lie algebra

R u_{3}

.

O_{n} = O (ξ_{n})

is associated with a holomorphic discrete series representation

π_{n}

of

G

by the KKS (Kirillov-Kostant-Souriau) method of orbits.

\begin{array}{l} J : D \to O_{n} \\ z \mapsto J (z) = \frac{n}{2} (\frac{z + z^{*}}{(1 - {| z |}^{2})} u_{1}^{*} + \frac{z - z^{*}}{i (1 - {| z |}^{2})} u_{2}^{*} + \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} u_{3}^{*}) \end{array}

(85)

The group

G

acts on

D

by homography

g . z = (\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) . z = \frac{a z + b}{b^{*} z + a^{*}}

. This action corresponds with coadjoint action of

G

on

O_{n}

. The Kirillov-Kostant-Souriau 2-form of

O_{n}

is given by:

Ω_{n} (ζ) (X (ζ), Y (ζ)) = 〈 ζ, [X, Y] 〉, X, Y \in g and ζ \in O_{n}

(86)

and is associated in the frame by

J

with:

ω_{n} = \frac{i n}{{(1 - {| z |}^{2})}^{2}} d z \land d z^{*}

(87)

with the corresponding Poisson Bracket:

{f, g} = i {(1 - {| z |}^{2})}^{2} (\frac{\partial f}{\partial z} \frac{\partial g}{\partial z^{*}} - \frac{\partial f}{\partial z^{*}} \frac{\partial g}{\partial z})

(88)

It has been also observed that there are 3 basic observables generating the

S U (1, 1)

symmetry on classical level:

{\begin{cases} D \to R \\ z \mapsto k_{3} (z) = \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{cases}, {\begin{cases} D \to R \\ z \mapsto k_{1} (z) = \frac{1}{i} \frac{z - z^{*}}{1 - {| z |}^{2}} \end{cases}, {\begin{cases} D \to R \\ z \mapsto k_{2} (z) = \frac{z + z^{*}}{1 - {| z |}^{2}} \end{cases}

(89)

with the Poisson commutation rule:

{k_{3}, k_{1}} = k_{2}, {k_{3}, k_{2}} = - k_{1}, {k_{1}, k_{2}} = - k_{3}

(90)

(k_{1}, k_{2}, k_{3})

vector points to the upper sheet of the two-sheeted hyperboloid in

R^{3}

given by

k_{3}^{2} - k_{1}^{2} - k_{2}^{2} = 1

, whose stereographic projection onto the open unit disk is:

{\begin{cases} (k_{1}, k_{2}, k_{3}) \in H^{+} \to D \\ z = \frac{k_{2} + i k_{1}}{1 + k_{3}} = \sqrt{\frac{k_{3} - 1}{k_{3} + 1}} e^{i \arg z} \end{cases}

(91)
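
The relation between Equations (89) and (91) can be illustrated by a round trip: map a point z of the disk to (k_1, k_2, k_3) and project it back. This is a minimal sketch; the function names and the sample point are hypothetical.

```python
import math

def observables(z):
    """The three observables (k1, k2, k3) of Equation (89)."""
    d = 1.0 - abs(z) ** 2
    k1 = ((z - z.conjugate()) / 1j).real / d
    k2 = (z + z.conjugate()).real / d
    k3 = (1.0 + abs(z) ** 2) / d
    return k1, k2, k3

def project(k):
    """Stereographic projection of Equation (91): z = (k2 + i k1) / (1 + k3)."""
    k1, k2, k3 = k
    return complex(k2, k1) / (1.0 + k3)

z = -0.35 + 0.4j
k1, k2, k3 = observables(z)
z_back = project((k1, k2, k3))
modulus = math.sqrt((k3 - 1.0) / (k3 + 1.0))
print(z_back, modulus, k3 ** 2 - k1 ** 2 - k2 ** 2)
```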

Under the action of

g \in G = S U (1, 1) = {(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) / {| a |}^{2} - {| b |}^{2} = 1, a, b \in C}

:

(\begin{matrix} k_{-} & k_{3} \\ k_{3} & k_{+} \end{matrix}) = (\begin{matrix} k_{2} + i k_{1} & k_{3} \\ k_{3} & k_{2} - i k_{1} \end{matrix}) = \frac{1}{1 - {| z |}^{2}} (\begin{matrix} 2 z & 1 + {| z |}^{2} \\ 1 + {| z |}^{2} & 2 z^{*} \end{matrix})

is transformed into:

(\begin{matrix} k_{-}^{'} & k_{3}^{'} \\ k_{3}^{'} & k_{+}^{'} \end{matrix}) = (\begin{matrix} k_{-} (g^{- 1} . z) & k_{3} (g^{- 1} . z) \\ k_{3} (g^{- 1} . z) & k_{+} (g^{- 1} . z) \end{matrix}) = g^{- 1} (\begin{matrix} k_{-} & k_{3} \\ k_{3} & k_{+} \end{matrix}) {(g^{- 1})}^{t}

(92)
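
Equation (92) can be checked numerically by comparing both sides for a sample group element. The sketch below is an illustration, assuming a group element stored as the pair (a, b) with |a|² - |b|² = 1, whose inverse is then the pair (a^{*}, -b); the helper names and the sample values are hypothetical.

```python
import cmath
import math

def as_matrix(g):
    a, b = g
    return ((a, b), (b.conjugate(), a.conjugate()))

def inverse(g):
    """Inverse of (a, b; b*, a*) with unit determinant, stored again as a pair."""
    a, b = g
    return (a.conjugate(), -b)

def act(g, z):
    a, b = g
    return (a * z + b) / (b.conjugate() * z + a.conjugate())

def mul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(x):
    return tuple(tuple(x[j][i] for j in range(2)) for i in range(2))

def k_matrix(z):
    """Matrix (k_-, k3; k3, k_+) written in terms of z, as in the text."""
    d = 1.0 - abs(z) ** 2
    return ((2 * z / d, (1 + abs(z) ** 2) / d),
            ((1 + abs(z) ** 2) / d, 2 * z.conjugate() / d))

g = (math.cosh(0.7) * cmath.exp(0.3j), math.sinh(0.7) * cmath.exp(-1.2j))
z = 0.25 + 0.4j
h = inverse(g)
lhs = k_matrix(act(h, z))                                   # k evaluated at g^{-1}.z
rhs = mul(mul(as_matrix(h), k_matrix(z)), transpose(as_matrix(h)))
err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(err)
```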

This transform can be viewed as the co-adjoint action of

S U (1, 1)

on the coadjoint orbit identified with

k_{3}^{2} - k_{1}^{2} - k_{2}^{2} = 1

. We can also observe that the quotient

S U (1, 1) / K

is isomorphic to the upper sheet of the hyperboloid described by

k_{3}^{2} - k_{1}^{2} - k_{2}^{2} = 1

, by the following parametrization

(τ, φ)

, given by

\vec{n} = (\cosh τ, \sinh τ \cos φ, \sinh τ \sin φ)

, and its stereographic projection onto the inside of the unit disk, parametrized by

ς = \tanh \frac{τ}{2} e^{- i φ}

.

6. Covariant Gibbs Density by Souriau Thermodynamics for Poincaré Unit Disk

6.1. Fourier Transform, Laplace Transform and Lie Group Representation Theory

In Souriau Lie Group Thermodynamics, we have to consider the Laplace transform defined on coadjoint orbits in order to define the Massieu potential function and the Gibbs density. This problem has been solved in the domain of Kirillov representation theory. Representation theory studies abstract algebraic structures by representing their elements as linear transformations of vector spaces, and studies algebraic objects (Lie groups, Lie algebras) by describing their elements by matrices and the algebraic operations in terms of matrix addition and matrix multiplication, reducing problems of abstract algebra to problems in linear algebra. Representation theory generalizes Fourier analysis via harmonic analysis. The modern development of Fourier analysis during the 20th century has explored the generalization of the Fourier and Fourier-Plancherel formulas for non-commutative harmonic analysis, applied to locally compact non-Abelian groups. This has been solved by geometric approaches based on the “orbit method” (the Fourier-Plancherel formula for G is given by the coadjoint representation of G in the dual vector space of its Lie algebra), with many contributors (Dixmier, Kirillov, Bernat, Arnold, Berezin, Kostant, Souriau, Duflo, Guichardet, Torasso, Vergne, Paradan, etc.) [57,58,59,60,61,62,63,64,65,66,67,68].

For classical commutative harmonic analysis, we consider the following groups:

\begin{matrix} G = Τ^{n} = R^{n} / Z^{n} for Fourier series, G = R^{n} for Fourier Transform \\ G group character (linked to e^{i k x}) : χ : G \to U with U = {z \in C / | z | = 1} \\ \hat{G} = {χ / χ_{1} . χ_{2} (g) = χ_{1} (g) χ_{2} (g)} and Fourier transform is given by : \\ φ : G \to C \hat{φ} : \hat{G} \to C \\ g \mapsto φ (g) = \int_{\hat{G}} \hat{φ} (χ) χ {(g)}^{- 1} d χ χ \mapsto \hat{φ} (χ) = \int_{G} φ (g) χ (g) d g \end{matrix}

(93)

For non-commutative harmonic analysis, a unitary irreducible representation of the group is

U : G \to U (H)

with H a Hilbert space, and its character is given by

χ_{U} (g) = t r U_{g}

. The Fourier transform for a non-commutative group is

U_{φ} = \int_{G} φ (g) U_{g} d g

with character

χ_{U} (φ) = t r U_{φ}

. If we describe group element with exponential map

U_{ψ} = \int_{g} ψ (X) U_{\exp (X)} d X

, we have:

\begin{array}{l} {trU}_{ψ} = \dim τ . μ_{G . f} (\overset{\land}{ψ . j^{- 1}}) \\ \overset{\land}{ψ . j^{- 1}} : g \to g^{*}, Four . Transf . \end{array} with {\begin{cases} μ_{G . f} : Liouville meas . on O = G . f, f \in g^{*} \\ μ_{G . f} (\overset{\land}{ψ . j^{- 1}}) : Integral of \overset{\land}{ψ . j^{- 1}} wrt μ_{G . f} \end{cases}

(94)

where

j (X) = {(\det s (a d_{X}))}^{1 / 2} with s (x) = \sum_{n = 0}^{\infty} \frac{1}{(2 n + 1)!} {(\frac{x}{2})}^{2 n} = s h (\frac{x}{2}) / (\frac{x}{2})

(95)
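
The scalar identity in Equation (95) between the power series and the closed form sinh(x/2)/(x/2) can be checked numerically. This is a minimal sketch on real numbers only (the full definition applies s to the operator ad_X); the function names and the truncation order are hypothetical choices.

```python
import math

def s_series(x, terms=20):
    """Truncated series sum_n (x/2)^(2n) / (2n+1)! from Equation (95)."""
    return sum((x / 2.0) ** (2 * n) / math.factorial(2 * n + 1) for n in range(terms))

def s_closed(x):
    """Closed form sinh(x/2) / (x/2), extended by its limit 1 at x = 0."""
    if x == 0:
        return 1.0
    return math.sinh(x / 2.0) / (x / 2.0)

for x in (0.0, 0.3, 2.0, -1.5):
    print(x, s_series(x), s_closed(x))
```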

Kirillov Character formula is:

χ_{U} (\exp (X)) = {trU}_{\exp (X)} = j {(X)}^{- 1} \int_{O} e^{i 〈 f, X 〉} d μ_{O} (f)

(96)

\int_{O} e^{i 〈 f, X 〉} d μ_{O} (f) = j (X) {trU}_{\exp (X)} with j (X) = {(\det (\frac{e^{a d_{X} / 2} - e^{- a d_{X} / 2}}{a d_{X}}))}^{1 / 2}

(97)

We will use Kirillov representation theory and his character formula to compute Souriau covariant Gibbs density in the unit Poincaré disk. For any Lie group

G

, a coadjoint orbit

O \subset g^{*}

has a canonical symplectic form

ω_{0}

given by KKS 2-form. As seen, if

G

is finite dimensional, the corresponding volume element defines a

G

-invariant measure supported on

O

, which can be interpreted as a tempered distribution. The Fourier transform (where d is the half of the dimension of the orbit O):

ℑ (x) = \int_{O \subset g^{*}} e^{- i 〈 λ, x 〉} \frac{1}{d!} d ω_{O^{d}} with λ \in g^{*} and x \in g

(98)

is Ad

G

-invariant. When

O \subset g^{*}

is an integral coadjoint orbit, Kirillov formula, given previously, expresses Fourier transform

ℑ (x)

by Kirillov character

χ_{O}

:

ℑ (x) = j (x) χ_{O} (e^{x}) where j (x) = \det^{1 / 2} (\frac{\sinh (a d (x / 2))}{a d (x / 2)})

(99)

χ_{O}

is, as defined previously, the “Kirillov character” of a unitary representation associated to the orbit.

6.2. Souriau Covariant Gibbs Density in Poincaré Unit Disk for SU(1,1) Lie Group

In the following, we give the full development needed to compute the Souriau covariant Gibbs density. As observed by Souriau, the Gibbs density is not defined for all geometric temperatures, so we follow his approach and consider a one-parameter subgroup of the Lie group, generated through the exponential map by a single element of the Lie algebra given by the geometric temperature. The subset of the Lie algebra on which the Gibbs density is defined is deduced from the constraints related to the generation of this one-parameter subgroup.

Considering the Lie group

S U (1, 1) = {(\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) / a, b \in C, {| a |}^{2} - {| b |}^{2} = 1}

and its Lie algebra given by elements

s u (1, 1) = {(\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) / r \in R, η \in C}

. A basis for this Lie algebra

s u (1, 1)

is

(u_{1}, u_{2}, u_{3}) \in g

with

u_{1} = \frac{i}{2} (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}), u_{2} = - \frac{1}{2} (\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}) and u_{3} = \frac{1}{2} (\begin{matrix} 0 & - i \\ i & 0 \end{matrix})

with

[u_{1}, u_{3}] = - u_{2}, [u_{1}, u_{2}] = u_{3}, [u_{2}, u_{3}] = - u_{1}

.

The compact subgroup is generated by

u_{1}

, while

u_{2}

and

u_{3}

generate a hyperbolic subgroup. The dual space of the Lie algebra is given by

s u {(1, 1)}^{*} = {(\begin{matrix} z & x + i y \\ - x + i y & - z \end{matrix}) / x, y, z \in R}

with the basis

(u_{1}^{*}, u_{2}^{*}, u_{3}^{*}) \in g^{*}

with

u_{1}^{*} = (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}), u_{2}^{*} = (\begin{matrix} 0 & i \\ i & 0 \end{matrix}) and u_{3}^{*} = (\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix})

.

Let

D = {z \in C / | z | < 1}

be the open unit disk of Poincaré. For each

ρ > 0

, the pair

(D, ω_{ρ})

is a symplectic homogeneous manifold with

ω_{ρ} = 2 i ρ \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

, where

ω_{ρ}

is invariant under the action:

\begin{array}{l} S U (1, 1) \times D \to D \\ (g, z) \mapsto g . z = \frac{a z + b}{b^{*} z + a^{*}} \end{array}

.

This action is transitive and is globally and strongly Hamiltonian. Its generators are the hamiltonian vector fields associated to the functions:

J_{1} (z, z^{*}) = ρ \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}}, J_{2} (z, z^{*}) = \frac{ρ}{i} \frac{z - z^{*}}{1 - {| z |}^{2}}, J_{3} (z, z^{*}) = - ρ \frac{z + z^{*}}{1 - {| z |}^{2}}

(100)

The associated moment map

J : D \to s u^{*} (1, 1)

defined by

J (z) . u_{i} = J_{i} (z, z^{*})

, maps

D

into a coadjoint orbit in

s u^{*} (1, 1)

. Then, we can write the moment map as a matrix element of

s u^{*} (1, 1)

:

\begin{array}{l} J (z) = J_{1} (z, z^{*}) u_{1}^{*} + J_{2} (z, z^{*}) u_{2}^{*} + J_{3} (z, z^{*}) u_{3}^{*} = (\begin{matrix} ρ \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} & ρ \frac{z - z^{*}}{1 - {| z |}^{2}} - ρ \frac{z + z^{*}}{1 - {| z |}^{2}} \\ ρ \frac{z - z^{*}}{1 - {| z |}^{2}} + ρ \frac{z + z^{*}}{1 - {| z |}^{2}} & - ρ \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{matrix}) \\ J (z) = J_{1} (z, z^{*}) u_{1}^{*} + J_{2} (z, z^{*}) u_{2}^{*} + J_{3} (z, z^{*}) u_{3}^{*} = ρ (\begin{matrix} \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} & - 2 \frac{z^{*}}{1 - {| z |}^{2}} \\ 2 \frac{z}{1 - {| z |}^{2}} & - \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{matrix}) \in g^{*} \end{array}

(101)
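
Equation (101) can be cross-checked by assembling J(z) from the components (100) and the dual basis given above, and comparing with the closed-form matrix. The sketch below also verifies the hyperboloid relation J_1² - J_2² - J_3² = ρ². Function names and sample values are hypothetical.

```python
# Dual basis (u1*, u2*, u3*) as given in the text.
U1S = ((1, 0), (0, -1))
U2S = ((0, 1j), (1j, 0))
U3S = ((0, 1), (-1, 0))

def components(z, rho):
    """Hamiltonian functions (J1, J2, J3) of Equation (100)."""
    d = 1.0 - abs(z) ** 2
    j1 = rho * (1.0 + abs(z) ** 2) / d
    j2 = rho * ((z - z.conjugate()) / 1j).real / d
    j3 = -rho * (z + z.conjugate()).real / d
    return j1, j2, j3

def moment_matrix(z, rho):
    """J(z) = J1 u1* + J2 u2* + J3 u3* assembled entry by entry."""
    j1, j2, j3 = components(z, rho)
    return tuple(tuple(j1 * U1S[i][j] + j2 * U2S[i][j] + j3 * U3S[i][j]
                       for j in range(2)) for i in range(2))

def closed_form(z, rho):
    """Second line of Equation (101)."""
    d = 1.0 - abs(z) ** 2
    diag = rho * (1.0 + abs(z) ** 2) / d
    return ((diag, -2 * rho * z.conjugate() / d), (2 * rho * z / d, -diag))

z, rho = 0.1 - 0.45j, 2.0
m, c = moment_matrix(z, rho), closed_form(z, rho)
err = max(abs(m[i][j] - c[i][j]) for i in range(2) for j in range(2))
j1, j2, j3 = components(z, rho)
print(err, j1 ** 2 - j2 ** 2 - j3 ** 2)
```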

The moment map

J

is a diffeomorphism of

D

onto one sheet of the two-sheeted hyperboloid in

s u^{*} (1, 1)

, determined by

J_{1}^{2} - J_{2}^{2} - J_{3}^{2} = ρ^{2}, J_{1} \geq ρ with J_{1} u_{1}^{*} + J_{2} u_{2}^{*} + J_{3} u_{3}^{*} \in s u^{*} (1, 1)

. We note

O_{ρ}^{+}

the coadjoint orbit

A d_{S U (1, 1)}^{*}

of

S U (1, 1)

, given by the upper sheet of the two-sheeted hyperboloid given by previous equation. The orbit method of Kostant-Kirillov-Souriau associates to each of these coadjoint orbits a representation of the discrete series of

S U (1, 1)

, provided that

ρ

is a half-integer greater than or equal to 1 (

ρ = \frac{k}{2}, k \in N and ρ \geq 1

). When explicitly executing the Kostant-Kirillov construction, the representation Hilbert spaces

H_{ρ}

are realized as closed reproducing kernel subspaces of

L^{2} (D, ω_{ρ})

. The Kostant-Kirillov-Souriau orbit method shows that to each coadjoint orbit of a connected Lie group is associated a unitary irreducible representation of G acting in a Hilbert space H.

Souriau has observed that the action of the full Galilean group on the space of motions of an isolated mechanical system is not related to any equilibrium Gibbs state (the open subset of the Lie algebra associated with this Gibbs state is empty). Souriau’s main idea was to define the Gibbs states for one-parameter subgroups of the Galilean group. We will use the same approach: we consider the action of the Lie group

S U (1, 1)

on the symplectic manifold (M, ω) (the Poincaré unit disk) and its momentum map

J

, and we require that the open subset

Λ_{β} = {β \in g / \int_{D} e^{- 〈 J (z), β 〉} d λ (z) < + \infty}

is not empty. This condition is not always satisfied when (M, ω) is a cotangent bundle, but of course it is satisfied when it is a compact manifold. The idea of Souriau is to consider a one-parameter subgroup of

S U (1, 1)

. A natural way to parametrize the elements of

S U (1, 1)

is through its Lie algebra. In the neighborhood of the identity element, the elements

g \in S U (1, 1)

can be written as the exponential of an element

β

of its Lie algebra:

g = \exp (ε β) with β \in g

(102)

The condition

g^{+} M g = M for M = (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix})

can be expanded for

ε < < 1

and is equivalent to

β^{+} M + M β = 0

which then implies

β = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}), r \in R, η \in C

. We can observe that

r

and

η = η_{R} + i η_{I}

contain 3 degrees of freedom, as required. Also because

\det g = 1

, we get

T r (β) = 0

. We can then exponentiate

β

with exponential map to get:

g = \exp (ε β) = \sum_{k = 0}^{\infty} \frac{{(ε β)}^{k}}{k!} = (\begin{matrix} a_{ε} (β) & b_{ε} (β) \\ b_{ε}^{*} (β) & a_{ε}^{*} (β) \end{matrix})

(103)

If we make the remark that

β^{2} = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) = ({| η |}^{2} - r^{2}) I

, we can expand the exponential map:

g = \exp (ε β) = (\begin{matrix} \cosh (ε R) + i r \frac{\sinh (ε R)}{R} & η \frac{\sinh (ε R)}{R} \\ η^{*} \frac{\sinh (ε R)}{R} & \cosh (ε R) - i r \frac{\sinh (ε R)}{R} \end{matrix}) with R^{2} = {| η |}^{2} - r^{2}

(104)
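
The closed form (104) can be compared with a truncated version of the series (103). The following minimal sketch assumes |η|² - r² > 0 so that R is real; the sample values of `r`, `eta` and `eps` and the helper names are hypothetical. It also checks that the result has the form (a, b; b^{*}, a^{*}) with |a|² - |b|² = 1.

```python
import math

def mul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def exp_series(beta, eps, terms=40):
    """Truncated power series sum_k (eps * beta)^k / k! for a 2x2 matrix beta."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    term = ((1.0 + 0j, 0j), (0j, 1.0 + 0j))
    for k in range(1, terms):
        prod = mul(term, beta)
        term = tuple(tuple(eps * e / k for e in row) for row in prod)
        for i in range(2):
            for j in range(2):
                result[i][j] += term[i][j]
    return result

def exp_closed(r, eta, eps):
    """Closed form of Equation (104); requires |eta|^2 - r^2 > 0."""
    big_r = math.sqrt(abs(eta) ** 2 - r ** 2)
    c = math.cosh(eps * big_r)
    s = math.sinh(eps * big_r) / big_r
    return ((c + 1j * r * s, eta * s), (eta.conjugate() * s, c - 1j * r * s))

r, eta, eps = 0.4, 0.9 - 0.5j, 0.8
beta = ((1j * r, eta), (eta.conjugate(), -1j * r))
g_series, g_closed = exp_series(beta, eps), exp_closed(r, eta, eps)
err = max(abs(g_series[i][j] - g_closed[i][j]) for i in range(2) for j in range(2))
a, b = g_closed[0]
print(err, abs(a) ** 2 - abs(b) ** 2)
```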

We can observe that one condition is that

{| η |}^{2} - r^{2} > 0

, so the subset to consider is

Λ_{β} = {β = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}), r \in R, η \in C / {| η |}^{2} - r^{2} > 0}

such that

\int_{D} e^{- 〈 J (z), β 〉} d λ (z) < + \infty

. The generalized Gibbs states of the full

S U (1, 1)

group do not exist. However, generalized Gibbs states for the one-parameter subgroups

\exp (α β)

,

β \in Λ_{β}

, of the

S U (1, 1)

group do exist. The generalized Gibbs state associated to

β

remains invariant under the restriction of the action to the one-parameter subgroup of

S U (1, 1)

generated by

\exp (ε β)

.

To go further, we will develop the Souriau Gibbs density from the Souriau moment map

J (z)

and the Souriau temperature

β \in Λ_{β}

. If we note

b = \frac{1}{\sqrt{1 - {| z |}^{2}}} [\begin{matrix} 1 \\ - z \end{matrix}]

, we can write the moment map:

J (z) = ρ (\begin{matrix} \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} & - 2 \frac{z^{*}}{1 - {| z |}^{2}} \\ 2 \frac{z}{1 - {| z |}^{2}} & - \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{matrix}) = ρ (2 M b b^{+} - T r (M b b^{+}) I) with M = [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}]

(105)
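
The factorization in Equation (105) can be checked numerically. The sketch below is an illustration under one explicit assumption: the vector b is normalized as (1 - |z|²)^{-1/2} (1, -z)^{T}, which is the normalization for which 2 M b b^{+} - Tr(M b b^{+}) I reproduces the matrix on the left-hand side entry by entry. Function names and sample values are hypothetical.

```python
import math

def moment_from_b(z, rho):
    """rho * (2 M b b^+ - Tr(M b b^+) I), with b = (1, -z)^T / sqrt(1 - |z|^2) (assumed)."""
    n = 1.0 / math.sqrt(1.0 - abs(z) ** 2)
    b = (complex(n), -n * z)
    bb = tuple(tuple(b[i] * b[j].conjugate() for j in range(2)) for i in range(2))
    mbb = (bb[0], tuple(-e for e in bb[1]))   # M = diag(1, -1) negates the second row
    trace = mbb[0][0] + mbb[1][1]
    return tuple(tuple(rho * (2 * mbb[i][j] - (trace if i == j else 0))
                       for j in range(2)) for i in range(2))

def moment_closed(z, rho):
    """Matrix form of J(z) on the left-hand side of Equation (105)."""
    d = 1.0 - abs(z) ** 2
    diag = rho * (1.0 + abs(z) ** 2) / d
    return ((diag, -2 * rho * z.conjugate() / d), (2 * rho * z / d, -diag))

z, rho = 0.3 + 0.2j, 1.0
m, c = moment_from_b(z, rho), moment_closed(z, rho)
err = max(abs(m[i][j] - c[i][j]) for i in range(2) for j in range(2))
print(err)
```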

We can then write the covariant Gibbs density in the unit disk given by moment map of the Lie group

S U (1, 1)

and geometric temperature in its Lie algebra

β \in Λ_{β}

:

p_{G i b b s} (z) = \frac{e^{- 〈 J (z), β 〉}}{\int_{D} e^{- 〈 J (z), β 〉} d λ (z)} where d λ (z) = 2 i ρ \frac{d z \land d z^{*}}{{(1 - {| z |}^{2})}^{2}}

(106)

p_{G i b b s} (z) = \frac{e^{- 〈 ρ (2 M b b^{+} - T r (M b b^{+}) I), β 〉}}{\int_{D} e^{- 〈 J (z), β 〉} d λ (z)} = \frac{e^{- 〈 ρ (\begin{matrix} \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} & \frac{- 2 z^{*}}{(1 - {| z |}^{2})} \\ \frac{2 z}{(1 - {| z |}^{2})} & - \frac{1 + {| z |}^{2}}{(1 - {| z |}^{2})} \end{matrix}), (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) 〉}}{\int_{D} e^{- 〈 J (z), β 〉} d λ (z)}

(107)

To write the Gibbs density with respect to its statistical moments, we have to express the density with respect to

Q = E [J (z)]

. Then, we have to invert the relation between

Q

and

β

, to replace this last variable

β = (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) \in Λ_{β}

by

β = Θ^{- 1} (Q) \in g

where

Q = \frac{\partial Φ (β)}{\partial β} = Θ (β) \in g^{*}

with

Φ (β) = - \log \int_{D} e^{- 〈 J (z), β 〉} d λ (z)

, deduced from the Legendre transform. The mean moment map is given by:

Q = E [J (z)] = E [ρ (\begin{matrix} \frac{1 + {| w |}^{2}}{(1 - {| w |}^{2})} & \frac{- 2 w^{*}}{(1 - {| w |}^{2})} \\ \frac{2 w}{(1 - {| w |}^{2})} & - \frac{1 + {| w |}^{2}}{(1 - {| w |}^{2})} \end{matrix})] where w \in D

(108)

This mean moment map can be obtained by Karcher mean computation on the sheet of the hyperboloid corresponding to the coadjoint orbit. For the dual pairing, we can observe that

J (z) = J_{1} (z, z^{*}) u_{1}^{*} + J_{2} (z, z^{*}) u_{2}^{*} + J_{3} (z, z^{*}) u_{3}^{*} \in g^{*}

with

J_{1} (z, z^{*}) = ρ \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}}, J_{2} (z, z^{*}) = \frac{ρ}{i} \frac{z - z^{*}}{1 - {| z |}^{2}}, J_{3} (z, z^{*}) = - ρ \frac{z + z^{*}}{1 - {| z |}^{2}}

and

β = β_{1} u_{1} + β_{2} u_{2} + β_{3} u_{3} \in g

with

β = 2 (r, - η_{R}, - η_{I}), η = η_{R} + i η_{I}

.
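
The coordinates of β given above can be checked against its matrix form, and the dual pairing can then be evaluated in coordinates. The following minimal sketch assumes, as stated earlier through J(z).u_i = J_i, that the pairing reduces to 〈J(z), β〉 = J_1 β_1 + J_2 β_2 + J_3 β_3; the function names and the sample values are hypothetical.

```python
# Basis (u1, u2, u3) of su(1,1) as given in Section 6.2.
U1 = ((0.5j, 0), (0, -0.5j))
U2 = ((0, -0.5), (-0.5, 0))
U3 = ((0, -0.5j), (0.5j, 0))

def beta_coords(r, eta):
    """Coordinates (beta1, beta2, beta3) = 2 (r, -Re(eta), -Im(eta))."""
    return 2.0 * r, -2.0 * eta.real, -2.0 * eta.imag

def beta_matrix(r, eta):
    """beta1 u1 + beta2 u2 + beta3 u3, expected to equal (i r, eta; eta*, -i r)."""
    b1, b2, b3 = beta_coords(r, eta)
    return tuple(tuple(b1 * U1[i][j] + b2 * U2[i][j] + b3 * U3[i][j]
                       for j in range(2)) for i in range(2))

def pairing(z, rho, r, eta):
    """Dual pairing <J(z), beta> in coordinates (assumes J(z).u_i = J_i)."""
    d = 1.0 - abs(z) ** 2
    j1 = rho * (1.0 + abs(z) ** 2) / d
    j2 = rho * ((z - z.conjugate()) / 1j).real / d
    j3 = -rho * (z + z.conjugate()).real / d
    b1, b2, b3 = beta_coords(r, eta)
    return j1 * b1 + j2 * b2 + j3 * b3

r, eta = 0.4, 0.9 - 0.5j
m = beta_matrix(r, eta)
expected = ((1j * r, eta), (eta.conjugate(), -1j * r))
err = max(abs(m[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(err, pairing(0j, 1.5, r, eta))
```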

The integral of normalization in Gibbs density could be computed through Kirillov character formula by

χ_{m} (\exp ((\begin{matrix} x & . \\ . & - x \end{matrix}))) = j {(x)}^{- 1} \int_{O_{m - 1}^{+}} e^{- i 〈 (\begin{matrix} x & . \\ . & - x \end{matrix}), (\begin{matrix} i r & η \\ η^{*} & - i r \end{matrix}) 〉} ω_{O_{m - 1}^{+}}

where

j (x) = \det^{1 / 2} [\sinh (a d (\begin{matrix} x / 2 \\ - x / 2 \end{matrix})) / a d (\begin{matrix} x / 2 \\ - x / 2 \end{matrix})] = \frac{\sinh (x)}{x}

with following relation

\frac{e^{m x}}{1 - e^{2 x}} j (x) = \int_{D} e^{(m - 1) x \frac{1 + {| w |}^{2}}{1 - {| w |}^{2}}} \frac{1}{{(1 - {| w |}^{2})}^{2}} d w \land d w^{*}

.

Recently, Enrico De Micheli [69] has introduced a Laplace-type transform (the so-called Spherical Laplace Transform) with a connection to the Non-Euclidean Fourier Transform in the sense of Helgason, and the principal series of the unitary representation of SU(1,1).

6.3. Extension to SU (p,q) Unitary Group for Siegel Unit Disk

More details are given in Appendix B, on the parameterization of SU(1,1) and its extension to SU (p,q). To address the computation of the covariant Gibbs density for the Siegel Unit Disk, we will consider in this section the

S U (p, q)

Unitary Group:

G = S U (p, q) and K = S (U (p) \times U (q)) = {(\begin{matrix} A & 0 \\ 0 & D \end{matrix}) / A \in U (p), D \in U (q), \det (A) \det (D) = 1}

(109)

We can use the following decomposition for

g \in G^{C}

:

g = (\begin{matrix} A & B \\ C & D \end{matrix}) \in G^{C}, g = (\begin{matrix} I_{p} & B D^{- 1} \\ 0 & I_{q} \end{matrix}) (\begin{matrix} A - B D^{- 1} C & 0 \\ 0 & D \end{matrix}) (\begin{matrix} I_{p} & 0 \\ D^{- 1} C & I_{q} \end{matrix})

(110)

and consider the action of

g \in G^{C}

on Siegel Unit Disk

S D = {Z \in M_{p q} (C) / I_{p} - Z Z^{+} > 0}

given by:

g . Z = (A Z + B) {(C Z + D)}^{- 1}

(111)

Benjamin Cahen has studied this case and introduced the moment map by identifying G-equivariantly

g^{*}

with

g

by means of the Killing form

β

on

g^{C}

:

g^{*} G - equivariant with g by Killing form β (X, Y) = 2 (p + q) T r (X Y)

The set of all elements of

g

fixed by

K

is

𝖍

:

\begin{array}{l} 𝖍 = {element of G fixed by K}, ξ_{0} \in 𝖍, ξ_{0} = i λ (\begin{matrix} - q I_{p} & 0 \\ 0 & p I_{q} \end{matrix}) \\ \Rightarrow 〈 ξ_{0}, [Z, Z^{+}] 〉 = - 2 i λ {(p + q)}^{2} T r (Z Z^{+}), \forall Z \in D \end{array}

(112)

Then, the equivariant moment map is given by:

\begin{array}{l} \forall X \in g^{C}, Z \in D, ψ (Z) = A d^{*} (\exp (- Z^{+}) ζ (\exp Z^{+} \exp Z)) ξ_{0} \\ \forall g \in G, Z \in D then ψ (g . Z) = A d_{g}^{*} ψ (Z) \\ ψ is a diffeomorphism from S D onto orbit O (ξ_{0}) \end{array}

(113)

with:

ψ (Z) = i λ (\begin{matrix} {(I_{p} - Z Z^{+})}^{- 1} (- p Z Z^{+} - q I_{p}) & (p + q) Z {(I_{q} - Z^{+} Z)}^{- 1} \\ - (p + q) {(I_{q} - Z^{+} Z)}^{- 1} Z^{+} & (p I_{q} + q Z^{+} Z) {(I_{q} - Z^{+} Z)}^{- 1} \end{matrix})

(114)

ζ (\exp Z^{+} \exp Z) = (\begin{matrix} I_{p} & Z {(I_{q} - Z^{+} Z)}^{- 1} \\ 0 & I_{q} \end{matrix})

(115)

7. Lie Groups Thermodynamics for SE(2) Lie Group

After

S U (1, 1)

Lie group with null cohomology and then without Souriau one-cocycle, we will consider Souriau model for

S E (2)

Lie group with non-null cohomology and then with introduction of Souriau one-cocycle [70].

We will consider first

S O (2)

Lie group:

S O (2) = {R_{φ} = [\begin{matrix} \cos φ & - \sin φ \\ \sin φ & \cos φ \end{matrix}] / φ \in R}

(116)

A tangent vector at the identity to

S O (2)

is given by:

{\frac{d R_{t η}}{d t} |}_{t = 0} = - η ℑ with ℑ = [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}], ℑ^{T} = ℑ^{- 1} = - ℑ

(117)

We consider the special Euclidean group

S E (2) = S O (2) \times R^{2}

.

S E (2) = {[\begin{matrix} R_{φ} & τ \\ 0 & 1 \end{matrix}] / R_{φ} \in S O (2), τ \in R^{2}}

(118)

The group operation is given by:

\begin{array}{l} [\begin{matrix} R_{φ_{1}} & τ_{1} \\ 0 & 1 \end{matrix}] [\begin{matrix} R_{φ_{2}} & τ_{2} \\ 0 & 1 \end{matrix}] = [\begin{matrix} R_{φ_{1}} R_{φ_{2}} & R_{φ_{1}} τ_{2} + τ_{1} \\ 0 & 1 \end{matrix}] = [\begin{matrix} R_{φ_{1} + φ_{2}} & R_{φ_{1}} τ_{2} + τ_{1} \\ 0 & 1 \end{matrix}] \\ \Rightarrow (R_{φ_{1}}, τ_{1}) . (R_{φ_{2}}, τ_{2}) = (R_{φ_{1} + φ_{2}}, R_{φ_{1}} τ_{2} + τ_{1}) \end{array}

(119)

{[\begin{matrix} R_{φ_{1}} & τ_{1} \\ 0 & 1 \end{matrix}]}^{- 1} = [\begin{matrix} R_{- φ_{1}} & - R_{- φ_{1}} τ_{1} \\ 0 & 1 \end{matrix}] \Rightarrow {(R_{φ_{1}}, τ_{1})}^{- 1} = (R_{- φ_{1}}, - R_{- φ_{1}} τ_{1})

(120)

The Lie algebra

s e (2)

of

S E (2)

has underlying vector space

R^{3}

, with elements written in matrix form as:

(ξ, u) \in s e (2) = R \times R^{2} \Rightarrow [\begin{matrix} - ξ ℑ & u \\ 0 & 0 \end{matrix}] \in s e (2)

(121)

Lie bracket is given by:

[(ξ, u), (η, v)] = (0, - ξ ℑ v + η ℑ u)

(122)

Adjoint action of

S E (2)

is given by:

\begin{array}{l} A d_{(R_{φ}, τ)} (ξ, u) = [\begin{matrix} R_{φ} & τ \\ 0 & 1 \end{matrix}] [\begin{matrix} - ξ ℑ & u \\ 0 & 0 \end{matrix}] [\begin{matrix} R_{- φ} & - R_{- φ} τ \\ 0 & 1 \end{matrix}] = [\begin{matrix} - ξ ℑ & ξ ℑ τ + R_{φ} u \\ 0 & 0 \end{matrix}] \\ A d_{(R_{φ}, τ)} (ξ, u) = (ξ, R_{φ} u + ξ ℑ τ) \end{array}

(123)
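The matrix formulas (118) to (123) can be checked numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names (`group_elem`, `algebra_elem`, `adjoint`, `bracket`) are illustrative assumptions. It verifies the inverse (120), compares the closed-form adjoint action (123) with the matrix conjugation, and compares the coordinate expression (0, -ξℑv + ηℑu) of the bracket with the commutator of the matrices of (121).

```python
import math

# Matrix called ℑ in (117); all helper names are illustrative, not from the paper.
IM = [[0.0, 1.0], [-1.0, 0.0]]

def rot(phi):
    """Rotation matrix R_phi of (116)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply2(m, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def group_elem(phi, tau):
    """3x3 matrix [[R_phi, tau], [0, 1]] of (118)."""
    r = rot(phi)
    return [r[0] + [tau[0]], r[1] + [tau[1]], [0.0, 0.0, 1.0]]

def algebra_elem(xi, u):
    """3x3 matrix [[-xi*IM, u], [0, 0]] of (121)."""
    return [[-xi * IM[0][0], -xi * IM[0][1], u[0]],
            [-xi * IM[1][0], -xi * IM[1][1], u[1]],
            [0.0, 0.0, 0.0]]

def inverse(phi, tau):
    """Inverse element from (120): (R_{-phi}, -R_{-phi} tau)."""
    t = apply2(rot(-phi), tau)
    return group_elem(-phi, [-t[0], -t[1]])

def adjoint(phi, tau, xi, u):
    """Closed form of (123): (xi, R_phi u + xi*IM tau)."""
    ru, jt = apply2(rot(phi), u), apply2(IM, tau)
    return xi, [ru[0] + xi * jt[0], ru[1] + xi * jt[1]]

def bracket(xi, u, eta, v):
    """Matrix commutator in coordinates: (0, -xi*IM v + eta*IM u)."""
    jv, ju = apply2(IM, v), apply2(IM, u)
    return 0.0, [-xi * jv[0] + eta * ju[0], -xi * jv[1] + eta * ju[1]]

def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Arbitrary test values (assumptions chosen only for the check).
phi, tau = 0.7, [1.5, -0.4]
xi, u, eta, v = 0.9, [0.2, 1.1], -1.3, [0.6, -0.8]
g, g_inv = group_elem(phi, tau), inverse(phi, tau)
X, Y = algebra_elem(xi, u), algebra_elem(eta, v)

err_inverse = max_abs_diff(matmul(g, g_inv), group_elem(0.0, [0.0, 0.0]))
err_adjoint = max_abs_diff(matmul(matmul(g, X), g_inv),
                           algebra_elem(*adjoint(phi, tau, xi, u)))
XY, YX = matmul(X, Y), matmul(Y, X)
commutator = [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]
err_bracket = max_abs_diff(commutator, algebra_elem(*bracket(xi, u, eta, v)))
```

All three residuals should vanish up to floating-point rounding; a non-zero `err_bracket` would indicate a sign convention different from the matrix commutator.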

Coadjoint action of

S E (2)

is given by:

A d_{(R_{φ}, τ)}^{*} (m, ρ) = (m + ℑ R_{φ} ρ . τ, R_{φ} ρ)

(124)

The moment map

J : R^{2} \to s e^{*} (2)

of

S E (2)

is defined by:

J_{(ξ, u)} (x) = J (x) . (ξ, u)

(125)

with the right action of

S E (2)

on

R^{2}

:

x . (R_{φ}, τ) = R_{- φ} (x - τ)

(126)

The infinitesimal generator of

(ξ, u) \in s e (2)

has the expression:

{(ξ, u)}_{R^{2}} (x) = {\frac{d [x . (R_{t ξ}, t u)]}{d t} |}_{t = 0} = {\frac{d [R_{- t ξ} (x - t u)]}{d t} |}_{t = 0} = ξ ℑ x - u

(127)

Let

J_{(ξ, u)} (x) : R^{2} \to s e^{*} (2)

be the moment map of this action relative to the symplectic form. We can compute it from its definition:

\begin{array}{l} d J_{(ξ, u)} (x) . y = - 2 ω ({(ξ, u)}_{R^{2}}, y) \\ with ω ({(ξ, u)}_{R^{2}}, y) = ω (ξ ℑ x - u, y) = (ξ ℑ x - u) . ℑ y = (ξ x + ℑ u) . y \\ \Rightarrow d J_{(ξ, u)} (x) . y = - 2 (ξ x + ℑ u) . y \\ \Rightarrow J_{(ξ, u)} (x) = - 2 (\frac{1}{2} ξ {‖ x ‖}^{2} + ℑ u . x) = - 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x) . (ξ, u) \\ J_{(ξ, u)} (x) = J (x) . (ξ, u) \Rightarrow J (x) = - 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x), x \in R^{2} \end{array}

(128)
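The defining relation in (128) can be tested by finite differences. The sketch below is an illustration under stated assumptions, not the author's code: it takes ω(a, b) = a . ℑb, the generator ξℑx - u of (127), and J(x) = -2(½‖x‖², -ℑx), and checks that the directional derivative of J_{(ξ,u)} along y equals -2 ω((ξ,u)_{R²}(x), y). The function names are hypothetical.

```python
def im(v):
    """Apply ℑ = [[0, 1], [-1, 0]] to a 2-vector."""
    return [v[1], -v[0]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def omega(a, b):
    """Symplectic form omega(a, b) = a . ℑb on R^2."""
    return dot(a, im(b))

def generator(xi, u, x):
    """Infinitesimal generator (127): xi*ℑx - u."""
    jx = im(x)
    return [xi * jx[0] - u[0], xi * jx[1] - u[1]]

def moment_map(x):
    """J(x) = -2 (1/2 ||x||^2, -ℑx), returned as (scalar part, vector part)."""
    jx = im(x)
    return -dot(x, x), [2.0 * jx[0], 2.0 * jx[1]]

def moment_map_paired(xi, u, x):
    """J_{(xi,u)}(x) = <J(x), (xi, u)> with the Euclidean pairing on R x R^2."""
    s, vec = moment_map(x)
    return s * xi + dot(vec, u)

def directional_derivative(xi, u, x, y, h=1e-5):
    """Central finite difference of J_{(xi,u)} at x in the direction y."""
    xp = [x[0] + h * y[0], x[1] + h * y[1]]
    xm = [x[0] - h * y[0], x[1] - h * y[1]]
    return (moment_map_paired(xi, u, xp) - moment_map_paired(xi, u, xm)) / (2.0 * h)

# Arbitrary test values (assumptions chosen only for the check).
xi, u = 0.8, [0.3, -1.2]
x, y = [1.1, -0.7], [0.4, 0.9]
lhs = directional_derivative(xi, u, x, y)
rhs = -2.0 * omega(generator(xi, u, x), y)
residual = abs(lhs - rhs)
```

Because J_{(ξ,u)} is quadratic in x, the central difference is exact up to rounding, so `residual` should be close to machine precision.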

We then compute the one-cocycle of

S E (2)

from the moment map:

\begin{array}{l} θ ((R_{φ}, τ)) = J (0 . (R_{φ}, τ)) - A d_{(R_{φ}, τ)}^{*} J (0) = J (- R_{- φ} τ) \\ θ ((R_{φ}, τ)) = - 2 (\frac{1}{2} {‖ τ ‖}^{2}, ℑ R_{- φ} τ) = - 2 (\frac{1}{2} {‖ τ ‖}^{2}, R_{- φ - \frac{π}{2}} τ) \end{array}

(129)

Coadjoint orbits of

S E (2)

are generated by:

\begin{array}{l} O_{(m, ρ)} = {A d_{(R_{φ}, τ)}^{*} (m, ρ) + θ ((R_{φ}, τ)) / (R_{φ}, τ) \in S E (2)} \\ O_{(m, ρ)} = {(x - R_{- \frac{π}{2}} ρ . τ - {‖ τ ‖}^{2}, R_{- φ} ρ - 2 R_{- φ - \frac{π}{2}} τ) / (R_{φ}, τ) \in S E (2)} \end{array}

(130)

The Souriau Symplectic form in this case of non-null cohomology is given by:

\begin{array}{l} ω_{(m, ρ) (m^{'}, ρ^{'})} (a d_{(ξ, u)}^{*} (m^{'}, ρ^{'}) - (0, 2 ℑ u), a d_{(η, v)}^{*} (m^{'}, ρ^{'}) - (0, 2 ℑ v)) = ρ^{'} . (- ξ ℑ v + η ℑ u) + 2 u . ℑ v \\ with (m^{'}, ρ^{'}) = (x - R_{- \frac{π}{2}} ρ . τ - {‖ τ ‖}^{2}, R_{- φ} ρ - 2 R_{- φ - \frac{π}{2}} τ) \in O_{(m, ρ)} \subset R^{3} \end{array}

(131)

With the expression of moment map, we can compute Souriau covariant Gibbs density of Maximum Entropy.

Considering the symplectic form

ω (ζ, υ) = ζ . ℑ υ with ℑ = [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}]

on

R^{2}

, we have seen that the action of SE(2) is symplectic and admits the momentum map,

J (x) = - (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x), x \in R^{2}

.

Souriau Gibbs density is defined for generalized temperature

β \in Ω = {(b, Β) \in s e (2) / b < 0, Β \in R^{2}}

and given by:

p_{G i b b s} (x) = \frac{e^{- 〈 J (x), β 〉}}{\int_{R^{2}} e^{- 〈 J (x), β 〉} d λ (x)} = \frac{e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x}}{\int_{R^{2}} e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x} d λ (x)}

(132)

The Massieu Potential could be computed:

Φ (β) = \log \int_{R^{2}} e^{\frac{1}{2} b {‖ x ‖}^{2} - Β . ℑ x} d λ (x) = \log (- \frac{2 π}{b} e^{- \frac{1}{2 b} {‖ B ‖}^{2}})

(133)
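The closed form inside the logarithm of (133) is a two-dimensional Gaussian integral, which can be compared with a direct quadrature. The sketch below is a plain-Python check under assumptions: a midpoint rule on a truncated square, with the box size, grid resolution and the test values of b and Β chosen arbitrarily for illustration.

```python
import math

def exponent(b, B, x1, x2):
    """Exponent of (132): 1/2 b ||x||^2 - B . ℑx, with ℑx = (x2, -x1)."""
    return 0.5 * b * (x1 * x1 + x2 * x2) - (B[0] * x2 - B[1] * x1)

def partition_quadrature(b, B, half_width=8.0, n=320):
    """Midpoint-rule approximation of the integral over R^2.

    The square [-half_width, half_width]^2 is an assumption that is adequate
    when b is of order -1; a smaller |b| needs a larger box.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x1 = -half_width + (i + 0.5) * h
        for j in range(n):
            x2 = -half_width + (j + 0.5) * h
            total += math.exp(exponent(b, B, x1, x2))
    return total * h * h

def partition_closed_form(b, B):
    """Closed form from (133): -(2 pi / b) exp(-||B||^2 / (2 b)), for b < 0."""
    return -2.0 * math.pi / b * math.exp(-(B[0] ** 2 + B[1] ** 2) / (2.0 * b))

# Arbitrary generalized temperature with b < 0 (assumed test values).
b, B = -2.0, [0.3, -0.5]
z_numeric = partition_quadrature(b, B)
z_closed = partition_closed_form(b, B)
relative_error = abs(z_numeric - z_closed) / z_closed
```

The logarithm of either value gives Φ(β) as written in (133); the two values should agree to high relative precision for b of order -1.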

By derivation of Massieu potential, we can deduce expression of Heat:

\begin{array}{l} Q \in Ω^{*} = {(m, M) \in s e^{*} (2) / m + \frac{{‖ M ‖}^{2}}{2} < 0} \\ Q = \frac{\partial Φ (β)}{\partial β} = (\frac{1}{b} - \frac{{‖ Β ‖}^{2}}{2 b^{2}}, \frac{1}{b} Β) = Θ (β) \end{array}

(134)

We can then invert this relation to express the generalized temperature with respect to the heat:

β = Θ^{- 1} (Q) = ({(m + \frac{1}{2} {‖ M ‖}^{2})}^{- 1}, {(m + \frac{1}{2} {‖ M ‖}^{2})}^{- 1} M)

(135)
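The maps Θ of (134) and Θ^{-1} of (135) can be exercised together. The following sketch, with hypothetical function names, checks the round trip Θ^{-1}(Θ(β)) = β and compares Θ(β) with the mean of the moment map under the Gibbs density (132), computed by quadrature. It assumes the moment map normalization J(x) = -(½‖x‖², -ℑx) stated just before (132); the grid parameters and test values are arbitrary.

```python
import math

def theta(b, B):
    """Heat Q = Theta(beta) as written in (134)."""
    n2 = B[0] ** 2 + B[1] ** 2
    return 1.0 / b - n2 / (2.0 * b * b), [B[0] / b, B[1] / b]

def theta_inverse(m, M):
    """Generalized temperature beta = Theta^{-1}(Q) as written in (135)."""
    s = m + 0.5 * (M[0] ** 2 + M[1] ** 2)
    return 1.0 / s, [M[0] / s, M[1] / s]

def mean_moment_map(b, B, half_width=8.0, n=320):
    """E[J(x)] under the Gibbs density (132), by a midpoint rule.

    Assumes J(x) = -(1/2 ||x||^2, -ℑx) = (-1/2 ||x||^2, ℑx), ℑx = (x2, -x1).
    The truncation box is adequate for b of order -1 only.
    """
    h = 2.0 * half_width / n
    z = acc_m = acc_1 = acc_2 = 0.0
    for i in range(n):
        x1 = -half_width + (i + 0.5) * h
        for j in range(n):
            x2 = -half_width + (j + 0.5) * h
            r2 = x1 * x1 + x2 * x2
            w = math.exp(0.5 * b * r2 - (B[0] * x2 - B[1] * x1))
            z += w
            acc_m += w * (-0.5 * r2)   # scalar component of J
            acc_1 += w * x2            # first component of ℑx
            acc_2 += w * (-x1)         # second component of ℑx
    return acc_m / z, [acc_1 / z, acc_2 / z]

# Arbitrary generalized temperature with b < 0 (assumed test values).
b, B = -2.0, [0.3, -0.5]
m, M = theta(b, B)
b_back, B_back = theta_inverse(m, M)
m_num, M_num = mean_moment_map(b, B)
in_dual_domain = m + 0.5 * (M[0] ** 2 + M[1] ** 2) < 0.0
```

Under these assumptions `(m_num, M_num)` should match `(m, M)`, and `in_dual_domain` should be true, which corresponds to the condition defining Ω* in (134).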

We can then express the Gibbs density with respect to the Heat Q, which is the mean of the moment map:

p_{G i b b s} (x) = \frac{e^{\frac{\frac{1}{2} {‖ x ‖}^{2} - M . ℑ x}{(m + \frac{1}{2} {‖ M ‖}^{2})}}}{\int_{R^{2}} e^{\frac{\frac{1}{2} {‖ x ‖}^{2} - M . ℑ x}{(m + \frac{1}{2} {‖ M ‖}^{2})}} d λ (x)} with (m, M) = E (J (x)) = E [- 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x)] = [- E ({‖ x ‖}^{2}), 2 ℑ E (x)]

(136)

So we can rewrite the Gibbs density:

p_{G i b b s} (x) = \frac{e^{\frac{\frac{1}{2} {‖ x ‖}^{2} + 2 E (x) . I x}{(- E ({‖ x ‖}^{2}) + 2 {‖ E (x) ‖}^{2})}}}{\int_{R^{2}} e^{\frac{\frac{1}{2} {‖ x ‖}^{2} + 2 E (x) . I x}{(- E ({‖ x ‖}^{2}) + 2 {‖ E (x) ‖}^{2})}} d λ (x)}

(137)

We can also provide a Fisher metric in the dual Lie algebra as the Hessian of the Entropy:

S (Q) = 〈 Q, β 〉 - Φ (β) = 1 + \log (2 π) + \log (- m - \frac{{‖ M ‖}^{2}}{2})

(138)

I_{F i s h e r} (Q) = {(m + \frac{1}{2} {‖ M ‖}^{2})}^{- 1} [\begin{matrix} I & M^{T} \\ M^{T} & \frac{1}{2} {‖ M ‖}^{2} - m \end{matrix}]

(139)

and as

(m, M) = E (J (x)) = E [- 2 (\frac{1}{2} {‖ x ‖}^{2}, - ℑ x)] = [- E ({‖ x ‖}^{2}), 2 ℑ E (x)]

, Fisher metric in dual space of Lie Algebra parameterization could be written:

I_{F i s h e r} (Q) = {(2 {‖ E (x) ‖}^{2} - E ({‖ x ‖}^{2}))}^{- 1} [\begin{matrix} I & {(2 ℑ E (x))}^{T} \\ 2 ℑ E (x) & 2 {‖ E (x) ‖}^{2} + E ({‖ x ‖}^{2}) \end{matrix}]

(140)

8. New Entropy Definition as Generalized Casimir Invariant Functions for Coadjoint and Adjoint Representation

In his paper written in 1974, Jean-Marie Souriau has observed that if we consider the heat expression

Q = \frac{d Φ}{d β}

, that we can write

δ Φ - 〈 Q, δ β 〉 = 0

. For each

δ β

tangent to the orbit, and so generated by an element

Z

of the Lie algebra, if we consider the relation

Φ (A d_{g} (β)) = Φ (β) - 〈 θ (g^{- 1}), β 〉

and we differentiate it at

g = e

using the property that

\tilde{Θ} (X, Y) = - 〈 d θ (X), Y 〉, X, Y \in g

, we obtain

〈 Q, [β, Z] 〉 + \tilde{Θ} (β, Z) = 0

. Souriau stopped at this last equation, which characterizes the group action on

Q = \frac{\partial Φ}{\partial β}

. Souriau has also observed that

S [Q (A d_{g} (β))] = S [A d_{g}^{*} (Q) + θ (g)] = S (Q)

. We propose to characterize more explicitly this invariance, by characterizing Entropy as an invariant Casimir function in coadjoint representation.

From last Souriau equation, if we use the identities

β = \frac{\partial S}{\partial Q}

,

a d_{β} Z = [β, Z]

and

\tilde{Θ} (β, Z) = 〈 Θ (β), Z 〉

, then we can deduce that

〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}), Z 〉 = 0, \forall Z

. So, Entropy

S (Q)

should verify

a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

, which characterizes an invariant Casimir function in case of non-null cohomology, and which we propose to write with Poisson brackets,

{S, H}_{\tilde{Θ}} (Q) = 0

where

{S, H}_{\tilde{Θ}} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 + \tilde{Θ} (\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}) = 0

,

\forall H : g^{*} \to R, Q \in g^{*}

.

In a Poisson manifold, Casimir functions

S \in C^{\infty} (g^{*})

, in case of null cohomology, are functions whose Poisson brackets with all functions vanish,

{S, H} (Q) = 0, \forall H \in C^{\infty} (g^{*}), Q \in g^{*}

. In the dual of the Lie algebra of a connected Lie group

G

, the Casimir functions are the

A d^{*}

-invariant functions, because if

S, H \in C^{\infty} (g^{*})

and

Q \in g^{*}

, then

{S, H} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 = 〈 Q, a d_{\frac{\partial S}{\partial Q}} \frac{\partial H}{\partial Q} 〉 = 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q, \frac{\partial H}{\partial Q} 〉

vanishes for all

H \in C^{\infty} (g^{*})

if and only if

a d_{\frac{\partial S}{\partial Q}}^{*} Q = 0

. A function

S

on

g^{*}

is

A d^{*}

-invariant if

g . S = S, \forall g \in G

where Lie group

G

acts on functions on

g^{*}

by

(g . S) (Q) = S (A d_{g}^{*} Q), Q \in g^{*}, S \in C^{\infty} (g^{*}), g \in G

, with the following infinitesimal characterization of

A d^{*}

-invariant functions on

g^{*}

,

{\frac{d}{d t} S (A d_{\exp (t x)}^{*} Q) |}_{t = 0} = 〈 a d_{x}^{*} Q, \frac{\partial S}{\partial Q} 〉 = - 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q, x 〉

. The symplectic leaves of a Poisson manifold are contained in the connected components of the level sets of the Casimir functions, and a Casimir function is constant on each symplectic leaf. Coadjoint orbits lie on level sets of the Casimir functions, which are conserved quantities. Level sets of Casimir functions are symplectic manifolds. Coadjoint motion of the moment map

Q (t) = A d_{g (t)}^{*} Q (0)

for a solution curve

g (t) \in C (G)

takes place on the intersections of level sets of the Hamiltonian and the Casimir functions. Alexis Arnaudon has studied stochastic coadjoint processes whose solutions lie on coadjoint orbits.

We have observed that

{S, H}_{\tilde{Θ}} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 + \tilde{Θ} (\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}) = 0, \forall H : g^{*} \to R, Q \in g^{*}

, which shows that Souriau Entropy is a Casimir function in the case of non-null cohomology, when an additional cocycle must be taken into account. Indeed, the infinitesimal variation is characterized by the following differentiation:

{\frac{d}{d t} S (Q (A d_{\exp (t x)} β)) |}_{t = 0} = {\frac{d}{d t} S (A d_{\exp (t x)}^{*} Q + θ (\exp (t x))) |}_{t = 0} = - 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}), x 〉

. We recover extended Casimir equation in case of non-null cohomology verified by Entropy,

a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

, and then the generalized Casimir condition

{S, H}_{\tilde{Θ}} (Q) = 0

. Hamiltonian motion on these affine coadjoint orbits is given by the solutions of the Lie-Poisson equations with cocycle.

The identification of Entropy as an Invariant Casimir Function in Coadjoint representation is also important in Information Theory, because classically Entropy is introduced axiomatically. With this new approach, we can build Entropy by constructing the Casimir Function associated to the Lie group and also in case of non-null cohomology. Igor V. Shirokov [71,72,73,74,75] has proposed a method for constructing invariants of the coadjoint representation of Lie groups with an arbitrary dimension and structure based on local symplectic coordinates on the coadjoint orbits. The idea of the method of constructing coadjoint invariants is to construct the canonical transition to the Darboux coordinates on the orbits of the dual Lie algebra

g^{*}

of maximal dimension dual to the Lie algebra

g

of the Lie group

G

. These relations provide invariants of the coadjoint representation of the Lie group

G

.

This geometric framework unifies several earlier works on the subject, including Souriau’s symplectic model of statistical mechanics, and approaches developed in Information Geometry and Quantum Information Geometry. This approach helps to identify the common geometric structures appearing in various domains from statistical mechanics to statistical learning. The emphasis is put on the role of the affine equivariance with respect to Lie group actions, as extension of the Fisher metric in presence of equivariance and the associated Lie-Poisson equations with cocycle (affine Lie-Poisson equations). The entropy of the Souriau model as a Casimir function can be used to apply a geometric model for energy preserving entropy production on Lie algebras. We can exploit the geometric framework of this new equation to build geometric numerical integrator schemes for some of the equations associated to Souriau’s model and its polysymplectic extension. This new equation is important because it introduces a new structure of differential equations in case of non-null cohomology and for an arbitrary Hamiltonian

H : g^{*} \to R

:

\frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})

.

The equation

\frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})

is important because it allows extending stochastic perturbation of the Lie-Poisson equation with cocycle within the setting of stochastic Hamiltonian dynamics, which preserves the affine coadjoint orbits. We can extend model for stochastic geometric modeling in fluid dynamics via variational principles described in [32,76]. This extension results in the new Stratonovich differential equation for the stochastic process

d Q + [a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})] d t + \sum_{i = 1}^{N} [a d_{\frac{\partial H_{i}}{\partial Q}}^{*} Q + Θ (\frac{\partial H_{i}}{\partial Q})] \circ d W_{i} (t) = 0

.

This new equation is also very useful for geometric symplectic Lie group integrators for Lie-Poisson equations with cocycle that preserve the affine coadjoint orbits for a general Hamiltonian. This equation is also very relevant in the framework of dynamics with Casimir dissipation/production, to formulate a dynamical geometric model for dissipation/production of this Casimir. This makes it possible to extend the general Lie algebraic approach developed in [77,78] for Casimir dissipation to take a cocycle into account, and to cover a wider class of dissipation. Paper [17] will exploit this new Casimir equation in case of non-null cohomology.

This equation

\frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q})

could also be used to make the link with the second principle of thermodynamics, which will be deduced from the positivity of the Souriau-Fisher metric:

\begin{array}{l} S (Q) = 〈 Q, β 〉 - Φ (β) with \frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q}) \\ \frac{d S}{d t} = 〈 Q, \frac{d β}{d t} 〉 + 〈 a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q}), β 〉 - \frac{d Φ}{d t} = 〈 Q, \frac{d β}{d t} 〉 + 〈 a d_{\frac{\partial H}{\partial Q}}^{*} Q, β 〉 + 〈 Θ (\frac{\partial H}{\partial Q}), β 〉 - \frac{d Φ}{d t} \\ \frac{d S}{d t} = 〈 Q, \frac{d β}{d t} 〉 + 〈 Q, [\frac{\partial H}{\partial Q}, β] 〉 + \tilde{Θ} (\frac{\partial H}{\partial Q}, β) - \frac{d Φ}{d t} = 〈 Q, \frac{d β}{d t} 〉 + {\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) - 〈 \frac{\partial Φ}{\partial β}, \frac{d β}{d t} 〉 \\ \frac{d S}{d t} = 〈 Q, \frac{d β}{d t} 〉 + {\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) - 〈 \frac{\partial Φ}{\partial β}, \frac{d β}{d t} 〉 with \frac{\partial Φ}{\partial β} = Q \\ \frac{d S}{d t} = {\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) \geq 0, \forall H (link to positivity of Fisher metric) \\ if H = S \underset{\frac{\partial S}{\partial Q} = β}{\Rightarrow} \frac{d S}{d t} = {\tilde{Θ}}_{β} (β, β) = 0 because β \in K e r {\tilde{Θ}}_{β} \end{array}

(141)

Entropy production is then linked with Souriau-Fisher structure,

d S = {\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) d t

with

{\tilde{Θ}}_{β} (\frac{\partial H}{\partial Q}, β) = \tilde{Θ} (\frac{\partial H}{\partial Q}, β) + 〈 Q, [\frac{\partial H}{\partial Q}, β] 〉

the Souriau tensor related to the Fisher metric.

8.1. Casimir Invariant and Generalized Casimir Invariant

Hendrik Brugt Gerhard Casimir, a Dutch physicist, studied what are now called Casimir operators and Casimir invariants (H. Casimir and Van der Waerden studied the SU(2) group, the group of isospin/angular momentum, as the model of the algebraic approach to the study of the unitary representations of semi-simple compact Lie groups). Kirillov has explained that Casimir operators are in one-to-one correspondence with polynomial invariants characterizing orbits of the coadjoint representation. Solutions are not necessarily polynomials, and the nonpolynomial solutions are called generalized Casimir invariants. For certain classes of Lie algebras, all invariants of the coadjoint representation are functions of polynomial ones. In physics, Hamiltonians and integrals of motion of classical integrable Hamiltonian systems are not polynomials in the momenta [71,72,73,74,75,79,80,81,82,83,84,85,86,87,88,89,90,91,92].

8.2. Souriau Entropy as Generalized Casimir Invariant in Coadjoint Representation

In Souriau Lie groups Thermodynamics, we will see that coadjoint orbits lie on level sets of the Entropy that could be considered as a Casimir invariant function:

\begin{array}{l} S : g^{*} \to R \\ Q \mapsto S (Q) \end{array}

(142)

We first consider the case of null cohomology. Entropy, as a Casimir invariant function, is a conserved quantity, because the Lie-Poisson bracket of a Casimir function with any other function vanishes [93,94]:

\begin{array}{l} {S, H} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 = 0, \forall H : g^{*} \to R, Q \in g^{*}, 〈 A, B 〉 = B (A, B) Cartan-Killing form \\ with \partial S (Q) = {\frac{d}{d ε} S (Q + ε δ Q) |}_{ε = 0} = 〈 δ Q, \frac{\partial S}{\partial Q} 〉 \end{array}

(143)

We can observe that

β = \frac{\partial S}{\partial Q}

, then:

〈 Q, [β, \frac{\partial H}{\partial Q}] 〉 = 〈 Q, a d_{β} \frac{\partial H}{\partial Q} 〉 = 0, \forall H : g^{*} \to R, Q \in g^{*}, a d_{a} b = [a, b]

(144)

We can also write:

〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 = 〈 Q, a d_{\frac{\partial S}{\partial Q}} \frac{\partial H}{\partial Q} 〉 = 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q, \frac{\partial H}{\partial Q} 〉 = 0, \forall H : g^{*} \to R

(145)

It means that

a d_{\frac{\partial S}{\partial Q}}^{*} Q = a d_{β}^{*} Q = 0, β = \frac{\partial S}{\partial Q}

. We can remark that if we note

{(a d_{\frac{\partial S}{\partial Q}}^{*} Q)}_{j} = C_{i j}^{k} a d_{{(\frac{\partial S}{\partial Q})}^{i}}^{*} Q_{k} = 0

with

C_{i j}^{k}

the structure tensor, we observe that this equation is in fact the Casimir condition for invariant function in coadjoint representation as we will see hereafter. The restriction of the Lie-Poisson bracket to an orbit generates a symplectic structure on the orbit, called the KKS (Kirillov-Kostant-Souriau) structure, or the canonical symplectic structure. Casimir function is characterized as a quantity which commutes with each linear functional on the Poisson manifold, and then it is conserved by dynamics of any Hamiltonian.

Given a Hamiltonian

H : g^{*} \to R

, the equation of motion for

Q \in g^{*}

is:

\frac{d Q}{d t} = {Q, H} = a d_{\frac{\partial H}{\partial Q}}^{*} Q with H = S \Rightarrow \frac{d Q}{d t} = {Q, S} = a d_{\frac{\partial S}{\partial Q}}^{*} Q = 0

(146)
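As a concrete instance of the Casimir condition (145), consider the Lie algebra se(2) of Section 7 with the cocycle set to zero. The sketch below is an illustrative assumption, not a construction from the paper: it identifies se*(2) with R × R² through the Euclidean pairing, uses the bracket obtained from the matrix commutator, [(ξ, u), (η, v)] = (0, -ξℑv + ηℑu), and takes S(m, ρ) = ½‖ρ‖² as candidate Casimir function. Since the bracket {S, H}(Q) depends on H only through its gradient at Q, random gradient vectors stand in for arbitrary Hamiltonians.

```python
import random

def im(v):
    """Apply ℑ = [[0, 1], [-1, 0]] to a 2-vector."""
    return [v[1], -v[0]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def bracket(X, Y):
    """se(2) bracket [(xi, u), (eta, v)] = (0, -xi*ℑv + eta*ℑu)."""
    (xi, u), (eta, v) = X, Y
    jv, ju = im(v), im(u)
    return 0.0, [-xi * jv[0] + eta * ju[0], -xi * jv[1] + eta * ju[1]]

def pair(Q, X):
    """Euclidean pairing between Q = (m, rho) in se*(2) and X = (xi, u)."""
    return Q[0] * X[0] + dot(Q[1], X[1])

def lie_poisson(Q, grad_S, grad_H):
    """{S, H}(Q) = <Q, [dS/dQ, dH/dQ]> in the null-cohomology case."""
    return pair(Q, bracket(grad_S, grad_H))

def grad_half_norm(Q):
    """Gradient of S(m, rho) = 1/2 ||rho||^2, an element of se(2)."""
    return 0.0, list(Q[1])

def grad_first_coordinate(Q):
    """Gradient of F(m, rho) = m, used as a non-Casimir comparison."""
    return 1.0, [0.0, 0.0]

rng = random.Random(0)  # fixed seed so the check is reproducible
worst_casimir = 0.0
worst_other = 0.0
for _ in range(100):
    Q = (rng.uniform(-2, 2), [rng.uniform(-2, 2), rng.uniform(-2, 2)])
    grad_H = (rng.uniform(-2, 2), [rng.uniform(-2, 2), rng.uniform(-2, 2)])
    worst_casimir = max(worst_casimir,
                        abs(lie_poisson(Q, grad_half_norm(Q), grad_H)))
    worst_other = max(worst_other,
                      abs(lie_poisson(Q, grad_first_coordinate(Q), grad_H)))
```

Here `worst_casimir` should be zero up to rounding, because ρ . ℑρ = 0, whereas `worst_other` is not, which shows that the coordinate m alone is not a Casimir function.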

In case of non-null cohomology, the Lie Poisson brackets functions are given by:

\begin{array}{l} {S, H}_{\tilde{Θ}} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 + \tilde{Θ} (\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}) = 0, \forall H : g^{*} \to R, Q \in g^{*} \\ with \tilde{Θ} (X, Y) = J_{[X, Y]} - {J_{X}, J_{Y}} where J_{X} (x) = 〈 J (x), X 〉 \\ \tilde{Θ} (X, Y) : g \times g \to ℜ with Θ (X) = T_{e} θ (X (e)) \\ X, Y \mapsto 〈 Θ (X), Y 〉 \end{array}

(147)

That we can develop in the following:

\begin{array}{l} {S, H}_{\tilde{Θ}} (Q) = 〈 Q, [\frac{\partial S}{\partial Q}, \frac{\partial H}{\partial Q}] 〉 + 〈 Θ (\frac{\partial S}{\partial Q}), \frac{\partial H}{\partial Q} 〉 = 0 \\ {S, H}_{\tilde{Θ}} (Q) = 〈 Q, a d_{\frac{\partial S}{\partial Q}} \frac{\partial H}{\partial Q} 〉 + 〈 Θ (\frac{\partial S}{\partial Q}), \frac{\partial H}{\partial Q} 〉 = 0 \\ {S, H}_{\tilde{Θ}} (Q) = 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q, \frac{\partial H}{\partial Q} 〉 + 〈 Θ (\frac{\partial S}{\partial Q}), \frac{\partial H}{\partial Q} 〉 = 0 \\ \forall H, {S, H}_{\tilde{Θ}} (Q) = 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}), \frac{\partial H}{\partial Q} 〉 = 0 \Rightarrow a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0 \end{array}

(148)

We have found the generalized Casimir equation for Entropy in the non-null cohomology case:

{S, H}_{\tilde{Θ}} (Q) = 0

(149)

That could be also written:

a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

(150)

This equation was observed by Souriau in his paper of 1974, where he has written that geometric temperature

β

is in the kernel of

{\tilde{Θ}}_{β}

, that is written:

β \in K e r {\tilde{Θ}}_{β} \Rightarrow 〈 Q, [β, Z] 〉 + \tilde{Θ} (β, Z) = 0

(151)

That we can develop to recover the Casimir equation:

\begin{array}{l} \Rightarrow 〈 Q, a d_{β} Z 〉 + \tilde{Θ} (β, Z) = 0 \Rightarrow 〈 a d_{β}^{*} Q, Z 〉 + \tilde{Θ} (β, Z) = 0 \\ β = \frac{\partial S}{\partial Q} \Rightarrow 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q, Z 〉 + \tilde{Θ} (\frac{\partial S}{\partial Q}, Z) = 〈 a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}), Z 〉 = 0, \forall Z \\ \Rightarrow a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0 \end{array}

(152)

Then the generalized Casimir Equation in non-null cohomology is given by:

{(a d_{\frac{\partial S}{\partial Q}}^{*} Q)}_{j} + Θ {(\frac{\partial S}{\partial Q})}_{j} = C_{i j}^{k} a d_{{(\frac{\partial S}{\partial Q})}^{i}}^{*} Q_{k} + Θ_{j} = 0

(153)

Given a Hamiltonian

H : g^{*} \to R

, the equation of motion for

Q \in g^{*}

is:

\frac{d Q}{d t} = a d_{\frac{\partial H}{\partial Q}}^{*} Q + Θ (\frac{\partial H}{\partial Q}) with H = S \Rightarrow \frac{d Q}{d t} = a d_{\frac{\partial S}{\partial Q}}^{*} Q + Θ (\frac{\partial S}{\partial Q}) = 0

(154)
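The generalized Casimir condition (148) can be illustrated on se(2) with a non-trivial cocycle. The sketch below rests on several assumptions that are not statements of the paper: the Euclidean pairing on R × R², the bracket (0, -ξℑv + ηℑu), and a cocycle of the form Θ̃((ξ, u), (η, v)) = c u . ℑv, modeled on the term 2 u . ℑv appearing in (131), with c a free coefficient. Under these assumptions, any function of m + ‖ρ‖²/(2c) satisfies the condition; for c = 1 this is the combination m + ½‖M‖² that appears in the entropy (138).

```python
import random

def im(v):
    """Apply ℑ = [[0, 1], [-1, 0]] to a 2-vector."""
    return [v[1], -v[0]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def bracket(X, Y):
    """se(2) bracket [(xi, u), (eta, v)] = (0, -xi*ℑv + eta*ℑu)."""
    (xi, u), (eta, v) = X, Y
    jv, ju = im(v), im(u)
    return 0.0, [-xi * jv[0] + eta * ju[0], -xi * jv[1] + eta * ju[1]]

def pair(Q, X):
    """Euclidean pairing between Q = (m, rho) and X = (xi, u)."""
    return Q[0] * X[0] + dot(Q[1], X[1])

def cocycle(X, Y, c):
    """Assumed cocycle form: c * u . ℑv (hypothetical coefficient c)."""
    return c * dot(X[1], im(Y[1]))

def bracket_plain(Q, grad_S, grad_H):
    """<Q, [dS, dH]> without the cocycle term."""
    return pair(Q, bracket(grad_S, grad_H))

def bracket_with_cocycle(Q, grad_S, grad_H, c):
    """{S, H}_Theta(Q) = <Q, [dS, dH]> + Theta(dS, dH), as in (147)."""
    return bracket_plain(Q, grad_S, grad_H) + cocycle(grad_S, grad_H, c)

def grad_entropy_like(Q, c):
    """Gradient of S(m, rho) = log(-(m + ||rho||^2 / (2c))), defined where the argument is positive."""
    m, rho = Q
    s = m + dot(rho, rho) / (2.0 * c)
    return 1.0 / s, [rho[0] / (c * s), rho[1] / (c * s)]

rng = random.Random(1)  # fixed seed so the check is reproducible
c = 1.0
worst_generalized = 0.0
worst_plain = 0.0
for _ in range(100):
    rho = [rng.uniform(-2, 2), rng.uniform(-2, 2)]
    m = -dot(rho, rho) / (2.0 * c) - rng.uniform(0.5, 2.0)  # keeps m + ||rho||^2/(2c) < 0
    Q = (m, rho)
    grad_H = (rng.uniform(-2, 2), [rng.uniform(-2, 2), rng.uniform(-2, 2)])
    grad_S = grad_entropy_like(Q, c)
    worst_generalized = max(worst_generalized,
                            abs(bracket_with_cocycle(Q, grad_S, grad_H, c)))
    worst_plain = max(worst_plain, abs(bracket_plain(Q, grad_S, grad_H)))
```

In this sketch `worst_generalized` vanishes up to rounding while `worst_plain` does not: the candidate function is invariant only when the cocycle term is included, which is the content of the generalized Casimir equation (150).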

Level sets of the Casimir Entropy function, on which the coadjoint orbits lie, are symplectic manifolds.

8.3. Souriau Entropy Invariance in Coadjoint Representation

If we note

𝕬 𝖓 (g^{*})

the space of analytic functions on the dual space of the Lie algebra

g^{*}

, a function

F^{*} \in 𝕬 𝖓 (g^{*})

is a Casimir invariant if for any

g \in G, X \in g^{*}

, we have

F^{*} (A d_{g}^{*} X) = F^{*} (X)

. We have observed previously that Souriau’s Entropy analytic function

S (Q)

defined on dual space of the Lie algebra

g^{*}

by Legendre transform of Massieu Characteristic analytic function

Φ (β)

(minus logarithm of Laplace transform) defined on Lie algebra

g

was an invariant function under the affine coadjoint action

S [Q (A d_{g} (β))] = S [A d_{g}^{*} (Q) + θ (g)] = S (Q)

. In case of null-cohomology, Souriau cocycle cancels

θ (g) = 0

, and we recover Casimir invariant function in coadjoint representation

S [A d_{g}^{*} (Q)] = S (Q)

.

We can then observe that Souriau Entropy is an extended Casimir invariant function in case of non-null cohomology. This characteristic of Souriau Entropy could be a new characterization of Entropy. In Souriau Lie groups Thermodynamics, Entropy S (Q) is a generalized Casimir invariant function for coadjoint representation in case of non-null cohomology, and Massieu Characteristic function by Legendre duality is a generalized Casimir function for adjoint representation.

We will explain how to prove that Souriau Entropy is invariant under the action of the group, starting from its definition:

S (Q) = 〈 Q, β 〉 - Φ (β) with Q = \frac{\partial Φ (β)}{\partial β} \in g^{*} and β = \frac{\partial S (Q)}{\partial Q} \in g

(155)

with

Φ (β) = - \log \int_{M} e^{- 〈 U (ξ), β 〉} d λ_{ω} and U : M \to g^{*}

(156)

Considering Souriau Entropy

S (Q)

where the heat

Q = \frac{\partial Φ (β)}{\partial β} \in g^{*}

an element of the dual space of the Lie algebra is parameterized by

β \in g

an element of the Lie algebra, the Lie group

G

acts through

g \in G

by adjoint operator

A d_{g}

, the entropy is given by

S [Q (A d_{g} (β))]

with

Q (A d_{g} (β))

given by fundamental Souriau equation:

Q (A d_{g} (β)) = A d_{g}^{*} (Q) + θ (g)

(157)

The invariance of Souriau Entropy is deduced from the following developments:

\begin{array}{l} β \in g \to A d_{g} (β) \Rightarrow Ψ (A d_{g} (β)) = \int_{M} e^{- 〈 U, A d_{g} (β) 〉} d λ_{ω} \\ Ψ (A d_{g} (β)) = \int_{M} e^{- 〈 A d_{g^{- 1}}^{*} U, β 〉} d λ_{ω} = \int_{M} e^{- 〈 U (A d_{g^{- 1}} β) - θ (g^{- 1}), β 〉} d λ_{ω} \\ Ψ (A d_{g} (β)) = e^{〈 θ (g^{- 1}), β 〉} Ψ (β) \\ θ (g^{- 1}) = - A d_{g^{- 1}}^{*} θ (g) \Rightarrow Ψ (A d_{g} (β)) = e^{- 〈 A d_{g^{- 1}}^{*} θ (g), β 〉} Ψ (β) \\ Φ (β) = - \log Ψ (β) \Rightarrow Φ (A d_{g} (β)) = Φ (β) - 〈 θ (g^{- 1}), β 〉 = Φ (β) + 〈 A d_{g^{- 1}}^{*} θ (g), β 〉 \end{array}

(158)

Based on this expression of Massieu Characteristic function transform by action of the group, we can use Legendre transform to study how Souriau Entropy is changed:

\begin{array}{l} S (Q) = 〈 Q, β 〉 - Φ (β) \Rightarrow S (Q (A d_{g} β)) = 〈 Q (A d_{g} β), A d_{g} β 〉 - Φ (A d_{g} β) \\ {\begin{cases} Q (A d_{g} (β)) = A d_{g}^{*} (Q) + θ (g) \\ Φ (A d_{g} (β)) = - \log Ψ (A d_{g} (β)) = - 〈 θ (g^{- 1}), β 〉 + Φ (β) \end{cases} \\ \Rightarrow S (Q (A d_{g} β)) = 〈 A d_{g}^{*} (Q) + θ (g), A d_{g} β 〉 + 〈 θ (g^{- 1}), β 〉 - Φ (β) \\ \Rightarrow S (Q (A d_{g} β)) = 〈 A d_{g}^{*} (Q) + θ (g), A d_{g} β 〉 - 〈 A d_{g^{- 1}}^{*} θ (g), β 〉 - Φ (β) \\ \Rightarrow S (Q (A d_{g} β)) = 〈 A d_{g^{- 1}}^{*} A d_{g}^{*} (Q) + A d_{g^{- 1}}^{*} θ (g), β 〉 - 〈 A d_{g^{- 1}}^{*} θ (g), β 〉 - Φ (β) \\ A d_{g^{- 1}}^{*} A d_{g}^{*} (Q) = Q \Rightarrow S (Q (A d_{g} β)) = 〈 Q, β 〉 - Φ (β) = S (Q) \end{array}

(159)

We finally prove that Souriau Entropy is invariant in coadjoint representation

S (A d_{g}^{*} (Q) + θ (g)) = S (Q)

in general case of non-null cohomology, that we could write

S (A d_{g}^{#} (Q)) = S (Q)

, if we note affine coadjoint action

A d_{g}^{#} (Q) = A d_{g}^{*} (Q) + θ (g)

. This is also true in case of null-cohomology when the Souriau cocycle cancels

θ (g) = 0

, and we recover classical generalized Casimir invariant function definition on coadjoint representation for Entropy

S (A d_{g}^{*} (Q)) = S (Q)

and the generalized Casimir invariant function definition on adjoint representation for Massieu Characteristic function

Φ (A d_{g} (β)) = Φ (β)

.

8.4. Souriau Entropy Given by Casimir Invariant Functions Equations

Based on the developments given in the following, we can state that:

As the Entropy

S

is a generalized Casimir invariant function in the coadjoint representation,

S (A d_{e^{t ξ}}^{*} h) = S (h)

, then

S

should be solution of the following differential equation:

C_{i j}^{k} Q_{k} \frac{\partial S (Q)}{\partial Q_{j}} = 0, i, j, k = 1, \dots, \dim g, with {\begin{cases} C_{i j}^{k} Q_{k} = C_{i j} (Q) = B_{i j} \\ B_{Q} (x, y) = B_{i j} x^{i} y^{j} = 〈 Q, [x, y] 〉 \end{cases}

(160)

where

C_{i j}^{k}

is the structure tensor of the Lie algebra

g

in the basis

(e_{1}, e_{2}, \dots, e_{n})

, while

X_{k}

are the coordinates in

g^{*}

in the basis

(e^{1}, e^{2}, \dots, e^{n})

defined by

〈 e^{j}, e_{i} 〉 = δ_{i j}

. The structure tensor is given by

[ϕ (e_{i}), ϕ (e_{j})] = C_{i j}^{k} ϕ (e_{k})

with

ϕ (e_{i}) = C_{i j}^{k} X_{k} \frac{\partial}{\partial X_{j}}, i = 1, \dots, n

.

8.5. Characterization of Generalized Casimir Invariant Functions in Coadjoint Representation

We will describe the recent characterization of generalized Casimir invariant functions by Oleg L. Kurnyavko and Igor V. Shirokov [72,73,75], who have proposed an algebraic method for the construction of Casimir invariants of Lie group coadjoint representations (see Appendix C). Modern invariant theory based on geometric methods, which was classically regarded as non-constructive, has some exceptions that admit a constructive solution, related to constructing invariants of Lie group representations.

Let

G

be a connected Lie group,

T (G)

a representation of the group

G

in the linear space

V

,

T_{g}

the operators associated to the representation of the group

G

on the linear space

V

, then the invariants are given by the following equation:

F (T_{g} x) = F (x), x \in V, g \in G, T_{g} \in T (G), F (x) \in C^{\infty} (V)

(161)

With the properties that:

T_{e} = I, T_{g_{a} g_{b}} = T_{g_{a}} T_{g_{b}}, T_{g^{- 1}} = {(T_{g})}^{- 1}

(162)

Solution is given by the following differential equation:

- \sum_{i, j}^{\dim V} t_{k j}^{i} x^{j} \frac{\partial F (x)}{\partial x^{i}} = 0 with t_{k j}^{i} = {\frac{\partial {(T_{g})}_{j}^{i}}{\partial g^{k}} |}_{g = e} and k = 1, \dots, \dim G

(163)

t_{k j}^{i}

are elements of the matrices of the Lie algebra representation basis of

G

.

We can write this with the operators

t_{k} = - t_{k j}^{i} x^{j} \frac{\partial}{\partial x^{i}}

and

t_{k} F (x) = 0

.

If we consider the dual space

V^{*}

, the co-tangent representation is given by:

〈 T^{*} (g) X, T (g) x 〉 = 〈 X, x 〉

(164)

The co-representation invariants are given by:

t_{k}^{*} F^{*} (X) = 0 with t_{k}^{*} = t_{k j}^{i} X_{i} \frac{\partial}{\partial X_{j}}

(165)

They have underlined the relationship between the invariants of a representation and those of its conjugate representation: the invariants of the conjugate representation can be constructed algebraically from the invariants of the original representation.

Shirokov Theorem 1.

Let

F (x)

be a non-degenerate invariant of the representation

T (G)

, then an invariant of the conjugate representation can be found by Legendre transform:

F^{*} (X) = x^{i} X_{i} - F (x) = 〈 x, X 〉 - F (x) with X = \frac{\partial F (x)}{\partial x} such that X_{i} = \frac{\partial F (x)}{\partial x^{i}}

(166)

and also the converse problem:

F (x) = x^{i} X_{i} - F^{*} (X) = 〈 x, X 〉 - F^{*} (X) with x = \frac{\partial F^{*} (X)}{\partial X} such that x^{i} = \frac{\partial F^{*} (X)}{\partial X_{i}}

(167)

Shirokov has considered

F (x)

an invariant of the representation

T (G)

, and

F^{*} (X)

an invariant of the representation

T^{*} (G)

conjugate to

T (G)

, with the conditions:

- t_{k j}^{i} x^{j} \frac{\partial F (x)}{\partial x^{i}} = 0 and t_{l j}^{i} X_{i} \frac{\partial F^{*} (X)}{\partial X_{j}} = 0

(168)

\begin{array}{l} t_{l j}^{i} X_{i} \frac{\partial F^{*} (X)}{\partial X_{j}} = t_{l j}^{i} X_{i} \frac{\partial}{\partial X_{j}} [x^{k} (X) X_{k} - F (x (X))] = t_{l j}^{i} X_{i} \frac{\partial x^{k}}{\partial X_{j}} X_{k} + t_{l j}^{i} X_{i} x^{k} \frac{\partial X_{k}}{\partial X_{j}} - t_{l j}^{i} X_{i} \frac{\partial F (x)}{\partial x^{k}} \frac{\partial x^{k}}{\partial X_{j}} \\ t_{l j}^{i} X_{i} \frac{\partial F^{*} (X)}{\partial X_{j}} = t_{l j}^{i} X_{i} \frac{\partial x^{k}}{\partial X_{j}} \frac{\partial F (x)}{\partial x^{k}} + t_{l j}^{i} \frac{\partial F (x)}{\partial x^{i}} x^{k} δ_{k}^{j} - t_{l j}^{i} \frac{\partial F (x)}{\partial x^{i}} \frac{\partial F (x)}{\partial x^{k}} \frac{\partial x^{k}}{\partial X_{j}} \\ t_{l j}^{i} X_{i} \frac{\partial F^{*} (X)}{\partial X_{j}} = t_{l j}^{i} x^{j} \frac{\partial F (x)}{\partial x^{i}} = 0 \end{array}

(169)

Invariant Casimir functions of the coadjoint representation have been studied for completely integrable Hamiltonian systems, as classical systems on the orbits of the coadjoint representation. Oleg L. Kurnyavko and Igor V. Shirokov have considered the relationship between invariants of representations of Lie groups and invariants of their conjugate (dual) representations.

Considering the coadjoint action given by:

〈 A d_{g}^{*} X, x 〉 = 〈 X, A d_{g^{- 1}} x 〉, g \in G, X \in g^{*}, x \in g

(170)

Invariants of a coadjoint representation are called Casimir functions, with the property:

F^{*} (A d_{g}^{*} X) = F^{*} (X)

(171)

the infinitesimal invariance is given by the equations:

C_{i j} (X) \frac{\partial F^{*} (X)}{\partial X_{j}} = 0 with C_{i j} (X) = C_{i j}^{k} X_{k}, i, j, k = 1, \dots, \dim g

(172)

The number of functionally independent invariants is determined by the maximal rank of the matrix

C_{i j} (X)

: it equals the index of the Lie algebra

g

, defined by

\mathrm{ind}\, g = \dim g^{*} - \sup_{X \in g^{*}} \mathrm{rank}\, C_{i j} (X)

.
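As a numerical illustration of Equation (172) and of the index formula, the following minimal sketch uses the basis (l_1, l_2, l_0) of sl(2, R) given in Appendix A.2. The candidate invariant X_1^2 + X_2^2 - X_0^2 and all function names are assumptions made for this example; they are not taken from the cited method.

```python
import numpy as np

# Basis (l1, l2, l0) of sl(2, R), as in Appendix A.2.
basis = [
    0.5 * np.array([[1.0, 0.0], [0.0, -1.0]]),   # l1
    0.5 * np.array([[0.0, 1.0], [1.0, 0.0]]),    # l2
    0.5 * np.array([[0.0, -1.0], [1.0, 0.0]]),   # l0
]

def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (l1, l2, l0)."""
    return np.array([2.0 * m[0, 0], m[0, 1] + m[1, 0], m[1, 0] - m[0, 1]])

def structure_constants():
    """C[i, j, k] = C_ij^k, defined by [e_i, e_j] = C_ij^k e_k."""
    return np.array([[coords(ei @ ej - ej @ ei) for ej in basis] for ei in basis])

def casimir_grad(X):
    """Gradient of the candidate invariant F*(X) = X_1^2 + X_2^2 - X_0^2."""
    return np.array([2.0 * X[0], 2.0 * X[1], -2.0 * X[2]])

def residual(X, C):
    """Left-hand side C_ij^k X_k dF*/dX_j of the invariance equation, one entry per i."""
    return np.einsum("ijk,k,j->i", C, X, casimir_grad(X))

def index_estimate(C, X):
    """dim g* minus the rank of C_ij(X), evaluated at one generic point X."""
    return C.shape[0] - np.linalg.matrix_rank(np.einsum("ijk,k->ij", C, X))

C = structure_constants()
X = np.array([0.7, -1.3, 2.1])      # arbitrary generic point of g*
print(residual(X, C))               # numerically zero in every component
print(index_estimate(C, X))         # number of independent invariants
```

For this algebra the matrix C_ij(X) is a 3 x 3 antisymmetric matrix of rank 2 at any nonzero X, so a single functionally independent Casimir function is expected.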

From these adjoint and coadjoint representations, Shirokov has introduced the following theorem:

Shirokov Theorem 2.

Let

F (A d_{g} x) = F (x)

be a non-degenerate invariant of the adjoint representation

A d_{G}

, then conjugate representation invariant, invariant of coadjoint representation

A d_{G}^{*}

can be found by formula:

F^{*} (X) = x^{i} X_{i} - F (x) = 〈 x, X 〉 - F (x) with X = \frac{\partial F (x)}{\partial x} such that X_{i} = \frac{\partial F (x)}{\partial x^{i}}

(173)

and also the converse problem, let

F^{*} (A d_{g}^{*} X) = F^{*} (X)

, the invariant of the adjoint representation

A d_{G}

is given by:

F (x) = x^{i} X_{i} - F^{*} (X) = 〈 x, X 〉 - F^{*} (X) with x = \frac{\partial F^{*} (X)}{\partial X} such that x^{i} = \frac{\partial F^{*} (X)}{\partial X_{i}}

(174)
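As a small worked illustration of Equations (173) and (174), consider sl(2, R) with the basis (l_1, l_2, l_0) of Appendix A.2. The choice of F below (one quarter of the Killing quadratic form of Equation (A11), hence invariant under the adjoint action and non-degenerate) is an assumption made for this example only.

```latex
\begin{aligned}
F(x) &= \tfrac{1}{2}\left( (x^{1})^{2} + (x^{2})^{2} - (x^{0})^{2} \right)
\;\Rightarrow\;
X_{1} = x^{1},\quad X_{2} = x^{2},\quad X_{0} = - x^{0} \\
F^{*}(X) &= x^{i} X_{i} - F(x)
 = \left( X_{1}^{2} + X_{2}^{2} - X_{0}^{2} \right) - \tfrac{1}{2}\left( X_{1}^{2} + X_{2}^{2} - X_{0}^{2} \right)
 = \tfrac{1}{2}\left( X_{1}^{2} + X_{2}^{2} - X_{0}^{2} \right)
\end{aligned}
```

The resulting function is constant on the hyperboloids described in Appendix A.3, which is what one expects of a Casimir function of the coadjoint representation.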

Note:

C_{i j}^{k} X_{k} \frac{\partial F^{*} (X)}{\partial X_{j}} = 0, i, j, k = 1, \dots, \dim g, with \begin{cases} C_{i j}^{k} X_{k} = C_{i j} (X) = B_{i j} \\ B_{X} (x, y) = B_{i j} x^{i} y^{j} = 〈 X, [x, y] 〉 \end{cases}

(175)

8.6. Constructing Generalized Casimir Invariant Functions in Coadjoint Representation

I. V. Shirokov has proposed a method for constructing invariants of the coadjoint representation of Lie groups with an arbitrary dimension and structure based on local symplectic coordinates on the coadjoint orbits. Oleg L. Kurnyavko and Igor V. Shirokov have also proposed a general method for constructing Casimir invariants.

We will give some other developments of Casimir Invariant Functions by A.T. Fomenko and V.V. Trofimov, related to Orbits of the coadjoint representation and the associated canonical symplectic structure.

The coadjoint orbit

O_{h}

passing through the point

h \in g^{*}

is given by

O_{h} = {A d_{g}^{*} h / g \in G} where h \in g^{*}

(176)

\begin{matrix} T_{h} O_{h} = {a d_{ρ}^{*} h / ρ \in g} \subset g^{*}, h \in g^{*} \\ v = {\frac{d [A d_{\exp^{t ρ}}^{*} h]}{d t} |}_{t = 0} \in T_{h} O_{h}, ρ \in g \end{matrix}

(177)

\begin{array}{l} Let (e_{1}, \dots, e_{n}) basis of g, (e^{1}, \dots, e^{n}) basis of g^{*}, with 〈 e^{i}, e_{j} 〉 = δ_{i, j} \\ h = h_{i} e^{i} \Rightarrow v^{i} = {\frac{d [〈 h_{i}, A d_{\exp^{t ρ}}^{*} h 〉]}{d t} |}_{t = 0} = {\frac{d [〈 A d_{\exp^{t ρ}}^{*} h, e_{i} 〉]}{d t} |}_{t = 0} \\ v^{i} = {\frac{d [〈 h, A d_{\exp^{- t ρ}} e_{i} 〉]}{d t} |}_{t = 0} = 〈 h, {\frac{d [A d_{\exp^{- t ρ}} e_{i}]}{d t} |}_{t = 0} 〉 \\ v^{i} = 〈 h, - [ρ, e_{i}] 〉 = - 〈 a d_{ρ}^{*} h, e_{i} 〉 = 〈 v, e_{i} 〉 \Rightarrow v = - a d_{ρ}^{*} h \end{array}

(178)

Kirillov, Kostant and Souriau have introduced the KKS 2-form on coadjoint orbits, which then inherit the structure of a homogeneous symplectic manifold:

\begin{array}{l} ξ, η \in T_{h} O_{h} = {a d_{ρ}^{*} h / ρ \in g} \subset g^{*}, h \in g^{*} \\ ω_{h} (ξ, η) = ω (a d_{ξ_{1}}^{*} h, a d_{η_{1}}^{*} h) = 〈 h, [ξ_{1}, η_{1}] 〉 with ξ = a d_{ξ_{1}}^{*} h and η = a d_{η_{1}}^{*} h \end{array}

(179)

This KKS 2-form

ω

is invariant with respect to the coadjoint action

ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = ω_{h} (ξ, η)

:

\begin{array}{l} ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = ω_{g} (A d_{f}^{*} a d_{ξ_{1}}^{*} h, A d_{f}^{*} a d_{η_{1}}^{*} h) \\ with g = A d_{f}^{*} h, ξ = a d_{ξ_{1}}^{*} h, η = a d_{η_{1}}^{*} h and f \in G, g, h \in g^{*} \\ A d_{f}^{*} a d_{ξ_{1}}^{*} h = a d_{A d_{f} ξ_{1}}^{*} (A d_{f}^{*} h) and A d_{f}^{*} a d_{η_{1}}^{*} h = a d_{A d_{f} η_{1}}^{*} (A d_{f}^{*} h) \\ ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = ω_{g} (a d_{A d_{f} ξ_{1}}^{*} (A d_{f}^{*} h), a d_{A d_{f} η_{1}}^{*} (A d_{f}^{*} h)) \\ ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = ω_{g} (a d_{A d_{f} ξ_{1}}^{*} g, a d_{A d_{f} η_{1}}^{*} g) = 〈 g, [A d_{f} ξ_{1}, A d_{f} η_{1}] 〉 \\ ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = 〈 g, A d_{f} [ξ_{1}, η_{1}] 〉 = 〈 A d_{f^{- 1}}^{*} g, [ξ_{1}, η_{1}] 〉 with h = A d_{f^{- 1}}^{*} g \\ ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = 〈 h, [ξ_{1}, η_{1}] 〉 \\ ω_{g} (A d_{f}^{*} ξ, A d_{f}^{*} η) = ω_{h} (ξ, η) \end{array}

(180)

The KKS 2-form defines a symplectic structure because it is closed,

d ω = 0

, a property that can be proved through its link with the Jacobi identity.

\begin{array}{l} Let g r a d_{s k e w} m such that ω (v, g r a d_{s k e w} m) = v (m) = \sum_{i} v^{i} \frac{\partial m}{\partial x^{i}} smooth vector field on M \\ {m, n} = ω (g r a d_{s k e w} m, g r a d_{s k e w} n) = \sum_{i < j} ω_{i j} {(g r a d_{s k e w} m)}^{i} {(g r a d_{s k e w} n)}^{j} \\ with {(g r a d_{s k e w} m)}^{i} = \sum_{j} ω^{i j} \frac{\partial m}{\partial x^{j}} \Rightarrow {m, n} = \sum_{i < j} ω^{i j} \frac{\partial m}{\partial x^{i}} \frac{\partial n}{\partial x^{j}} \end{array}

(181)

Jacobi identity can be computed:

\begin{array}{l} {m, {n, p}} = - (g r a d_{S k e w} m) {n, p} = - L_{g r a d_{S k e w} m} {n, p} with L_{ζ} : Lie derivative \\ If ξ = g r a d_{S k e w} m \\ L_{ξ} {n, p} = L_{ξ} (ω^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial p}{\partial x^{j}}) = L_{ξ} {(ω)}^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial p}{\partial x^{j}} + ω^{i j} \frac{\partial (ξ n)}{\partial x^{i}} \frac{\partial p}{\partial x^{j}} + ω^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial (ξ p)}{\partial x^{j}} \\ L_{ξ} {n, p} = L_{ξ} {(ω)}^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial p}{\partial x^{j}} + {ξ n, p} + {n, ξ p} = L_{ξ} {(ω)}^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial p}{\partial x^{j}} - {{m, n}, p} - {n, {m, p}} \\ \Rightarrow {m, {n, p}} + {{m, n}, p} + {n, {m, p}} = L_{ξ} {(ω)}^{i j} \frac{\partial n}{\partial x^{i}} \frac{\partial p}{\partial x^{j}} \end{array}

(182)

We use the Élie Cartan formula

L_{ξ} ω = i (ξ) d ω + d i (ξ) ω

. If

ξ

is a Hamiltonian vector field,

d i (ξ) ω = 0

and then

L_{ξ} ω = i (ξ) d ω

. If

d ω = 0

, then the Jacobi identity is satisfied

{m, {n, p}} + {{m, n}, p} + {n, {m, p}} = 0

and conversely.

Let us consider the Berezin bracket:

\begin{array}{l} {m, n} = - C_{i j}^{k} x_{k} \frac{\partial m}{\partial x^{i}} \frac{\partial n}{\partial x^{j}} with [e_{i}, e_{j}] = C_{i j}^{k} e_{k} \\ where (e_{1}, e_{2}, \dots, e_{n}) basis of Lie algebra g, (e^{1}, e^{2}, \dots, e^{n}) basis of dual Lie algebra g^{*} \\ of corresponding coordinates x^{1}, \dots, x^{n} for g, x_{1}, \dots, x_{n} for g^{*} \end{array}

(183)

This Berezin Bracket is given by:

\begin{array}{l} {m, n}_{x} = d m_{x} (a d_{d n (x)}^{*} (x)) = (a d_{d n (x)}^{*} (x)) (d m_{x}) \\ {m, n}_{x} = 〈 x, [d n_{x}, d m_{x}] 〉 = C_{i j}^{k} x_{k} \frac{\partial n}{\partial x^{i}} \frac{\partial m}{\partial x^{j}} with d n_{x} = \frac{\partial n}{\partial x_{i}} e_{i}, d m_{x} = \frac{\partial m}{\partial x_{j}} e_{j} \end{array}

(184)

By developing the Berezin bracket

{m, n} = - C_{i j}^{k} x_{k} \frac{\partial m}{\partial x^{i}} \frac{\partial n}{\partial x^{j}} with [e_{i}, e_{j}] = C_{i j}^{k} e_{k}

, we can prove that the bracket satisfies the Jacobi identity

{m, {n, p}} + {{m, n}, p} + {n, {m, p}} = 0

and then

d ω = 0

.

We will see that differential equations for (semi-)invariants of the coadjoint representation can be established. We denote by

𝕬 𝖓 (g^{*})

the space of analytic functions on the dual space of the Lie algebra

g^{*}

. A function

F^{*} \in 𝕬 𝖓 (g^{*})

is an invariant if for any

g \in G, X \in g^{*}

, we have

F^{*} (A d_{g}^{*} X) = F^{*} (X)

, and is semi-invariant if

F^{*} (A d_{g}^{*} X) = χ (g) F^{*} (X)

where

χ (g)

is a character of the Lie group

G

.

We have a representation of Lie algebras

ϕ : g \to V e c (Γ)

defined on basis

(e_{1}, e_{2}, \dots, e_{n})

in

g

where

V e c (Γ)

is the space of vector fields on

Γ

an open subset in

g^{*}

, given by:

ϕ (e_{i}) = C_{i j}^{k} X_{k} \frac{\partial}{\partial X_{j}}, i = 1, \dots, n

(185)

where

C_{i j}^{k}

is the structure tensor of the Lie algebra

g

in the basis

(e_{1}, e_{2}, \dots, e_{n})

, while

X_{k}

are the coordinates in

g^{*}

in the basis

(e^{1}, e^{2}, \dots, e^{n})

defined by

〈 e^{j}, e_{i} 〉 = δ_{i j}

. The representation does not depend on the choice of the basis, and has the property:

[ϕ (e_{i}), ϕ (e_{j})] = C_{i j}^{k} ϕ (e_{k})

.

We have the property, that:

{\frac{d^{n} F^{*} (A d_{e^{t ξ}}^{*} h)}{d t^{n}} |}_{t = 0} = [{(- ϕ (ξ))}^{n} F^{*}] (h)

(186)

This result is obtained by the following development:

\begin{array}{l} {\frac{d F^{*} (A d_{e^{t ξ}}^{*} h)}{d t} |}_{t = 0} = \frac{\partial F^{*}}{\partial X_{i}} (h) . {\frac{d 〈 A d_{e^{t ξ}}^{*} h, e_{i} 〉}{d t} |}_{t = 0} = \frac{\partial F^{*}}{\partial X_{i}} (h) . {\frac{d 〈 h, A d_{e^{- t ξ}} e_{i} 〉}{d t} |}_{t = 0} \\ {\frac{d 〈 h, A d_{e^{- t ξ}} e_{i} 〉}{d t} |}_{t = 0} = 〈 h, - [ξ, e_{i}] 〉 = - 〈 h, C_{j i}^{k} ξ^{j} e_{k} 〉 with [ξ, e_{i}] = [ξ^{j} e_{j}, e_{i}] = C_{j i}^{k} ξ^{j} e_{k} \\ {\frac{d 〈 h, A d_{e^{- t ξ}} e_{i} 〉}{d t} |}_{t = 0} = 〈 h_{l} e^{l}, - C_{j i}^{k} ξ^{j} e_{k} 〉 = - C_{j i}^{k} ξ^{j} h_{k} \\ {\frac{d F^{*} (A d_{e^{t ξ}}^{*} h)}{d t} |}_{t = 0} = - C_{j i}^{k} ξ^{j} h_{k} \frac{\partial F^{*}}{\partial X_{i}} (h) = (- ϕ (ξ) F^{*}) (h) \end{array}

(187)
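The first-order case of Equation (186) can be checked numerically. The sketch below does so for sl(2, R) with the basis of Appendix A.2, comparing a central finite difference of F*(Ad*_{exp(tξ)} h) at t = 0 with (-ϕ(ξ)F*)(h). The test function F, the truncated-series exponential and all names are assumptions chosen for this illustration.

```python
import numpy as np

basis = [
    0.5 * np.array([[1.0, 0.0], [0.0, -1.0]]),   # l1
    0.5 * np.array([[0.0, 1.0], [1.0, 0.0]]),    # l2
    0.5 * np.array([[0.0, -1.0], [1.0, 0.0]]),   # l0
]

def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (l1, l2, l0)."""
    return np.array([2.0 * m[0, 0], m[0, 1] + m[1, 0], m[1, 0] - m[0, 1]])

# C[i, j, k] = C_ij^k with [e_i, e_j] = C_ij^k e_k
C = np.array([[coords(ei @ ej - ej @ ei) for ej in basis] for ei in basis])

def expm_series(m, terms=30):
    """Matrix exponential by truncated power series (enough for small 2x2 inputs)."""
    result, term = np.eye(2), np.eye(2)
    for n in range(1, terms):
        term = term @ m / n
        result = result + term
    return result

def coadjoint(xi, t, h):
    """Coordinates of Ad*_{exp(t xi)} h, from <Ad*_g h, e_i> = <h, Ad_{g^-1} e_i>."""
    xi_mat = sum(c * e for c, e in zip(xi, basis))
    g, g_inv = expm_series(t * xi_mat), expm_series(-t * xi_mat)
    return np.array([h @ coords(g_inv @ e @ g) for e in basis])

def F(X):
    """Arbitrary smooth, non-invariant test function on g* (hypothetical)."""
    return X[0] * X[2] + X[1] ** 3

def grad_F(X):
    return np.array([X[2], 3.0 * X[1] ** 2, X[0]])

def minus_phi_F(xi, h):
    """(-phi(xi) F)(h) = -C_ji^k xi^j h_k dF/dX_i (h)."""
    return -np.einsum("j,jik,k,i->", xi, C, h, grad_F(h))

def finite_difference(xi, h, eps=1e-5):
    """Central difference of t -> F(Ad*_{exp(t xi)} h) at t = 0."""
    return (F(coadjoint(xi, eps, h)) - F(coadjoint(xi, -eps, h))) / (2.0 * eps)

xi = np.array([0.3, -0.8, 0.5])
h = np.array([1.2, 0.4, -0.9])
print(finite_difference(xi, h), minus_phi_F(xi, h))   # the two values should agree
```

A non-invariant F is used on purpose: for an invariant function both sides vanish and the comparison would be less informative.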

We use then Taylor expansion of

F^{*} (A d_{e^{t ξ}}^{*} h)

given by:

F^{*} (A d_{e^{t ξ}}^{*} h) = F^{*} (h) + \sum_{n = 1}^{\infty} \frac{{(- ϕ (ξ))}^{n} F^{*}}{n!} (h) . t^{n}

(188)

We can observe that

F^{*}

is invariant if

F^{*} (A d_{e^{t ξ}}^{*} h) = F^{*} (h)

and then

{(- ϕ (ξ))}^{n} F^{*} = 0

or

ϕ (ξ) F^{*} = 0

that could be written

C_{j i}^{k} ξ^{j} h_{k} \frac{\partial F^{*}}{\partial X_{i}} (h) = 0

.

The function

F^{*}

is a semi-invariant of the coadjoint representation of the group if and only if:

ϕ (e_{i}) F^{*} = - λ_{i} F^{*} with λ_{i} = d χ (e_{i}) (d χ : derivative of χ at the identity element of the group G)

\begin{array}{l} F^{*} (A d_{e^{t ξ}}^{*} h) = χ (e^{t ξ}) F^{*} (h) with χ (e^{t ξ}) = e^{t χ_{*} (ξ)} \\ [{(- ϕ (ξ))}^{n} F^{*}] (h) = {[χ_{*} (ξ)]}^{n} F^{*} (h) \\ \Rightarrow F^{*} (A d_{e^{t ξ}}^{*} h) = [1 + \sum_{n = 1}^{\infty} \frac{{[χ_{*} (ξ)]}^{n}}{n!} t^{n}] . F^{*} (h) \end{array}

(189)

9. Conclusion: Lie Groups Thermodynamics for Machine Learning

With Lie groups Thermodynamics, we have presented Souriau's tools to extend the Gibbs density to Lie groups [95,96,97,98,99,100,101,102,103,104,105,106,107]. We can refer to other explorations of Lie group representation theory to build exponential families [108,109,110,111] or Information Geometry in Quantum Physics [112,113,114,115,116,117,118,119,120,121,122,123]. Gibbs density estimation is a basic tool in statistical machine learning. Classically, we can associate to any posterior distribution an effective generalized geometric temperature, given by an element of the dual space of the Lie algebra, relating it to the Gibbs prior distribution. Classification rules could be introduced by Gibbs measures defined on parameter sets and depending on the observed sample value. A Gibbs measure is a special kind of probability measure used in statistical mechanics to describe the state of a particle system driven by a given energy function at some given temperature. Gibbs measures will be realized as minimizers of the average loss value under entropy constraints. In this extension for Lie groups, an important tool is the log-Laplace transform related to the Massieu Characteristic Function in Thermodynamics (a re-parameterization of the free energy by Planck temperature preserving Legendre transform with respect to Entropy). As we want to deal with Lie group data for Machine Learning, we will consider tools very similar to those used in statistical mechanics to describe particle systems with many degrees of freedom. Comparing any posterior distribution with a Gibbs prior distribution makes it possible to build an estimator that can be proved to adaptively reach the best possible asymptotic error rate (by temperature selection of a Gibbs posterior distribution built within a single parametric model).
Estimators derived from Gibbs posteriors show excellent performance in diverse tasks, such as classification, regression and ranking. The usual recommendation is to sample from a Gibbs posterior using MCMC (Markov chain Monte Carlo). With covariant Souriau Gibbs density, it is possible to extend MCMC and Gibbs sampler approach for Lie Groups Machine Learning.

More recently, the use of perturbation techniques has been proposed as an alternative to MCMC techniques for sampling. These results have been extended to conditional random field losses, proving that the expected maximum under low-rank perturbations provides an upper bound on the log partition (what we call the Massieu characteristic function). New lower bounds on the partition function and a new unbiased sequential sampler for the Gibbs distribution based on low-rank perturbations have been introduced. All these methods are based on sampling from the Gibbs distribution and upper-bounding the log partition function. All these results are synthesized in [124], where the authors also propose a new general method, with connections to the recently proposed Fenchel-Young losses [125], using a doubly stochastic scheme for the minimization of these losses, for unsupervised and supervised learning. This is a generalization of the Gibbs distribution. Methods for learning the parameters of a Gibbs distribution on data

{(y_{i})}_{i = 1, \dots, n}

are based on maximization of the likelihood:

{\hat{l}}_{n} (θ) = \frac{1}{n} \sum_{i = 1}^{n} \log p_{G i b b s, θ} (y_{i}) = \frac{1}{n} \sum_{i = 1}^{n} 〈 y_{i}, θ 〉 - \log ψ (θ)

(190)

which is optimized by gradient methods using the gradient of the empirical log-likelihood, given by:

\nabla_{θ} {\hat{l}}_{n} (θ) = {\hat{y}}_{n} - E_{G i b b s, θ} [y]

(191)
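The moment-matching form of the gradient in Equation (191) can be illustrated on a toy Gibbs distribution with a finite state space, where the expectation is computable exactly. In this minimal sketch the state space `support`, the sample `data` and all function names are hypothetical; ψ(θ) is taken to be the partition function summed over the finite state space.

```python
import numpy as np

def log_partition(theta, support):
    """log psi(theta) = log sum_y exp<y, theta>, computed in a numerically stable way."""
    scores = support @ theta
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def gibbs_mean(theta, support):
    """Expectation E_theta[y] under p_theta(y) proportional to exp<y, theta>."""
    scores = support @ theta
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ support

def log_likelihood(theta, data, support):
    """Empirical log-likelihood: mean of <y_i, theta> minus log psi(theta)."""
    return (data @ theta).mean() - log_partition(theta, support)

def grad_log_likelihood(theta, data, support):
    """Moment-matching gradient: empirical mean minus model mean."""
    return data.mean(axis=0) - gibbs_mean(theta, support)

rng = np.random.default_rng(0)
support = rng.normal(size=(6, 3))                    # hypothetical finite state space
data = support[rng.integers(0, 6, size=50)]          # hypothetical observed sample
theta = rng.normal(size=3)

# A few plain gradient-ascent steps on the concave log-likelihood.
for _ in range(100):
    theta = theta + 0.05 * grad_log_likelihood(theta, data, support)
print(log_likelihood(theta, data, support))
```

On large or structured state spaces the model mean is no longer available in closed form, which is the difficulty addressed by the perturbation methods discussed next.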

For this moment-matching method, computing the expectation under the Gibbs distribution is challenging in some cases. This approach has been replaced by computing

p_{θ}

, with a method called “perturb-and-MAP” to learn the parameters in this model as a proxy for log-likelihood. This minimization is equivalent to maximizing previous equation by substituting the log-partition

\log ψ (θ)

with:

F_{ε} (θ) = E [F (θ + ε V)] = E [\max_{y \in C} 〈 y, θ + ε V 〉] with a random noise vector ε V, ε > 0

(192)

This approach could be linked with the use of Fenchel-Young losses [125]. In the perturbed model, the Fenchel-Young loss is given by:

L_{ε} (θ; y) = F_{ε} (θ) + ε Ω (y) - 〈 θ, y 〉 = D_{ε Ω} (y, {\hat{y}}_{ε}^{*} (θ))

(193)

with loss gradient

\nabla L_{ε} (θ; y) = \nabla F_{ε} (θ) - y = y_{ε}^{*} (θ) - y

where

y_{ε}^{*} (θ) = E_{p_{θ} (y)} [y] = E [\arg \max_{y \in C} 〈 y, θ + ε V 〉]

and

D_{ε Ω} (y, {\hat{y}}_{ε}^{*} (θ))

the Bregman divergence associated with

ε Ω

. As

F_{ε}

generalizes the log-sum-exp function on the simplex, its dual

Ω

is a generalization of the negative entropy (which is the Fenchel dual of log-sum-exp). These connections have been studied in [126].
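The link between the perturbed maximum and log-sum-exp can be seen in the simplest setting, sketched below by Monte Carlo. The assumptions here are ours: C is the set of one-hot vectors, ε = 1, and V has independent standard Gumbel components. In that case the expected perturbed maximum equals log-sum-exp plus the Euler-Mascheroni constant, and the expected argmax equals the softmax (the Gumbel-max property). Function names are hypothetical.

```python
import numpy as np

def perturbed_max(theta, eps=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimates of F_eps(theta) and of E[argmax] over one-hot vectors."""
    rng = np.random.default_rng(seed)
    noise = rng.gumbel(size=(n_samples, theta.size))   # V: standard Gumbel noise (assumption)
    scores = theta + eps * noise                       # <y, theta + eps V> for each one-hot y
    value = scores.max(axis=1).mean()                  # estimate of E[max_y <y, theta + eps V>]
    winners = scores.argmax(axis=1)
    mean_argmax = np.bincount(winners, minlength=theta.size) / n_samples
    return value, mean_argmax

theta = np.array([0.5, -1.0, 2.0])
value, y_star = perturbed_max(theta)

logsumexp = np.log(np.exp(theta).sum())
softmax = np.exp(theta) / np.exp(theta).sum()

print(value - np.euler_gamma, logsumexp)   # close, up to Monte Carlo error
print(y_star, softmax)                     # close, up to Monte Carlo error
```

Other noise distributions or other sets C give different smoothed functions; the Gumbel case is only the one where the classical Gibbs quantities are recovered.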

To conclude, we have seen that Lie group tools based on Representation Theory and Orbits Methods could be used with Souriau-Fisher Metric on Coadjoint Orbits as an extension of Fisher Metric for Lie group through homogeneous Symplectic Manifolds on Lie group Co-Adjoint Orbits.

We can then benefit from different tools based on Souriau Lie groups Thermodynamics and Kirillov Representation Theory, as illustrated in Figure 6, for:

Supervised Machine Learning

⚬: Geodesic Natural Gradient on Lie Algebra: Extension of Neural Network Natural Gradient from Information Geometry on Lie Algebra for Lie Groups Machine Learning.
⚬: Souriau Maximum Entropy Density on Co-Adjoint Orbits: Covariant Maximum Entropy Probability Density for Lie groups defined with Souriau Moment Map, Co-Adjoint Orbits Method & Kirillov Representation Theory
⚬: Symplectic Integrator preserving Moment Map: Extension of Neural Network Natural Gradient to Geometric Integrators as Symplectic integrators that preserve moment map

Non-Supervised Machine Learning

⚬: Souriau Exponential Map on Lie Algebra: Exponential Map for Geodesic Natural Gradient on Lie Algebra based on Souriau Algorithm for Matrix Characteristic Polynomial
⚬: Fréchet Geodesic Barycenter by Hermann Karcher Flow: Extension of Mean/Median on Lie group by Fréchet Definition of Geodesic Barycenter on Souriau-Fisher Metric Space, solved by Karcher Flow.
⚬: Mean-Shift on Lie groups with Souriau-Fisher Distance: Extension of Mean-Shift for Homogeneous Symplectic Manifold and Souriau-Fisher Metric Space.

[There is nothing more in physical theories than symmetry groups except the mathematical construction which allows precisely to show that there is nothing more] « Il n’y a rien de plus dans les théories physiques que les groupes de symétrie si ce n’est la construction mathématique qui permet précisément de montrer qu’il n’y a rien de plus ».
Jean-Marie Souriau (see Figure 7)

La notion classique d’ensemble canonique de Gibbs est étendue au cas d’une variété symplectique sur laquelle un groupe de Lie possède une action symplectique (“groupe dynamique”). La définition rigoureuse donnée ici permet d’étendre un certain nombre de propriétés thermodynamiques classiques (la température est ici un élément de l’algèbre de Lie du groupe, la chaleur un élément de son dual), notamment des inégalités de convexité. Dans le cas de groupes non commutatifs, des propriétés particulières apparaissent: la symétrie est spontanément brisée, certaines relations de type cohomologique sont vérifiées dans l’algèbre de Lie du groupe [The classical notion of Gibbs’ canonical ensemble is extended to the case of a symplectic manifold on which a Lie group has a symplectic action (“dynamic group”). The rigorous definition given here makes it possible to extend a certain number of classical thermodynamic properties (the temperature here is an element of the Lie group algebra, heat an element of its dual), notably inequalities of convexity. In the case of non-commutative groups, particular properties appear: the symmetry is spontaneously broken, certain relations of cohomological type are verified in the Lie algebra of the group].
Jean-Marie Souriau, Mécanique Statistique, Groupes de Lie et Cosmologie, colloque CNRS n°237 – Géométrie Symplectique et physique mathématique

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Coadjoint orbits and Moment Map for SU(1,1)

We give more detailed developments to obtain the SU(1,1)/K coadjoint orbit and moment map from [127]. SU(1,1) has been intensively studied for coherent states in Quantum Physics [128].

Consider the group

S L (2, R)

S L (2, R) = {(\begin{matrix} m & p \\ q & n \end{matrix}) \in G L (2, R) / m n - p q = 1}

(A1)

Elements of

S L (2, R)

could be written with Iwasawa decomposition:

g = k_{θ} . a_{t} . n_{b} with k_{θ} \in K, a_{t} \in A, n_{b} \in N

(A2)

\begin{array}{l} K = {k_{θ} = (\begin{matrix} \cos θ & - \sin θ \\ \sin θ & \cos θ \end{matrix}) / 0 \leq θ < 2 π} with e^{- i θ} = \frac{m - i q}{\sqrt{m^{2} + q^{2}}} \\ A = {a_{t} = (\begin{matrix} e^{t} & 0 \\ 0 & e^{- t} \end{matrix}) / t \in R} with e^{2 t} = m^{2} + q^{2} \\ N = {n_{b} = (\begin{matrix} 1 & b \\ 0 & 1 \end{matrix}) / b \in R} with b = \frac{m p + q n}{m^{2} + q^{2}} \end{array}

(A3)
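The decomposition (A2) can be checked numerically. The sketch below recovers (θ, t, b) from the entries of g and rebuilds g from the three factors. Multiplying out k_θ a_t n_b with a_t = diag(e^t, e^{-t}) gives b = (mp + qn)/(m^2 + q^2), which is the expression used here; the function names are hypothetical.

```python
import numpy as np

def iwasawa(g):
    """Split g = [[m, p], [q, n]] in SL(2, R) into the parameters of k_theta a_t n_b."""
    (m, p), (q, n) = g
    r2 = m * m + q * q
    theta = np.mod(np.arctan2(q, m), 2.0 * np.pi)   # cos(theta) = m/sqrt(r2), sin(theta) = q/sqrt(r2)
    t = 0.5 * np.log(r2)                            # e^{2t} = m^2 + q^2
    b = (m * p + q * n) / r2                        # from expanding the product k a n
    return theta, t, b

def rebuild(theta, t, b):
    """Product k_theta @ a_t @ n_b."""
    k = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    a = np.diag([np.exp(t), np.exp(-t)])
    n = np.array([[1.0, b], [0.0, 1.0]])
    return k @ a @ n

g = np.array([[2.0, 1.0], [3.0, 2.0]])   # determinant 1
theta, t, b = iwasawa(g)
print(np.allclose(rebuild(theta, t, b), g))
```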

K is a maximal compact sub-group of

G_{S L}

.

Appendix A.1. Group of Unit Disk Automorphisms

Consider

S U (1, 1)

sub-group of

S L (2, C)

given by

S U (1, 1) = {A = (\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) / \det (A) = {| a |}^{2} - {| b |}^{2} = 1, a, b \in C}

(A4)

The following inner automorphism transforms

S L (2, R)

to

S U (1, 1)

, inducing an isomorphism between them:

\begin{array}{l} S L (2, R) \to S U (1, 1) \\ g \mapsto C g C^{- 1} with C = \frac{1}{\sqrt{2}} (\begin{matrix} 1 & - i \\ 1 & i \end{matrix}) \end{array}

(A5)

S U (1, 1)

acts on the Poincaré unit disk

D = {z \in C / | z | < 1}

:

g^{- 1} . z = \frac{a z + b}{b^{*} z + a^{*}} with g^{- 1} = (\begin{matrix} a & b \\ b^{*} & a^{*} \end{matrix}) \in S U (1, 1) and z \in D

(A6)
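A quick numerical check of Equations (A5) and (A6), written as a minimal sketch: conjugating a real unimodular matrix by C should give a matrix of the form [[a, b], [b*, a*]] with |a|^2 - |b|^2 = 1, and the associated homographic map should send the unit disk into itself. The helper names are hypothetical.

```python
import numpy as np

C = np.array([[1.0, -1.0j], [1.0, 1.0j]]) / np.sqrt(2.0)

def to_su11(g):
    """Image C g C^{-1} of a matrix g of SL(2, R)."""
    return C @ g @ np.linalg.inv(C)

def mobius(u, z):
    """Homographic action z -> (a z + b) / (b* z + a*) of u = [[a, b], [b*, a*]]."""
    (a, b), (b_conj, a_conj) = u
    return (a * z + b) / (b_conj * z + a_conj)

g = np.array([[2.0, 1.0], [3.0, 2.0]])      # determinant 1
u = to_su11(g)

print(u)                                            # [[a, b], [conj(b), conj(a)]]
print(abs(u[0, 0]) ** 2 - abs(u[0, 1]) ** 2)        # equals 1
print(abs(mobius(u, 0.3 + 0.4j)))                   # stays below 1
```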

Valentine Bargmann has parameterized

S U (1, 1)

:

\begin{cases} γ = \frac{b}{a} \\ ω = \arg (a) \mod 2 π \end{cases} \Rightarrow \begin{cases} | γ | < 1 \\ a = e^{i ω} {(1 - {| γ |}^{2})}^{- 1 / 2} \\ b = e^{i ω} γ {(1 - {| γ |}^{2})}^{- 1 / 2} \end{cases}

(A7)

Then

S U (1, 1) = {(γ, ω) / | γ | < 1, ω \in] - π, + π]}

with Group composition law given by:

\begin{array}{l} (γ, ω) . (γ^{'}, ω^{'}) = (γ^{″}, ω^{″}) \\ \begin{cases} γ^{″} = [γ^{'} + γ e^{- 2 i ω^{'}}] {[1 + γ γ^{' *} e^{- 2 i ω^{'}}]}^{- 1} \\ ω^{″} = ω + ω^{'} + \frac{1}{2 i} \log ([1 + γ γ^{' *} e^{- 2 i ω^{'}}] {[1 + γ^{*} γ^{'} e^{2 i ω^{'}}]}^{- 1}) \mod 2 π \end{cases} \\ g = (γ, ω) \Rightarrow g^{- 1} = (- γ e^{2 i ω}, - ω) \end{array}

(A8)
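The Bargmann coordinates can be tested against plain matrix multiplication. The sketch below assumes γ = b/a and ω = arg(a), and uses the composition rule obtained by expanding the matrix product under that assumption: γ'' = (γ' + γ e^{-2iω'}) / (1 + γ γ'^* e^{-2iω'}) and ω'' = ω + ω' + arg(1 + γ γ'^* e^{-2iω'}). The function names are hypothetical.

```python
import numpy as np

def from_bargmann(gamma, omega):
    """Matrix [[a, b], [b*, a*]] of SU(1,1) with gamma = b/a and omega = arg(a)."""
    a = np.exp(1j * omega) / np.sqrt(1.0 - abs(gamma) ** 2)
    b = gamma * a
    return np.array([[a, b], [np.conj(b), np.conj(a)]])

def to_bargmann(u):
    """Coordinates (gamma, omega) of u, with omega in (-pi, pi]."""
    a, b = u[0, 0], u[0, 1]
    return b / a, np.angle(a)

def compose(p, q):
    """Coordinates of the matrix product from_bargmann(*p) @ from_bargmann(*q)."""
    (gamma, omega), (gamma2, omega2) = p, q
    phase = np.exp(-2j * omega2)
    denom = 1.0 + gamma * np.conj(gamma2) * phase
    return (gamma2 + gamma * phase) / denom, omega + omega2 + np.angle(denom)

def inverse(p):
    """Coordinates of the inverse element, (-gamma e^{2 i omega}, -omega)."""
    gamma, omega = p
    return -gamma * np.exp(2j * omega), -omega

p = (0.3 + 0.2j, 0.7)
q = (-0.5 + 0.1j, -1.9)
lhs = from_bargmann(*compose(p, q))
rhs = from_bargmann(*p) @ from_bargmann(*q)
print(np.allclose(lhs, rhs))
```

Angles are compared through the matrices rather than directly, so the reduction modulo 2π does not need to be handled explicitly.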

S U (1, 1)

is the topological product of the unit disk and the circle.

Appendix A.2. Universal covering of $S L (2, R)$

If we consider

G = {(γ, ω) / | γ | < 1, ω \in R}

, the following mapping:

\begin{array}{l} Θ : G \to S U (1, 1) \\ Θ (γ, ω) = (γ, ω \mod 2 π) \end{array} with K e r Θ = {(0, 2 k π) / k \in Z}

(A9)

The topological product of the unit disk

D

in

C

and the real line

R

is the universal covering of

S U (1, 1)

.

A maximal compact subgroup of

S U (1, 1)

is

C K C^{- 1} = {(\begin{matrix} e^{- i θ} & 0 \\ 0 & e^{i θ} \end{matrix}) / θ \in R}

and the subgroup for

G

is

Θ^{- 1} (C K C^{- 1}) = {(0, θ) / θ \in R}

.

Pukanszky and Sally have defined the irreducible unitary representations of

S \tilde{L} (2, R)

, classified into principal series, discrete series and complementary series.

The Lie algebra

g

of

G

and

S U (1, 1)

is given by:

l_{1} = \frac{1}{2} (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}), l_{2} = \frac{1}{2} (\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}), l_{0} = \frac{1}{2} (\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix})

(A10)

with the commutation relation:

[l_{0}, l_{1}] = l_{2}, [l_{1}, l_{2}] = - l_{0}, [l_{2}, l_{0}] = l_{1}

.

The dual space

g^{*}

of the Lie algebra

g

is identified with

g

thanks to the Killing form:

B (\sum_{i = 1, 2, 0} x_{i} l_{i}, \sum_{i = 1, 2, 0} y_{i} l_{i}) = 2 (x_{1} y_{1} + x_{2} y_{2} - x_{0} y_{0})

(A11)
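The commutation relations and the Killing form (A11) can be verified directly from the matrices (A10). In this minimal sketch the Killing form is computed from its definition B(x, y) = tr(ad_x ad_y); the helper names are hypothetical.

```python
import numpy as np

l1 = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]])
l2 = 0.5 * np.array([[0.0, 1.0], [1.0, 0.0]])
l0 = 0.5 * np.array([[0.0, -1.0], [1.0, 0.0]])
basis = [l1, l2, l0]

def bracket(x, y):
    """Matrix commutator [x, y]."""
    return x @ y - y @ x

def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (l1, l2, l0)."""
    return np.array([2.0 * m[0, 0], m[0, 1] + m[1, 0], m[1, 0] - m[0, 1]])

def ad(x):
    """Matrix of ad_x = [x, .] in the basis (l1, l2, l0); column j is [x, e_j]."""
    return np.column_stack([coords(bracket(x, e)) for e in basis])

def killing(x, y):
    """Killing form B(x, y) = tr(ad_x ad_y)."""
    return np.trace(ad(x) @ ad(y))

gram = np.array([[killing(x, y) for y in basis] for x in basis])
print(gram)   # diagonal with entries 2, 2, -2 in the order (l1, l2, l0)
```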

Appendix A.3. Coadjoint Orbits of $S U (1, 1)$

Considering adjoint representation

A d_{g} : g \to g

, and the coadjoint representation as the transpose linear mapping of

A d_{g^{- 1}}

, written by

A d_{g}^{*} = {(A d_{g^{- 1}})}^{*}

,

A d_{g}^{*} : g^{*} \to g^{*}

.

Let

f = (η, 0, 0)

in the basis

{l_{1}, l_{2}, l_{0}}

with

η > 0

. The coadjoint orbit of

f

is:

\begin{array}{l} O_{η} = {g f g^{- 1} / g = (\begin{matrix} a & c \\ b & d \end{matrix}) \in S L (2, R)} \\ O_{η} = {η (1 + 2 b c), η (b d - a c), η (b d + a c) / a, b, c, d \in R with a d - b c = 1} \end{array}

(A12)

The stabilizer of

f

is

G_{1} (f) = {g \in G / g . f = f} = A \cup {- A} with A = {a_{t} = (\begin{matrix} e^{t} & 0 \\ 0 & e^{- t} \end{matrix}) / t \in R}

(A13)

S L (2, R) = K^{'} N A^{'} with A^{'} = A \cup {- A}

where

N = {n_{b} = (\begin{matrix} 1 & b \\ 0 & 1 \end{matrix}) / b \in R}

and

K^{'} = {k_{θ} = (\begin{matrix} \cos θ & - \sin θ \\ \sin θ & \cos θ \end{matrix}) / 0 \leq θ < π}

and as

S L (2, R) / G_{1} (f)

is in bijection with

O_{η}

, then

O_{η}

is in bijection with

K^{'} N

, diffeomorphic to

K / {I_{d}} \times N

. Then every element

x \in O_{η}

can be written through this bijection

x = k_{θ} n_{b}, k_{θ} \in K^{'}, n_{b} \in N

.

O_{η}

is the set of points

l = (x_{1}, x_{2}, x_{0}) \in g^{*}

such that:

x_{1}^{2} + x_{2}^{2} - x_{0}^{2} = η^{2} > 0

, a one-sheeted hyperboloid in

g^{*}

.
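The closed form (A12) and the hyperboloid equation can be confirmed numerically. The sketch below follows the convention of (A12), with g = [[a, c], [b, d]] acting by g f g^{-1} on f = η l_1; the helper names are hypothetical.

```python
import numpy as np

l1 = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]])

def coords(m):
    """Coordinates (x1, x2, x0) of a traceless 2x2 matrix in the basis (l1, l2, l0)."""
    return np.array([2.0 * m[0, 0], m[0, 1] + m[1, 0], m[1, 0] - m[0, 1]])

def orbit_point(g, eta):
    """Coordinates of g f g^{-1} for f = eta * l1."""
    return coords(g @ (eta * l1) @ np.linalg.inv(g))

def closed_form(a, b, c, d, eta):
    """Expression (A12) for g = [[a, c], [b, d]] with a d - b c = 1."""
    return eta * np.array([1.0 + 2.0 * b * c, b * d - a * c, b * d + a * c])

a, b, c, d = 2.0, 3.0, 1.0, 2.0          # a d - b c = 1
eta = 1.5
x = orbit_point(np.array([[a, c], [b, d]]), eta)

print(x, closed_form(a, b, c, d, eta))
print(x[0] ** 2 + x[1] ** 2 - x[2] ** 2, eta ** 2)   # both equal eta^2
```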

We will now study the discrete series.

Appendix A.4. Quantization of Kähler Manifold

Let

(M, ω)

be a Kählerian manifold of dimension 2n (complex manifold with complex structure J), for which the bilinear form

g (X, Y) = ω (X, J Y)

defines a Riemannian structure on

M

.

We have seen that

S U (1, 1)

is conjugate to

S L_{2} (R)

in

S L (2, C)

\begin{array}{l} (\begin{matrix} α & β \\ β^{*} & α^{*} \end{matrix}) \in S U (1, 1), (\begin{matrix} a & b \\ c & d \end{matrix}) \in S L_{2} (R) \\ (\begin{matrix} α & β \\ β^{*} & α^{*} \end{matrix}) = \frac{- i}{2} (\begin{matrix} 1 & - i \\ 1 & i \end{matrix}) (\begin{matrix} a & b \\ c & d \end{matrix}) (\begin{matrix} i & i \\ - 1 & 1 \end{matrix}) \end{array}

(A14)

{\begin{cases} α = \frac{1}{2} [(a + d) + i (b - c)] \\ β = \frac{1}{2} [(a - d) - i (b + c)] \end{cases} \Rightarrow {\begin{cases} α + α^{*} = a + d \\ α - α^{*} = i (b - c) \\ β + β^{*} = a - d \\ β - β^{*} = - i (b + c) \end{cases} \Rightarrow {\begin{cases} a = \frac{1}{2} [(α + α^{*}) + (β + β^{*})] \\ b = \frac{1}{2 i} [(α - α^{*}) - (β - β^{*})] \\ c = \frac{- 1}{2 i} [(α - α^{*}) + (β - β^{*})] \\ d = \frac{1}{2} [(α + α^{*}) - (β + β^{*})] \end{cases}

(A15)

Let

f = (0, 0, h - \frac{1}{2})

in the basis

{l_{1}, l_{2}, l_{0}}

where

l_{1} = \frac{1}{2} (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}), l_{2} = \frac{1}{2} (\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}), l_{0} = \frac{1}{2} (\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix})

with

h > \frac{1}{2}

(respectively

f = (0, 0, h + \frac{1}{2})

with

h < - \frac{1}{2}

), the coadjoint orbit of

f

is:

\begin{array}{l} O_{h} = {g f g^{- 1} / g = (\begin{matrix} a & c \\ b & d \end{matrix}) \in S L (2, R)} \\ O_{h} = {(| h | - \frac{1}{2}) sign h (a c + b d); \frac{(| h | - \frac{1}{2}) sign h}{2} (d^{2} + c^{2} - (a^{2} + b^{2})); \frac{(| h | - \frac{1}{2}) sign h}{2} (d^{2} + c^{2} + a^{2} + b^{2})} \end{array}

(A16)

O_{h}

appears as points

l = (x_{1}, x_{2}, x_{0}) \in g^{*}

such that:

x_{1}^{2} + x_{2}^{2} - x_{0}^{2} = - {(| h | - \frac{1}{2})}^{2} < 0 with \begin{cases} x_{0} > 0 if h > \frac{1}{2} \\ x_{0} < 0 if h < - \frac{1}{2} \end{cases}

(A17)

O_{h}

is one sheet of a two-sheeted hyperboloid in

g^{*}

, associated to representation

π_{h}

of the discrete series of

G

.

{\begin{cases} x_{1} = h^{'} (a c + b d) \\ x_{2} = \frac{h^{'}}{2} (d^{2} - a^{2} + c^{2} - b^{2}) \\ x_{0} = \frac{h^{'}}{2} (d^{2} + a^{2} + c^{2} + b^{2}) \end{cases} with h^{'} = (| h | - \frac{1}{2}) sign h

(A18)

Then, we obtain:

{\begin{cases} x_{1} = i h^{'} (α β - α^{*} β^{*}) \\ x_{2} = - h^{'} (α β + α^{*} β^{*}) \\ x_{0} = h^{'} (α α^{*} + β β^{*}) \end{cases}

(A19)

If we set:

\begin{cases} α = \frac{e^{i θ}}{\sqrt{1 - {| z |}^{2}}} \\ β = \frac{z e^{- i θ}}{\sqrt{1 - {| z |}^{2}}} \end{cases} \Rightarrow \begin{cases} x_{1} = \frac{i h^{'} (z - z^{*})}{1 - {| z |}^{2}} \\ x_{2} = \frac{- h^{'} (z + z^{*})}{1 - {| z |}^{2}} \\ x_{0} = \frac{h^{'} (1 + {| z |}^{2})}{1 - {| z |}^{2}} \end{cases}

(A20)

We use the parametrization of

O_{h}

by unit disk

D

:

\begin{array}{l} D = {z \in C / | z | < 1} \to O_{h} \\ z \mapsto (x_{1}, x_{2}, x_{0}) \end{array}

(A21)

\begin{cases} x_{1} = - 2 (| h | - \frac{1}{2}) sign h \frac{Im (z)}{1 - {| z |}^{2}} \\ x_{2} = - 2 (| h | - \frac{1}{2}) sign h \frac{Re (z)}{1 - {| z |}^{2}} \\ x_{0} = (| h | - \frac{1}{2}) sign h \frac{1 + {| z |}^{2}}{1 - {| z |}^{2}} \end{cases}

(A22)
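The parametrization (A21) and (A22) can be checked against the orbit equation. The minimal sketch below maps a point of the disk to (x_1, x_2, x_0) and verifies that it satisfies x_1^2 + x_2^2 - x_0^2 = -(|h| - 1/2)^2, with the sign of x_0 given by the sign of h. The function names are hypothetical.

```python
import numpy as np

def disk_to_orbit(z, h):
    """Coordinates (x1, x2, x0) of the orbit point attached to z in the unit disk."""
    h_prime = (abs(h) - 0.5) * np.sign(h)
    denom = 1.0 - abs(z) ** 2
    return np.array([
        -2.0 * h_prime * z.imag / denom,
        -2.0 * h_prime * z.real / denom,
        h_prime * (1.0 + abs(z) ** 2) / denom,
    ])

def quadratic(x):
    """Quadratic form x1^2 + x2^2 - x0^2."""
    return x[0] ** 2 + x[1] ** 2 - x[2] ** 2

z = 0.3 - 0.6j
for h in (1.5, -2.0):
    x = disk_to_orbit(z, h)
    print(h, quadratic(x), -(abs(h) - 0.5) ** 2, x[2])
```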

This parametrization provides a Kählerian structure for

O_{h}

, inherited from

D

.

We have

O_{h} = G / \tilde{K} = S L (2, R) / K

, and as

S L (2, R)

is isomorphic to

S U (1, 1)

,

O_{h}

is identified with

S U (1, 1) / K_{0} = D

. Stabilizer of the origin is

K_{0} = {(\begin{matrix} α & 0 \\ 0 & α^{*} \end{matrix}) / | α | = 1}

. Then

O_{h}

is globally diffeomorphic to

D

for all

h

.

We can compute Liouville measure on

O_{h}

. This measure is

ω_{h} = \frac{ω_{h}^{'}}{2 π}

where

ω_{h}^{'}

is the canonical 2-form on orbit. Parameterization on

O_{h}

gives:

ω_{h} = \frac{ω_{h}^{'}}{2 π} = \frac{- i (2 | h | - 1)}{2 π (1 - {| z |}^{2})} sign h {| d z |}^{2}

(A23)

We can observe that:

\begin{cases} d x_{1} = \frac{i}{{(1 - {| z |}^{2})}^{2}} [(z - z^{*}) (1 - z z^{*}) d h^{'} + h^{'} (1 - z^{* 2}) d z - h^{'} (1 - z^{2}) d z^{*}] \\ d x_{2} = \frac{- 1}{{(1 - {| z |}^{2})}^{2}} [(z + z^{*}) (1 - z z^{*}) d h^{'} + h^{'} (1 + z^{* 2}) d z + h^{'} (1 + z^{2}) d z^{*}] \\ d x_{0} = \frac{1}{{(1 - {| z |}^{2})}^{2}} [(1 + z z^{*}) (1 - z z^{*}) d h^{'} + 2 z^{*} h^{'} d z + 2 z h^{'} d z^{*}] \end{cases}

(A24)

d x_{1} d x_{2} d x_{0} = \frac{- i {h^{'}}^{2} (1 - {| z |}^{2})}{{(1 - {| z |}^{2})}^{6}} \det (\begin{matrix} z - z^{*} & 1 - {| z |}^{2} & - (1 - z^{2}) \\ z + z^{*} & 1 + z^{* 2} & 1 + z^{2} \\ 1 + {| z |}^{2} & 2 z^{*} & 2 z \end{matrix})

(A25)

d x_{1} d x_{2} d x_{0} = \frac{- 2 i {h^{'}}^{2}}{(1 - {| z |}^{2})} d h^{'} {| d z |}^{2} = \frac{2}{π} h^{'} d h^{'} ω_{h}

(A26)

This is the measure defined on the open set of

g^{*}

given by

l_{1}^{2} + l_{2}^{2} - l_{0}^{2} < 0

.

Appendix B. Bargmann Parameterization of SU(1,1)

S U (1, 1)

is isomorphic to

S L (2, R) = S p (2, R)

through the complex unitary matrix

W

:

S L (2, R) = {g = (\begin{matrix} a & b \\ c & d \end{matrix}) / \det g = a d - b c = 1}

(A27)

S p (2, R) = {g = (\begin{matrix} a & b \\ c & d \end{matrix}) / g J g^{T} = J, J = (\begin{matrix} 0 & + 1 \\ - 1 & 0 \end{matrix})}

(A28)

W = \frac{1}{\sqrt{2}} (\begin{matrix} ω^{- 1} & ω^{- 1} \\ - ω & ω \end{matrix}) = {(W^{+})}^{- 1} with ω = e^{i π / 4} = \frac{1}{\sqrt{2}} (1 + i)

(A29)

If we observe that

W^{- 1} J W = - i M

, with M = diag(1, -1), the isomorphism is given explicitly by:

(\begin{matrix} a & b \\ c & d \end{matrix}) = g (u) = W u W^{- 1} = (\begin{matrix} Re (α + β) & - Im (α - β) \\ Im (α + β) & Re (α - β) \end{matrix})

(A30)

(\begin{matrix} α & β \\ β^{*} & α^{*} \end{matrix}) = u (g) = W^{- 1} g W = \frac{1}{2} (\begin{matrix} (a + d) - i (b - c) & (a - d) + i (b + c) \\ (a - d) - i (b + c) & (a + d) + i (b - c) \end{matrix})

(A31)
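The two directions (A30) and (A31) of the isomorphism can be checked numerically. The sketch below builds an element of SU(1,1) from the parameters (θ, z) of (A20), maps it to a real matrix with (A30), compares the result with the conjugation by W from (A29), and maps it back with (A31). Only the Python standard library is used, and the helper names (`su11_element`, `g_of_u`, `u_of_g`) are illustrative.

```python
import cmath
import math

OMEGA = cmath.exp(1j * math.pi / 4)   # ω = e^{iπ/4}, as in (A29)
S = 1.0 / math.sqrt(2.0)
W = [[S / OMEGA, S / OMEGA],
     [-S * OMEGA, S * OMEGA]]         # the unitary matrix W of (A29)


def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(p):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]


def su11_element(theta, z):
    """u = [[α, β], [β*, α*]] with α, β parameterized as in (A20)."""
    norm = math.sqrt(1.0 - abs(z) ** 2)
    alpha = cmath.exp(1j * theta) / norm
    beta = z * cmath.exp(-1j * theta) / norm
    return [[alpha, beta], [beta.conjugate(), alpha.conjugate()]]


def g_of_u(u):
    """Real matrix g(u) of (A30)."""
    alpha, beta = u[0][0], u[0][1]
    return [[(alpha + beta).real, -(alpha - beta).imag],
            [(alpha + beta).imag, (alpha - beta).real]]


def u_of_g(g):
    """Pseudo-unitary matrix u(g) of (A31)."""
    (a, b), (c, d) = g
    alpha = 0.5 * complex(a + d, -(b - c))
    beta = 0.5 * complex(a - d, b + c)
    return [[alpha, beta], [beta.conjugate(), alpha.conjugate()]]
```

Since W is unitary, its inverse is its conjugate transpose, so `matmul(matmul(W, u), dagger(W))` should reproduce `g_of_u(u)` up to rounding, with determinant 1.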

We can also make a link with

S O (2, 1)

of “1 + 2” pseudo-orthogonal matrices:

S O (2, 1) = {Γ \in G L (3, R) / \det (Γ) = 1, Γ K Γ^{T} = K, K = (\begin{matrix} + 1 & 0 & 0 \\ 0 & - 1 & 0 \\ 0 & 0 & - 1 \end{matrix})}

(A32)

Γ (g) = (\begin{matrix} \frac{1}{2} (a^{2} + b^{2} + c^{2} + d^{2}) & \frac{1}{2} (a^{2} - b^{2} + c^{2} - d^{2}) & - c d - a b \\ \frac{1}{2} (a^{2} + b^{2} - c^{2} - d^{2}) & \frac{1}{2} (a^{2} - b^{2} - c^{2} + d^{2}) & c d - a b \\ - b d - a c & b d - a c & a d + b c \end{matrix})

(A33)

with

Γ (g_{1}) Γ (g_{2}) = Γ (g_{1} g_{2}), Γ (I) = I, Γ (g^{- 1}) = Γ {(g)}^{- 1}

The

S O (2, 1)

matrix corresponding to any element u of

S U (1, 1)

is:

Γ (u) = (\begin{matrix} {| α |}^{2} + {| β |}^{2} & 2 Re α β^{*} & 2 Im α β^{*} \\ 2 Re α β & Re (α^{2} + β^{2}) & Im (α^{2} - β^{2}) \\ - 2 Im α β & - Im (α^{2} + β^{2}) & Re (α^{2} - β^{2}) \end{matrix})

(A34)
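The matrix Γ(u) of (A34) can be checked against the pseudo-orthogonality condition Γ K Γ^T = K with K = diag(+1, -1, -1), together with det Γ = 1. The following sketch does this for one sample pair (α, β) satisfying |α|^2 - |β|^2 = 1. The helper names and the sample values are illustrative assumptions.

```python
import cmath
import math

K = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]


def gamma_of_u(alpha, beta):
    """3x3 real matrix Γ(u) of (A34) for u = [[α, β], [β*, α*]]."""
    ab = alpha * beta
    ab_conj = alpha * beta.conjugate()
    plus = alpha ** 2 + beta ** 2
    minus = alpha ** 2 - beta ** 2
    return [[abs(alpha) ** 2 + abs(beta) ** 2, 2 * ab_conj.real, 2 * ab_conj.imag],
            [2 * ab.real, plus.real, minus.imag],
            [-2 * ab.imag, -plus.imag, minus.real]]


def matmul3(p, q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(p):
    """Transpose of a 3x3 matrix."""
    return [[p[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def sample_gamma():
    """Γ(u) for a sample element with |α| = cosh t and |β| = sinh t."""
    t = 0.7
    alpha = math.cosh(t) * cmath.exp(0.3j)
    beta = math.sinh(t) * cmath.exp(-1.1j)
    return gamma_of_u(alpha, beta)
```

With `G = sample_gamma()`, the product `matmul3(matmul3(G, K), transpose3(G))` should reproduce K up to rounding.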

and

α = \pm \sqrt{\frac{1}{2} [(Γ_{11} + Γ_{22}) + i (Γ_{12} - Γ_{21})]}, β = \frac{1}{2 α} (Γ_{10} - i Γ_{20})

The connectivity properties of

S p (2, R)

are described through its isomorphism with

S U (1, 1)

. Using the unimodular condition:

{| α |}^{2} - {| β |}^{2} = 1 \Rightarrow α_{R}^{2} + α_{I}^{2} - β_{R}^{2} = 1 + β_{I}^{2} \geq 1 with α = α_{R} + i α_{I} and β = β_{R} + i β_{I}

If

β_{I}

is fixed,

(α_{R}, α_{I}, β_{R})

are constrained to lie on a one-sheeted hyperboloid of revolution, with its circular waist in the

α

plane.

To

S U (1, 1)

, we can associate the simply-connected universal covering group, using the maximal compact subgroup

U (1)

and corresponding to the Iwasawa decomposition (factorization of a noncompact semisimple group into its maximal compact subgroup times a solvable subgroup).

(\begin{matrix} α & β \\ β^{*} & α^{*} \end{matrix}) = (\begin{matrix} e^{i ω} & 0 \\ 0 & e^{- i ω} \end{matrix}) (\begin{matrix} λ & μ \\ μ^{*} & λ \end{matrix}) with {\begin{cases} ω = \arg α = \frac{1}{2} i \ln (α^{*} α^{- 1}) \\ λ = | α | > 0 \\ μ = e^{- i ω} β = \sqrt{\frac{α^{*}}{α}} β \end{cases}

(A35)

β = e^{i ω} μ, {| α |}^{2} - {| β |}^{2} = λ^{2} - {| μ |}^{2} = 1 so | μ | < λ

(A36)

Bargmann generalized this parameterization to

S p (2 N, R)

. For

S U (1, 1)

, Bargmann used the parameters

(ω, γ)

, which are more convenient but difficult to generalize to N dimensions:

γ = \frac{μ}{λ} = \frac{β}{α} (| γ | < 1), λ = \frac{1}{\sqrt{1 - {| γ |}^{2}}}, μ = \frac{γ}{\sqrt{1 - {| γ |}^{2}}}

(A37)

For

S L (2, R) = S p (2, R)

, the Bargmann parameterization is given by the decomposition of a non-singular matrix into the product of an orthogonal matrix and a positive definite symmetric matrix:

(\begin{matrix} a & b \\ c & d \end{matrix}) = (\begin{matrix} \cos ω & - \sin ω \\ \sin ω & \cos ω \end{matrix}) (\begin{matrix} λ + Re μ & Im μ \\ Im μ & λ - Re μ \end{matrix})

(A38)
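The factorization (A38) can be illustrated on a concrete matrix. In the sketch below, α and β are read off from g using (A31), the parameters (ω, λ, μ) follow (A35), and the product of the rotation and the symmetric factor is rebuilt as in (A38). The function names and the sample matrix are illustrative assumptions.

```python
import cmath
import math


def bargmann_parameters(g):
    """Return (ω, λ, μ) for a real 2x2 matrix g with det g = 1."""
    (a, b), (c, d) = g
    alpha = 0.5 * complex(a + d, -(b - c))   # α from (A31)
    beta = 0.5 * complex(a - d, b + c)       # β from (A31)
    omega = cmath.phase(alpha)               # ω = arg α
    lam = abs(alpha)                         # λ = |α| > 0
    mu = cmath.exp(-1j * omega) * beta       # μ = e^{-iω} β
    return omega, lam, mu


def from_bargmann(omega, lam, mu):
    """Rebuild g as (rotation by ω) times (symmetric positive definite factor)."""
    rot = [[math.cos(omega), -math.sin(omega)],
           [math.sin(omega), math.cos(omega)]]
    sym = [[lam + mu.real, mu.imag],
           [mu.imag, lam - mu.real]]
    return [[sum(rot[i][k] * sym[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gamma_parameter(lam, mu):
    """Bargmann's γ = μ / λ of (A37), which lies in the open unit disk."""
    return mu / lam
```

For g = [[2, 1], [3, 2]] one finds α = 2 + i and β = 2i, hence λ = √5 and |μ| = 2, and `from_bargmann(*bargmann_parameters(g))` should return g up to rounding.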

Conversely:

ω = \arg [(a + d) - i (b - c)], μ = \frac{1}{2} e^{- i ω} [(a - d) + i (b + c)]

ω

is counted modulo

2 π

,

ω \equiv ω (\mod 2 π)

.

S U (1, 1)

and

S L (2, R) = S p (2, R)

are described when

ω

is counted modulo

2 π

,

ω \equiv ω (\mod 2 π)

.

Valentine Bargmann has proposed the covering of the general symplectic group

S p (2 N, R)

:

S p (2 N, R) = {g = (\begin{matrix} A & B \\ C & D \end{matrix}) / g J_{2 N} g^{T} = J_{2 N}, J_{2 N}^{T} = - J_{2 N}, J_{2 N} = (\begin{matrix} 0 & I_{N} \\ - I_{N} & 0 \end{matrix})}

(A39)

with relations:

A B^{T} = B A^{T}, A C^{T} = C A^{T}, B D^{T} = D B^{T}, C D^{T} = D C^{T}, A D^{T} - B C^{T} = I_{N}

(A40)

g \in S p (2 N, R) \Rightarrow g^{- 1} = J_{2 N} g^{T} J_{2 N}^{- 1} = (\begin{matrix} D^{T} & - B^{T} \\ - C^{T} & A^{T} \end{matrix})

(A41)
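The block formula for the inverse in (A41) is easy to test with exact integer arithmetic. The sketch below, written for N = 2, builds a symplectic matrix as a product of two shear matrices with symmetric off-diagonal blocks (an illustrative choice, not taken from the paper), checks the defining relation of (A39), and verifies the block inverse. All helper names are hypothetical.

```python
def matmul(p, q):
    """Product of two matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(p):
    return [list(row) for row in zip(*p)]


def negate(p):
    return [[-x for x in row] for row in p]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zeros(n):
    return [[0] * n for _ in range(n)]


def block(a, b, c, d):
    """Assemble the 2N x 2N matrix [[a, b], [c, d]] from N x N blocks."""
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom


def split_blocks(g):
    """Return the N x N blocks (A, B, C, D) of a 2N x 2N matrix."""
    n = len(g) // 2
    return ([row[:n] for row in g[:n]], [row[n:] for row in g[:n]],
            [row[:n] for row in g[n:]], [row[n:] for row in g[n:]])


def symplectic_form(n):
    """J_{2N} = [[0, I_N], [-I_N, 0]]."""
    return block(zeros(n), identity(n), negate(identity(n)), zeros(n))


def is_symplectic(g):
    """Check g J g^T = J exactly."""
    j = symplectic_form(len(g) // 2)
    return matmul(matmul(g, j), transpose(g)) == j


def symplectic_inverse(g):
    """Block inverse [[D^T, -B^T], [-C^T, A^T]] of (A41)."""
    a, b, c, d = split_blocks(g)
    return block(transpose(d), negate(transpose(b)),
                 negate(transpose(c)), transpose(a))


def example_element():
    """Product of an upper and a lower shear with symmetric blocks."""
    s1 = [[1, 2], [2, 3]]
    s2 = [[0, 1], [1, -1]]
    upper = block(identity(2), s1, zeros(2), identity(2))
    lower = block(identity(2), zeros(2), s2, identity(2))
    return matmul(upper, lower)
```

Because all entries are integers, the comparison with the identity matrix is exact and no tolerance is needed.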

Bargmann has observed that although

S p (2 N, R)

is not isomorphic to any pseudo-unitary group, its inclusion in

U (N, N)

will display the connectivity properties through its unitary

U (N)

maximal compact subgroup, generalizing the role of

U (1) = S O (2)

in

S p (2, R)

.

W_{N} = W \otimes I_{N}

is a

2 N \times 2 N

matrix where

W = W_{1} = \frac{1}{\sqrt{2}} (\begin{matrix} ω_{π / 4}^{- 1} & ω_{π / 4}^{- 1} \\ - ω_{π / 4} & ω_{π / 4} \end{matrix})

with

ω_{π / 4} = e^{i π / 4} = \frac{1}{\sqrt{2}} (1 + i)

, which gives the

N \times N

block coefficients.

u (g) = W_{N}^{- 1} g W_{N} = \frac{1}{2} (\begin{matrix} [A + D] - i [B - C] & [A - D] + i [B + C] \\ [A - D] - i [B + C] & [A + D] + i [B - C] \end{matrix}) = (\begin{matrix} α & β \\ β^{*} & α^{*} \end{matrix})

(A42)

with

{\begin{cases} α α^{+} - β β^{+} = I_{N}, α^{+} α - β^{T} β^{*} = I_{N} \\ α β^{T} - β α^{T} = 0, α^{T} β^{*} - β^{+} α = 0 \end{cases}

(A43)

and

u^{- 1} = M_{2 N} u^{+} M_{2 N}^{- 1} = (\begin{matrix} α^{+} & - β^{T} \\ - β^{+} & α^{T} \end{matrix})

(A44)

The symplecticity property of

g

becomes:

u M_{2 N} u^{+} = M_{2 N}, M_{2 N} = i W_{N}^{- 1} J_{2 N} W_{N} = (\begin{matrix} I_{N} & 0 \\ 0 & - I_{N} \end{matrix})

(A45)

(\begin{matrix} A & B \\ C & D \end{matrix}) = g (u) = W_{N} u W_{N}^{- 1} = (\begin{matrix} Re (α + β) & - Im (α - β) \\ Im (α + β) & Re (α - β) \end{matrix})

(A46)

Valentine Bargmann has extended the well-known theorem that any real matrix

R

may be decomposed into the product of an orthogonal

Q

and a symmetric positive definite matrix

S

, uniquely as

R = Q S

. Bargmann has shown that if

R \in S p (2 N, R)

, then

R = Q S

with

Q, S \in S p (2 N, R)

where

Q

maps onto a unitary matrix and

S

maps onto a Hermitian positive definite matrix:

u (Q) = (\begin{matrix} α & 0 \\ 0 & α^{*} \end{matrix}), α α^{+} = I_{N}, α \in U (N) and u (S) = \exp (\begin{matrix} 0 & ξ \\ ξ^{*} & 0 \end{matrix}), ξ = ξ^{T}

(A47)

We can generalize Bargmann parameterization of

S U (1, 1)

to

S p (2 N, R)

:

u {ω, λ, μ} = (\begin{matrix} e^{i ω} I_{N} & 0 \\ 0 & e^{- i ω} I_{N} \end{matrix}) (\begin{matrix} λ & μ \\ μ^{*} & λ^{*} \end{matrix}), \det λ > 0

(A48)

Then the Bargmann parameters are:

ω = \frac{1}{N} \arg \det α, λ = e^{- i ω} α, μ = e^{- i ω} β, e^{i N ω} = \frac{\det α}{| \det α |}, \det λ = | \det α | > 0

(A49)

The

S p (2 N, R)

matrices in terms of the Bargmann parameters are:

g {ω, λ, μ} = (\begin{matrix} \cos ω I_{N} & - \sin ω I_{N} \\ \sin ω I_{N} & \cos ω I_{N} \end{matrix}) (\begin{matrix} Re (λ + μ) & - Im (λ - μ) \\ Im (λ + μ) & Re (λ - μ) \end{matrix})

(A50)

Appendix C. Shirokov Method to Build Casimir Functions

I. V. Shirokov [71,72,73,74,75] has proposed a method for constructing invariants of the coadjoint representation of Lie groups of arbitrary dimension and structure, based on local symplectic coordinates on the coadjoint orbits. This work also makes a link with the construction of the Lie algebra polarization. Let

B_{λ} (X, Y)

be the skew-symmetric form on

g

defined by:

\begin{array}{l} B_{λ} (X, Y) = 〈 λ, [X, Y] 〉, X, Y \in g, λ \in g^{*} \\ with B_{λ} (X, Y) = B_{i j} X^{i} Y^{j}, B_{i j} = C_{i j}^{k} λ_{k} \equiv C_{i j} (λ), i, j, k = 1, \dots, \dim g \end{array}

(A57)

Ker B_{λ} = {X \in g / B_{λ} (X, g) = 0} \equiv g^{λ} annihilator of λ \in g^{*}

(A58)
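To make (A57) and (A58) concrete, the sketch below takes a three-dimensional example: a Lie algebra with basis (e_0, e_1, e_2) and brackets [e_1, e_2] = -e_0, [e_2, e_0] = e_1, [e_0, e_1] = e_2. This choice of basis and signs is an assumption made for illustration (it is one presentation of so(2,1)) and is not taken from the paper. The code builds B_{ij} = C_{ij}^k λ_k, computes its rank exactly, and checks that the vector (-λ_0, λ_1, λ_2) spans the annihilator of λ.

```python
from fractions import Fraction

DIM = 3

# Assumed structure constants C[i][j][k] with [e_i, e_j] = sum_k C[i][j][k] e_k:
# [e1, e2] = -e0, [e2, e0] = e1, [e0, e1] = e2 (illustrative so(2,1) basis).
C = [[[0] * DIM for _ in range(DIM)] for _ in range(DIM)]
for i, j, k, value in [(1, 2, 0, -1), (2, 0, 1, 1), (0, 1, 2, 1)]:
    C[i][j][k] = value
    C[j][i][k] = -value   # antisymmetry of the bracket


def b_matrix(lam):
    """Matrix B_ij = C_ij^k λ_k of the skew-symmetric form B_λ in (A57)."""
    return [[sum(C[i][j][k] * lam[k] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def rank(matrix):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def annihilator_vector(lam):
    """Candidate generator of Ker B_λ for the assumed brackets."""
    l0, l1, l2 = lam
    return [-l0, l1, l2]


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]
```

For a nonzero covector the rank is 2, so the annihilator is one-dimensional and a polarization in the sense of (A59) has dimension (3 + 1) / 2 = 2.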

The polarization

𝖍 \equiv 𝕻 (g)

of the covector

λ

is given by:

B_{λ} (𝖍, 𝖍) = 0, \dim 𝖍 = \frac{1}{2} (\dim g + \dim g^{λ})

(A59)

For a reductive Lie algebra

g

, with

g = C \oplus B

, where

C

is the center of

g

and

B

is a semisimple subalgebra, the construction of a polarization reduces to that of the semisimple part:

𝖍 \equiv 𝕻 (C) \oplus 𝕻 (B) = C \oplus 𝕻 (B)

(A60)

Let

g

be an arbitrary Lie algebra,

R

its radical and

R^{⊥} = {X \in g / B_{λ} (X, R) = 0}

the orthogonal complement of

R

in

g

with respect to the form

B_{λ} (X, Y)

, then the polarization is given by:

𝕻 (g) \equiv 𝕻 (R) \oplus 𝕻 (R^{⊥}) = ℜ (R) \oplus 𝕻 (R^{⊥})

(A61)

The idea behind the method of Oleg L. Kurnyavko and Igor V. Shirokov for constructing coadjoint invariants is to build the canonical transition to Darboux coordinates on the orbits of maximal dimension of the dual Lie algebra

g^{*}

, dual to the Lie algebra

g

of the Lie group

G

.

Consider a Lie group

G

, its Lie algebra

g

and its dual Lie algebra

g^{*}

, and let each element

f \in g^{*}

correspond to the coordinates

(f_{1}, f_{2}, \dots, f_{n})

where

n = \dim g^{*}

. The action of the group

G

on

g^{*}

splits the dual Lie algebra into orbits

O_{λ}

where

λ \in g^{*}

is some element of the given orbit. By Souriau's theorem, each of these orbits is a symplectic manifold and, by Darboux's theorem, there exist special coordinates

(p, q)

on them such that the corresponding symplectic form can be written:

ω_{λ} = d p_{i} \land d q^{i}, i = 1, \dots, \frac{1}{2} \dim O_{λ}

(A62)

The transition to canonical coordinates is determined by the functions:

f_{j} = f_{j} (q, p, λ), j = 1, 2, \dots, \dim g^{*}, with f_{j} (0, 0, λ) = λ_{j}

(A63)

Eliminating the variables q and p in these transition functions, we deduce the equation of the orbit:

ψ_{k} (f) = ω_{k}, k = 1, \dots, n - \dim O_{λ}

(A64)

We obtain

n - \dim O_{λ}

functions such that:

ψ_{k} (A d_{g}^{*} f) = ψ_{k} (f)

(A65)
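A small worked example may help to fix ideas on the passage from Darboux coordinates to invariants. The sketch below uses a three-dimensional algebra with Lie-Poisson relations {f_1, f_2} = -f_0, {f_2, f_0} = f_1, {f_0, f_1} = f_2, and a hypothetical chart (q, p) on the sheet f_0 ≥ j > 0 of the orbit f_0^2 - f_1^2 - f_2^2 = j^2. The relations, the chart and the bracket sign convention {F, G} = ∂_q F ∂_p G - ∂_p F ∂_q G are illustrative assumptions, and this chart is not the linear transition constructed by Shirokov. Eliminating (q, p) leaves the single function ψ(f) = f_1^2 + f_2^2 - f_0^2, constant on the orbit.

```python
import math


def orbit_point(q, p, j):
    """Hypothetical Darboux chart: f0 = p, f1 = s cos q, f2 = s sin q, s = sqrt(p^2 - j^2)."""
    s = math.sqrt(p * p - j * j)
    return (p, s * math.cos(q), s * math.sin(q))


def casimir(f):
    """ψ(f) = f1^2 + f2^2 - f0^2, obtained by eliminating (q, p)."""
    f0, f1, f2 = f
    return f1 * f1 + f2 * f2 - f0 * f0


def partial(func, q, p, wrt, eps=1e-6):
    """Central finite difference of func(q, p) with respect to 'q' or 'p'."""
    if wrt == "q":
        return (func(q + eps, p) - func(q - eps, p)) / (2 * eps)
    return (func(q, p + eps) - func(q, p - eps)) / (2 * eps)


def bracket(F, G, q, p):
    """Canonical bracket {F, G} = dF/dq dG/dp - dF/dp dG/dq (assumed convention)."""
    return (partial(F, q, p, "q") * partial(G, q, p, "p")
            - partial(F, q, p, "p") * partial(G, q, p, "q"))


def coordinate(k, j):
    """Return the function (q, p) -> f_k(q, p, j)."""
    return lambda q, p: orbit_point(q, p, j)[k]
```

With j = 1, the value `casimir(orbit_point(q, p, 1.0))` equals -1 for every admissible (q, p), and the finite-difference brackets of the coordinate functions reproduce the assumed Lie-Poisson relations up to discretization error.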

Each nondegenerate orbit is determined by the set of numbers

ω = {ω_{1}, ω_{2}, \dots, ω_{\operatorname{ind} g}}

, that is fixed by the choice of the covector

λ

such that rank

C_{i j} (λ)

is maximal, and then:

ω_{k} = ψ_{k} (λ_{1}) = ψ_{k} (λ_{2}), if λ_{1}, λ_{2} \in O_{λ}

(A66)

We can then consider a family of covectors

λ (j)

parameterizing the orbits

O_{λ}

with

j = (j_{1}, j_{2}, \dots, j_{\operatorname{ind} g})

, with the constraint:

ψ_{k} (λ (j)) = ω_{k} (j), \det \frac{\partial ω_{k}}{\partial j} \neq 0

(A67)

We set

λ_{i} (j) = a_{i}^{l} j_{l}

and denote by

X_{k}^{i}

the components of the vectors that form the basis of the annihilator of the covector

λ (j)

such that:

\det \frac{\partial ω_{k} (j)}{\partial j_{k}} = \det \frac{\partial ψ_{k}}{\partial f_{i}} {\frac{\partial f_{i} (j)}{\partial j_{k}} |}_{f = λ (j)} = \det X_{k}^{i} \frac{\partial f_{i} (j)}{\partial j_{k}} = \det X_{k}^{i} (λ (j)) a_{i}^{k} \neq 0

(A68)

Finally, we obtain the parametrization in Darboux coordinates:

f_{i} = f_{i} (q, p, j), i = 1, \dots, \dim g^{*}

(A69)

By eliminating the Darboux coordinates

(q, p)

from the previous equation, we obtain new relations:

φ_{k} (f, j) = 0 and j_{k} = j_{k} (f), k = 1, \dots, \operatorname{ind} g

(A70)

These relations provide invariants of the coadjoint representation of the Lie group

G

.

Igor V. Shirokov has proved that the desired linear transition to the Darboux coordinates on a nondegenerate orbit

O_{λ}

is defined by:

f_{i} (p, q) = ξ_{i}^{a} (q) p_{a} + ξ_{i}^{A} (0, q) λ_{A} where g = {l_{A}, l_{a}} with l_{A} polarization basis of g

(A71)

References

Duhem, P. L’intégrale des forces vives en thermodynamique. JMPA 1898, 4, 5–19.
Duhem, P. Sur l’équation des forces vives en thermodynamique et les relations de la thermodynamique avec la mécanique classique 23 December 1897. In Procès-verbaux des Séances de la Société des Sciences Physiques et Naturelles de Bordeaux; PVSScPhNB (1897-98): Bordeaux, France, 1898; pp. 23–27.
Duhem, P. Sur deux Inégalités Fondamentales de la Thermodynamique, CR 156; Gauthier-Villars: Paris, France, 1913; pp. 421–425.
Duhem, P. Traité d’Énergétique ou Thermodynamique Générale. Tome Conservation de l’Énergie. Mécanique Rationnelle. Statique Générale. Déplacement de l’Équilibre; Tome II. Dynamique Générale. Conductibilité de la Chaleur. Stabilité de l’Équilibre; Gauthier-Villars: Paris, France, 1911; pp. 504–528.
Ollivier, Y.; Marceau-Caron, G. Practical Riemannian Neural Networks. arXiv 2016, arXiv:1602.08007.
Ollivier, Y. Riemannian metrics for neural networks I: Feedforward networks. Inf. Inference J. IMA 2015, 4, 108–153.
Marceau-Caron, G.; Ollivier, Y.; Nielsen, F.; Barbaresco, F. Natural Langevin Dynamics for Neural Networks. In Computer Vision; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2017; Volume 10589, pp. 451–459.
Ollivier, Y. True Asymptotic Natural Gradient Optimization. arXiv 2017, arXiv:1712.08449.
Amari, S.; Karakida, R.; Oizumi, M. Fisher information and natural gradient learning in random deep networks. In Proceedings of the AISTATS 2019, Okinawa, Japan, 16–18 April 2019.
Zhang, G.; Martens, J.; Grosse, R. Fast convergence of natural gradient descent for over parametrized neural networks. arXiv 2019, arXiv:1905.10961v1.
Amari, S.-I. Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective. arXiv preprint 2020, arXiv:2001.06931.
Kirillov, A.A. Elements of the Theory of Representations; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 1976; Volume 220.
Kosmann-Schwarzbach, Y. Groupes et Symétries Groupes Finis, Groupes et Algèbres de Lie, Representations; Ecole Polytechnique: Palaiseau, France, 2006.
Kosmann-Schwarzbach, Y. Siméon-Denis Poisson, Les Mathématiques au Service de la Science; Ecole Polytechnique: Palaiseau, France, 2013.
Laurent-Gengoux, C.; Pichereau, A.; Vanhaecke, P. Poisson Structures, Grundlehren der Mathematischen Wissenschaften; Springer: Berlin/Heidelberg, Germany, 2013.
Cartier, P. Groupoïdes de Lie et leurs algébroïdes. Séminaire BOURBAKI. ASTÉRISQUE. 2009, 326. Available online: http://preprints.ihes.fr/2008/M/M-08-20.pdf (accessed on 6 June 2020).
Barbaresco, F.; Gay-Balmaz, F. Lie Group Cohomology and (Multi)Symplectic Integrators: New Geometric Tools for Lie Group Machine Learning Based on Souriau Geometric Statistical Mechanics. Entropy 2020, 22, 498.
HYPERBOLIC DEEP LEARNING: A Nascent and Promising Field. Available online: http://hyperbolicdeeplearning.com/ (accessed on 2 June 2020).
Fakhri, H. Su(1, 1)-Barut–Girardello coherent states for Landau levels. J. Phys. A Math. Gen. 2004, 37, 5203–5210.
Gazeau, J.-P. Coherent States in Quantum Optics: An Oriented Overview. arXiv preprint 2018, arXiv:1810.06473.
Citti, G.; Sarti, A. A Cortical Based Model of Perceptual Completion in the Roto-Translation Space. J. Math. Imaging Vis. 2006, 24, 307–326.
Miolane, N. Geometric Statistics for Computational Anatomy. Ph.D. Thesis, INRIA Sophia Antipolis, Valbonne, France, Stanford University, Stanford, CA, USA, 2016.
Verblunsky, S. On Positive Harmonic Functions: A Contribution to the Algebra of Fourier Series. Proc. Lond. Math. Soc. 1935, 38, 125–157.
Verblunsky, S. On positive harmonic functions (second paper). Proc. Lond. Math. Soc. 1936, 40, 290–320.
Trench, W.F. An Algorithm for the Inversion of Finite Toeplitz Matrices. J. Soc. Ind. Appl. Math. 1964, 12, 515–522.
Barfoot, T.D.; Furgale, P. Associating Uncertainty with Three-Dimensional Poses for Use in Estimation Problems. IEEE Trans. Robot. 2014, 30, 679–693.
Barrau, A.; Bonnabel, S. Intrinsic Filtering on Lie Groups With Applications to Attitude Estimation. IEEE Trans. Autom. Control 2014, 60, 436–449.
Chirikjian, G.S. Stochastic Models, Information Theory, and Lie Groups, Volume 1: Classical Results and Geometric Methods. Applied and Numerical Harmonic Analysis; Birkhäuser: Basel, Switzerland, 2009.
Chirikjian, G.S. Stochastic Models, Information Theory, and Lie Groups, Volume 2: Analytic Methods and Modern Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
Gromov, M. In a Search for a Structure, Part 1: On Entropy. In Proceedings of the European Congress of Mathematics, Kraków, Poland, 2–7 July 2012; pp. 51–78.
Gromov, M. Six Lectures on Probability, Symmetry, Linearity. Jussieu 2014, preprint.
Arnaudon, A.; De Castro, A.L.; Holm, D. Noise and Dissipation on Coadjoint Orbits. J. Nonlinear Sci. 2017, 28, 91–145.
Benenti, S.; Tulczyjew, W.M. Cocycles of the coadjoint representation of a Lie group interpreted as differential forms. Mem. Accad. Sci. Torino 1986, 10, 117–138.
Benenti, S.; Tulczyjew, W.M. A geometrical interpretation of the 1-cocycles of a Lie group. In Geometrodynamics; Prastaro, A., Ed.; World Scientific Publishing Co.: Singapore, 1985; pp. 3–24.
Nencka, H.; Streater, R.F. Information Geometry for Some Lie Algebras. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1999, 2, 441–460.
Souriau, J.-M. Mécanique statistique, groupes de Lie et cosmologie. In Géométrie symplectique et physique mathématique; Éditions du C.N.R.S: Aix-en-Provence, France, 1974; pp. 59–113.
Davis, M.S. (Under the Direction of Francois Ziegler); Homogeneous Symplectic Manifolds of the Galilei Group; Georgia Southern University: Statesboro, GA, USA, 2012.
Marle, C.M. Géométrie Symplectique et Géométrie de Poisson; Calvage & Mounet: Paris, France, 2018.
Vandebogert, K. Notes on Symplectic Geometry; University of South Carolina: Columbia, SC, USA, 3 September 2017.
Cartier, P. Some Fundamental Techniques in the Theory of Integrable Systems, IHES/M/94/23, SW9421. 1994. Available online: https://cds.cern.ch/record/263222/files/P00023319.pdf (accessed on 31 May 2020).
Bismut, J.-M.; Labourie, F. Symplectic geometry and the Verlinde Formulas. Surv. Differ. Geom. 1999, 5, 97–311.
Koszul, J.-L.; Zou, Y.M. Introduction to Symplectic Geometry; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019.
Knapp, A.W. Representation Theory of Semisimple Groups; Walter de Gruyter GmbH: Berlin, Germany, 1986.
Frenkel, I.B. Orbital theory for affine Lie algebras. Invent. Math. 1984, 77, 301–352.
Libine, M. Introduction to Representations of Real Semisimple Lie Groups. arXiv 2014, arXiv:1212.2578v2.
Clerc, J.-L.; Ørsted, B. The Maslov index revisited. Transform. Groups 2001, 6, 303–320.
Foth, P.; Lamb, M. The Poisson geometry of SU(1,1). J. Math. Phys. 2010, 51, 092701.
Perelomov, A.M. Coherent states for arbitrary Lie group. Commun. Math. Phys. 1972, 26, 222–236.
Hashimoto, T.; Ogura, K.; Okamoto, K.; Sawae, R.; Yasunaga, Y. Kirillov-Kostant theory and Feynman path integrals on coadjoint orbits I. Hokkaido Math. J. 1991, 20, 353–405.
Hashimoto, T.; Ogura, K.; Okamoto, K.; Sawae, R. Kirillov-Kostant theory and Feynman path integrals on coadjoint orbits of SU(2) and SU(1, 1). Int. J. Mod Phys. 1992, A7 (Suppl. 1A), 377–390.
Hashimoto, T.; Ogura, K.; Okamoto, K.; Sawae, R. Borel-Weil theory and Feynman path integrals on flag manifolds. Hiroshima Math. J. 1993, 23, 231–247.
Hashimoto, T. Kirillov-Kostant theory and Feynman path integrals on coadjoint orbits of a certain real semisimple Lie group. Hiroshima Math. J. 1993, 23, 607–627.
Cishahayo, C.; De Bièvre, S. On the contraction of the discrete series of SU(1,1). Ann. Inst. Fourier 1993, 43, 551–567.
Cahen, B. Contraction de SU(1,1) vers le Groupe de Heisenberg. Travaux Mathématiques 2004, XV, 19–43.
Cahen, M.; Gutt, S.; Rawnsley, J. Quantization on Kähler manifolds I, Geometric interpretation of Berezin quantization. J. Geom. Phys. 1990, 7, 45–62.
Marle, C.-M. Projection Stéréographique et Moments, Hal-02157930, Version 1. June 2019. Available online: https://hal.archives-ouvertes.fr/hal-02157930/ (accessed on 31 May 2020).
Guichardet, A. La méthode des orbites: Historiques, principes, résultats. In Leçons de Mathématiques d’Aujourd’hui; Cassini: Paris, France, 2010; Volume 4, pp. 33–59.
Vergne, M. Representations of Lie Groups and the Orbit Method. In Emmy Noether in Bryn Mawr; Springer: Berlin/Heidelberg, Germany, 1983; pp. 59–101.
Duflo, M.; Heckman, G.; Vergne, M. Projection d’orbites, formule de Kirillov et formule de Blattner. Mémoires Société Mathématique Fr. 1984, 1, 65–128.
Pukanszky, L. The Plancherel formula for the universal covering group of SL(R, 2). Math. Ann. 1964, 156, 96–143.
Pukanszky, L. Leçons sur les Représentations des Groupes, Monographies de la Société Mathématique de France; Dunod: Paris, France, 1967.
Bernat, P. Représentations des Groupes de Lie, Monographie de la Société Mathématique de France; Dunod: Paris, France, 1972.
Dixmier, J. Enveloping Algebras; American Mathematical Society: Providence, RI, USA, 1996.
Duflo, M. Construction De Representations Unitaires D’un Groupe De Lie. In Harmonic Analysis and Group Representation; Talamanca, A.F., Ed.; C.I.M.E. Summer Schools 1980; Springer: Berlin/Heidelberg, Germany, 2010; Volume 82.
Guichardet, A. Théorie de Mackey et méthode des orbites selon M. Duflo. Expo. Math. 1985, 3, 303–346.
Mneimné, R.; Testard, F. Groupes de Lie Classiques; Hermann: Paris, France, 1985.
Rais, M. Orbites Coadjointes et Représentations des Groupes, Cours; C.I.M.P.A.: Blagnac/Toulouse, France, 1980.
Rais, M. La représentation coadjointe du groupe affine. Ann. Inst. Fourier 1978, 28, 207–237.
De Micheli, E. On the Connection between Spherical Laplace Transform and Non-Euclidean Fourier Analysis. Mathematics 2020, 8, 287.
Marsden, J.E.; Misiolek, G.; Ortega, J.-P.; Perlmutter, M.; Ratiu, T.S. Hamiltonian Reduction by Stages; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2007.
Kurnyavko, O.L.; Shirokov, I.V. Algebraic method for construction of infinitesimal invariants of Lie groups representations. arXiv 2017, arXiv:1710.07977.
Kurnyavko, O.L.; Shirokov, I.V. Construction of invariants of the coadjoint representation of Lie groups using linear algebra methods. Theor. Math. Phys. 2016, 188, 965–979.
Shirokov, I.V. Differential invariants of the transformation group of a homogeneous space. Sib. Math. J. 2007, 48, 1127–1140.
Goncharovskii, M.M.; Shirokov, I.V. Differential invariants and operators of invariant differentiation of the projectable action of Lie groups. Theor. Math. Phys. 2015, 183, 619–636.
Shirokov, I.V. Darboux coordinates on K-orbits and the spectra of Casimir operators on Lie groups. Theor. Math. Phys. 2000, 123, 754–767.
Holm, D. Variational principles for stochastic fluid dynamics. Proc. R. Soc. A Math. Phys. Eng. Sci. 2015, 471, 20140963.
Gay-Balmaz, F.; Holm, D. Selective decay by Casimir dissipation in inviscid fluids. Nonlinearity 2013, 26, 495–524.
Gay-Balmaz, F.; Holm, D. A geometric theory of selective decay with applications in MHD. Nonlinearity 2014, 27, 1747–1777.
Casimir, H.G.B. Über die konstruktion einer zu den irreduziblen darstellungen halbeinfacher kontinuierlicher gruppen gehörigen differentialgleichung. Proc. R. Soc. Amst. 1931, 34, 844–846.
Racah, G. Sulla caratterizzazione delle rappresentazioni irriducibili dei gruppi semisemplici di Lie. Atti Accad. Naz. Lincei. Rend. Cl. Sci. Fis. Mat. Nat. 1950, 8, 108–112.
Olver, P.J. Moving frames and differential invariants in centro-affine geometry. Lobachevskii J. Math. 2010, 31, 77–89.
Berezin, D.V. Invariants of the co-adjoint representation for Lie algebras of a special form. Russ. Math. Surv. 1996, 51, 137–139.
Abellanas, L. A general setting for Casimir invariants. J. Math. Phys. 1975, 16, 1580.
Beltrametti, E.; Blasi, A. On the number of Casimir operators associated with any lie group. Phys. Lett. 1966, 20, 62–64.
Pecina-Cruz, J.N. An algorithm to calculate the invariants of any Lie algebra. J. Math. Phys. 1994, 35, 3146–3162.
Dixmier, J. Algèbres enveloppantes (Cahiers Scientifiques. No. 37); Gauthier-Villars: Paris, France, 1974.
Mikheyev, V.V.; Shirokov, I.V. Application of coadjoint orbits in the thermodynamics of non-compact manifolds. Electron. J. Theor. Phys. 2005, 2, 1–10.
Mikheyev, V.; Nielsen, F.; Barbaresco, F. Method of Orbits of Co-Associated Representation in Thermodynamics of the Lie Non-compact Groups. In Applications of Evolutionary Computation; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2017; Volume 10589, pp. 425–431.
Fomenko, A.T.; Trofimov, V.V. Integrable Systems on Lie Algebras and Symmetric Spaces; Gordon and Breach Science Publishers: New York, NY, USA, 1988.
Trofimov, V.V. Introduction to Geometry of Manifolds with Symmetry; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 1994.
Machon, T. The Godbillon–Vey invariant as a restricted Casimir of three-dimensional ideal fluids. J. Phys. A Math. Theor. 2020, 53, 235701.
Casimir, H.B.G. On Onsager’s Principle of Microscopic Reversibility. Rev. Mod. Phys. 1945, 17, 343.
Thiffeault, J.-L.; Morrison, P. Classification and Casimir invariants of Lie–Poisson brackets. Phys. D Nonlinear Phenom. 2000, 136, 205–244.
Arnaudon, A.; De Castro, A.L.; Holm, D.D. Noise and dissipation in rigid body motion. arXiv 2016, arXiv:1606.06308v1.
Bargmann, V. Irreducible Unitary Representations of the Lorentz Group. Ann. Math. 1947, 48, 568.
Souriau, J.-M. Mécanique Statistique, Groupes de Lie et Cosmologie. 2020. Available online: https://www.academia.edu/42630654/Statistical_Mechanics_Lie_Group_and_Cosmology_1_st_part_Symplectic_Model_of_Statistical_Mechanics (accessed on 20 April 2020).
Souriau, J.-M. Structure des Systèmes Dynamiques; Dunod: Paris, France, 1969.
Souriau, J.M. Mécanique Classique et Géométrie Symplectique, Rapport CNRS CPT-84/PE.1695; Université de Provence et Centre de Physique Théorique CNRS: Marseille, France, 1984.
Souriau, J.M. Equations Canoniques et Géométrie Symplectique. Pub. Sci. Univ. Alger. Sér. A 1954, 1, 239–265.
Souriau, J.M. Géométrie de l’Espace des Phases, Calcul des Variations et Mécanique Quantique, Tirage Ronéotypé; Faculté des Sciences: Marseille, France, 1965.
Souriau, J.-M. Realisations d’algèbres de Lie au moyen de variables dynamiques. Il Nuovo Cim. A 1967, 49, 197–198.
Marle, C.-M. From Tools in Symplectic and Poisson Geometry to J.-M. Souriau’s Theories of Statistical Mechanics and Thermodynamics. Entropy 2016, 18, 370.
Barbaresco, F. Higher Order Geometric Theory of Information and Heat Based on Poly-Symplectic Geometry of Souriau Lie Groups Thermodynamics and Their Contextures: The Bedrock for Lie Group Machine Learning. Entropy 2018, 20, 840. [Google Scholar] [CrossRef]
Barbaresco, F. Souriau Exponential Map Algorithm for Machine Learning on Matrix Lie Groups; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019; pp. 85–95. [Google Scholar]
Barbaresco, F. Geometric Theory of Heat from Souriau Lie Groups Thermodynamics and Koszul Hessian Geometry: Applications in Information Geometry for Exponential Families. Entropy 2016, 18, 386. [Google Scholar] [CrossRef]
Barbaresco, F. Lie Group Machine Learning and Gibbs Density on Poincaré Unit Disk from Souriau Lie Groups Thermodynamics and SU(1,1) Coadjoint Orbits. In Applications of Evolutionary Computation; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019; Volume 11712, pp. 157–170. [Google Scholar]
Barbaresco, F. Application exponentielle de matrice par l’extension de l’algorithme de Jean-Marie Souriau, utilisable pour le tir géodésique et l’apprentissage machine pour les groupes de Lie. In Proceedings of the Colloque GRETSI 2019, Lille, France, 26–29 August 2019. [Google Scholar]
Ishi, H.; Kolodziejek, B. Characterization of the Riesz Exponential Familly on Homogeneous Cones. arXiv 2018, arXiv:1605.03896. [Google Scholar]
Tojo, K.; Yoshino, T. On a Method to Construct Exponential Families by Representation Theory. arXiv 2018, arXiv:1811.01394.
Tojo, K.; Yoshino, T. On a Method to Construct Exponential Families by Representation Theory. In Applications of Evolutionary Computation; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019; pp. 147–156.
Tojo, K.; Yoshino, T. Harmonic exponential families on homogeneous spaces. Preprint, 2020.
Arnold, V. Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits. Ann. Inst. Fourier 1966, 16, 319–361.
Arnold, V.I.; Givental, A.B. Symplectic Geometry. In Dynamical Systems IV: Symplectic Geometry and Its Applications; Encyclopaedia of Mathematical Sciences; Arnol’d, V.I., Novikov, S.P., Eds.; Springer: Berlin, Germany, 1990; Volume 4, pp. 1–136.
Balian, R.; Alhassid, Y.; Reinhardt, H. Dissipation in many-body systems: A geometric approach based on information theory. Phys. Rep. 1986, 131, 1–146.
Balian, R.; Balazs, N. Equiprobability, inference, and entropy in quantum theory. Ann. Phys. 1987, 179, 97–144.
Balian, R. On the principles of quantum mechanics and the reduction of the wave packet. Am. J. Phys. 1989, 57, 1019.
Balian, R. From Microphysics to Macrophysics; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 1991; Volume 1–2.
Balian, R. Incomplete descriptions and relevant entropies. Am. J. Phys. 1999, 67, 1078–1090.
Balian, R.; Valentin, P. Hamiltonian structure of thermodynamics with gauge. Eur. Phys. J. B 2001, 21, 269–282.
Balian, R. Entropy, a Protean Concept. In Poincaré Seminar 2003; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2004; pp. 119–144.
Balian, R. Information in statistical physics. Stud. Hist. Philos. Sci. Part B Stud. Hist. Philos. Mod. Phys. 2005, 36, 323–353.
Balian, R. The Entropy-Based Quantum Metric. Entropy 2014, 16, 3878–3888.
Balian, R. François Massieu et les Potentiels Thermodynamiques, Évolution des Disciplines et Histoire des Découvertes; Académie des Sciences: Paris, France, 2015.
Berthet, Q.; Blondel, M.; Teboul, O.; Cuturi, M.; Vert, J.-P.; Bach, F. Learning with Differentiable Perturbed Optimizers. arXiv 2020, arXiv:2002.08676.
Blondel, M.; Martins, A.F.T.; Niculae, V. Learning with Fenchel-Young Losses. J. Mach. Learn. Res. 2020, 21, 1–69.
Wainwright, M.J.; Jordan, M.I. Graphical Models, Exponential Families, and Variational Inference. Found. Trends Mach. Learn. 2007, 1, 1–305.
Yahyai, M. Représentations Étoile du Revêtement Universel du Groupe Hyperbolique et Formule de Plancherel. Ph.D. Thesis, Université de Metz, Metz, France, 23 June 1995.
Bertrand, J.; Irac-Astaud, M. Characterization of su(1,1) coherent states in terms of affine wavelets. J. Phys. A Math. Gen. 2002, 35, 7347–7357.

Figure 1. Illustration of statistics of the Doppler spectrum as densities of points in the Poincaré polydisk, where the SU(1,1) Lie group acts transitively. Top: time-Doppler frequency spectrum of a drone. Bottom: Doppler spectrum fluctuation coded as densities in the Poincaré polydisk.

Figure 2. Illustration of a time series of local Frenet-Serret frames, where the SE(3) Lie group acts between two successive frames.

Figure 3. Fundamental Equation of Souriau Lie Groups Thermodynamics. Q is the geometric heat in the dual Lie algebra; β is the geometric temperature in the Lie algebra.

Figure 4. Souriau Model of Lie Groups Thermodynamics.

Figure 5. Souriau-Fisher metric as an extension of the KKS 2-form in the case of non-null cohomology.

Figure 6. Supervised/Non-Supervised Machine Learning based on Lie Groups Thermodynamics.

Figure 7. Jean-Marie Souriau as a student at ENS Paris, 1942.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
