Dynamical Systems over Lie Groups Associated with Statistical Transformation Models

Abstract: A statistical transformation model consists of a smooth data manifold, on which a Lie group smoothly acts, together with a family of probability density functions on the data manifold parametrized by elements in the Lie group. For such a statistical transformation model, the Fisher–Rao semi-definite metric and the Amari–Chentsov cubic tensor are defined on the Lie group. If the family of probability density functions is invariant with respect to the Lie group action, the Fisher–Rao semi-definite metric and the Amari–Chentsov tensor are left-invariant, and hence we have a left-invariant structure of a statistical manifold. In the present work, the general framework of statistical transformation models is explained. Then, the left-invariant geodesic flow associated with the Fisher–Rao metric is considered for two specific families of probability density functions on the Lie group. The corresponding Euler–Poincaré and Lie–Poisson equations are explicitly found in view of geometric mechanics. Related dynamical systems over Lie groups are also mentioned. A generalization in relation to the invariance of the family of probability density functions is further studied.


Introduction
The present paper deals with statistical transformation models from the viewpoint of dynamical systems theory and geometric mechanics.
In the information geometry of statistical inference on a smooth manifold of data, the Fisher-Rao (semi-definite) metric and the Amari-Chentsov cubic tensor are defined in relation to a family of probability density functions on the sample manifold with parameters in another smooth manifold. It is standard to study the α-connections and the α-geodesics induced by the Fisher-Rao metric and the Amari-Chentsov tensor on the parameter manifold, particularly when they give rise to a dually flat structure. See, e.g., [1][2][3] for general discussions in information geometry.
Statistical transformation models appear in the special case where the parameter manifold is a Lie group that also acts smoothly on the data manifold. See [2,4] for references on statistical transformation models. For such a statistical transformation model, the Fisher-Rao (semi-definite) metric and the Amari-Chentsov tensor are considered on the parameter Lie group, and they turn out to be left-invariant under suitable invariance conditions on the family of probability density functions. If the Fisher-Rao metric is definite and hence a Riemannian metric on the Lie group, the geodesic flow is formulated as a dynamical system on the (co)tangent bundle of the Lie group. More generally, even if the Fisher-Rao metric is not definite, the geodesic flow can be considered on the (co)tangent bundle of the quotient space of the Lie group by a subgroup under some additional conditions.
In the present paper, we describe the formulation of the geodesic flows associated with the Fisher-Rao (semi-definite) metric for statistical transformation models as dynamical systems over the Lie group of the parameters. A general account of statistical transformation models is briefly given. Then, two concrete examples, over the semi-direct product Lie group $\mathbb{R}_+ \ltimes \mathbb{R}^n$ and over arbitrary compact semi-simple Lie groups, respectively, are studied, and their geodesic flows are explicitly described using the techniques of geometric mechanics. By virtue of the symmetry of the geodesic flows, we show, as the main results of the present paper, that each flow is explicitly described as an Euler-Poincaré equation in the Lagrangian formalism or as a Lie-Poisson equation in the Hamiltonian formalism. Some of their properties are also discussed.
The relations between dynamical systems, particularly integrable systems, and information geometry have been discussed in the earlier works [5][6][7][8] by Nakamura as well as [9] by Fujiwara and Amari. In the authors' view, however, further studies can be carried out, and the present paper may shed further light in this direction.
The structure of the present paper is as follows: In Section 2, the general framework of statistical transformation models is discussed. The Fisher-Rao (semi-definite) metric and the Amari-Chentsov cubic tensor are defined when the family of probability density functions satisfies either the invariance or the relative invariance with respect to the Lie group action. The invariant families of probability density functions are already discussed in [10,11] from the viewpoint of dynamical systems theory, whereas the relatively invariant family has not been discussed. See [4] for general notions of invariant measures on the parameter spaces of the statistical transformation models.
In Section 3, two concrete examples of statistical transformation models are studied in detail. First, we consider the family of Gaussian probability density functions on $\mathbb{R}^n$, for which the pair of the standard deviation and the mean represents a point in the parameter Lie group $\mathbb{R}_+ \ltimes \mathbb{R}^n$. The Lie-Poisson equation for the geodesic flow is explicitly given, and it is further shown that the system is completely integrable in the sense of Liouville. Second, a family of probability density functions is introduced on arbitrary compact semi-simple Lie groups, motivated by the description of Toda lattice equations as double bracket equations in [12]. In this case, the geodesic flow can be described as an Euler-Poincaré equation on the Lie algebra or a Lie-Poisson equation on its dual space. The complete integrability of the system is still not known to the authors. The latter case is discussed in [10,11], but it is worthwhile mentioning it here to compare it with the former case.

General Framework of Statistical Transformation Models
In this section, we review the general framework of statistical transformation models. Fundamental references on this topic can be found, e.g., in [1][2][3]. See also the works by Chentsov and Morozova [13][14][15].

General Definitions of Fisher-Rao (Semi-Definite) Metric and Amari-Chentsov Cubic Tensor
Let M be a smooth manifold of data and $\mathrm{dvol}_M$ a suitable volume form on it. In statistical inference, we often consider a family of probability density functions ρ(u, x) on M, where x ∈ M, parametrized by the point u ∈ U in another smooth manifold U. For a random variable f : M → R, we denote its expectation with respect to ρ by

$$E[f] = \int_M f(x)\,\rho(u, x)\,\mathrm{dvol}_M(x),$$

which is a function in u ∈ U. Now, the Fisher-Rao (semi-definite) metric on U is defined as

$$g_{ij}(u) = E\left[\frac{\partial \log \rho}{\partial u^i}\,\frac{\partial \log \rho}{\partial u^j}\right], \quad i, j = 1, \dots, n,$$

where $(u^1, \dots, u^n)$ are local coordinates of U. In relation to the Fisher-Rao metric, we should point out the famous Cramér-Rao Theorem, which gives a lower bound on the variance-covariance matrix in terms of the Fisher-Rao metric, when it is positive-definite. See, e.g., [1,3].

Similarly, the Amari-Chentsov cubic tensor on U is defined as

$$T_{ijk}(u) = E\left[\frac{\partial \log \rho}{\partial u^i}\,\frac{\partial \log \rho}{\partial u^j}\,\frac{\partial \log \rho}{\partial u^k}\right], \quad i, j, k = 1, \dots, n,$$

where $(u^1, \dots, u^n)$ are local coordinates of U. From the Fisher-Rao metric and the Amari-Chentsov cubic tensor, we can define the α-connections. One of the most thoroughly investigated geometric structures related to the α-connections is the so-called dually flat structure. See, e.g., [1][2][3].
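As a minimal numerical sketch of the coordinate definition of the Fisher-Rao metric, one may take the one-dimensional Gaussian family with local coordinates u = (µ, σ) and approximate the expectation by a midpoint quadrature. The helper names `log_rho`, `score`, and `fisher_metric` are hypothetical, and the finite-difference step and integration window are assumptions chosen for illustration only.

```python
import math

def log_rho(u, x):
    """Log-density of the 1D Gaussian with coordinates u = (mu, sigma)."""
    mu, sigma = u
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def score(u, x, i, eps=1e-5):
    """Central finite-difference approximation of d(log rho)/d(u^i)."""
    up, um = list(u), list(u)
    up[i] += eps
    um[i] -= eps
    return (log_rho(up, x) - log_rho(um, x)) / (2 * eps)

def fisher_metric(u, half_width=12.0, steps=4000):
    """Approximate g_ij(u) = E[d_i log rho * d_j log rho] by a midpoint rule
    on the window [mu - half_width*sigma, mu + half_width*sigma]."""
    mu, sigma = u
    dx = 2 * half_width * sigma / steps
    dim = len(u)
    g = [[0.0] * dim for _ in range(dim)]
    for k in range(steps):
        x = mu - half_width * sigma + (k + 0.5) * dx
        weight = math.exp(log_rho(u, x)) * dx  # rho(u, x) dx
        s = [score(u, x, i) for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                g[i][j] += s[i] * s[j] * weight
    return g

# Sample parameter point (an arbitrary choice for this sketch).
g = fisher_metric((0.3, 1.5))
```

For this family the result should be close to the well-known matrix diag(1/σ², 2/σ²), which depends on the point u; this is the coordinate picture that the left-invariant formulation of the next subsection replaces by a constant bilinear form on the Lie algebra.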

Statistical Transformation Models
Now, we focus on the specific situation where the parameter space U is a Lie group that acts smoothly on the data manifold. We follow the arguments in [4] and [2] (§8.3), but the notation is adapted for later use.
As above, let M be a smooth manifold of data. We assume that a Lie group G smoothly acts on M and that the manifold M has a volume form $\mathrm{dvol}_M$ that is relatively invariant with respect to the Lie group action, i.e.,

$$g^{*}\,\mathrm{dvol}_M = \chi(g)\,\mathrm{dvol}_M, \quad g \in G.$$

Here, $\chi : G \to \mathbb{R}_+$ denotes a real one-dimensional representation of the Lie group G. We further take a family of probability density functions parametrized by the Lie group G:

$$\rho : G \times M \to \mathbb{R}, \quad (g, x) \mapsto \rho(g, x).$$

Since ρ is a probability density function, we have $\int_M \rho(g, x)\,\mathrm{dvol}_M(x) = 1$ for any g ∈ G.
In addition, we require the invariance or the relative invariance of the family of probability density functions with respect to the action of G, which are respectively written as

$$\rho(hg, hx) = \rho(g, x) \qquad (1)$$

and

$$\rho(hg, hx) = \chi(h)^{-1}\,\rho(g, x) \qquad (2)$$

for arbitrary g, h ∈ G and x ∈ M.

Remark 1.
In the previous papers [10,11] by the authors, the invariance (1) has appeared in a different manner. In fact, we should suppose the invariance as in (1) to obtain the left-invariant Fisher-Rao metric, rather than the one that appeared in [10,11]. It should be pointed out that we have g ρ(g, x), X ∈ g, (g, x) ∈ G × M, with $X_M$ and $X^{(R)}$ being, respectively, the fundamental vector field and the right-invariant vector field associated with X.
In [10], this formula was written as X M , which is not correct.
As in the previous subsection, we now define the Fisher-Rao (semi-definite) metric and the Amari-Chentsov cubic tensor. Using the invariance (1) and the relative invariance (2) of the family of probability density functions, these tensors turn out to be left-invariant.
The expectation of a random variable f : M → R, which is assumed to be a smooth function, with respect to the density function ρ, is given as

$$E[f] = \int_M f(x)\,\rho(g, x)\,\mathrm{dvol}_M(x).$$

We denote the Lie algebra of G by $\mathfrak{g}$. For an element $X \in \mathfrak{g}$, the corresponding left-invariant vector field on G is denoted by $X^{(L)}$. We can verify the following formulae by straightforward computations:

Definition 1. On the Lie algebra $\mathfrak{g}$, the Fisher-Rao bilinear form is the positive-semi-definite bilinear form

$$\langle X, Y \rangle = E\left[\left(X^{(L)} \log \rho\right)\left(Y^{(L)} \log \rho\right)\right], \quad X, Y \in \mathfrak{g}.$$

The associated left-invariant (0, 2)-tensor on G, denoted by the same symbol $\langle \cdot, \cdot \rangle$, is called the Fisher-Rao semi-definite metric.
Note that the Fisher-Rao bilinear form, as well as the Fisher-Rao semi-definite metric, is not necessarily positive-definite.
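The left-invariance can be illustrated numerically. The following sketch takes, as a test case, the one-dimensional Gaussian location-scale family studied in Section 3, with group law (σ, µ) * (σ', µ') = (σσ', σµ' + µ), and evaluates a left-invariant derivative as the derivative along right multiplication by a one-parameter subgroup. The function names, the finite-difference step, and the quadrature window are hypothetical choices made for this sketch.

```python
import math

def mul(g, h):
    """Group law of the semi-direct product R_+ x R (case n = 1)."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def log_rho(g, x):
    """Log-density of the 1D Gaussian with parameter g = (sigma, mu)."""
    sigma, mu = g
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def exp_X(t):
    """One-parameter subgroup generated by X = (1, 0)."""
    return (math.exp(t), 0.0)

def exp_Y(t):
    """One-parameter subgroup generated by Y = (0, 1)."""
    return (1.0, t)

def left_invariant_derivative(one_param, g, x, eps=1e-5):
    """Approximate (Z^(L) log rho)(g, x) = d/dt log rho(g * exp(tZ), x) at t = 0."""
    plus = log_rho(mul(g, one_param(eps)), x)
    minus = log_rho(mul(g, one_param(-eps)), x)
    return (plus - minus) / (2 * eps)

def fisher_form(g, half_width=12.0, steps=4000):
    """Matrix of E[(A^(L) log rho)(B^(L) log rho)] for A, B in {X, Y}."""
    sigma, mu = g
    dx = 2 * half_width * sigma / steps
    generators = (exp_X, exp_Y)
    form = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = mu - half_width * sigma + (k + 0.5) * dx
        weight = math.exp(log_rho(g, x)) * dx  # rho(g, x) dx
        d = [left_invariant_derivative(z, g, x) for z in generators]
        for i in range(2):
            for j in range(2):
                form[i][j] += d[i] * d[j] * weight
    return form

# Evaluate the form at three different group elements (arbitrary choices).
forms = [fisher_form(g) for g in [(1.0, 0.0), (0.5, 2.0), (3.0, -1.0)]]
```

Under these assumptions, each matrix in `forms` should be close to diag(2, 1) regardless of the group element, in contrast to the coordinate expression of the metric, which varies with the point.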
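Similarly, the cubic multi-linear form

$$C(X, Y, Z) = E\left[\left(X^{(L)} \log \rho\right)\left(Y^{(L)} \log \rho\right)\left(Z^{(L)} \log \rho\right)\right], \quad X, Y, Z \in \mathfrak{g},$$

can be shown to be independent of g ∈ G. If the Fisher-Rao semi-definite metric is positive-definite, we call it the Fisher-Rao Riemannian metric on G. Under this condition, the Levi-Civita connection ∇ on G is naturally defined. Then, the associated α-connection $\nabla^{(\alpha)}$ on G is defined through

$$\left\langle \nabla^{(\alpha)}_X Y, Z \right\rangle = \left\langle \nabla_X Y, Z \right\rangle - \frac{\alpha}{2}\, C(X, Y, Z),$$

where $X, Y, Z \in \mathfrak{g}$ are arbitrary. Note that the connections $\nabla^{(\alpha)}$ and ∇ are left-invariant.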
Recall that the Fisher-Rao tensor $\langle \cdot, \cdot \rangle$ is not necessarily positive-definite in general. Nevertheless, we can consider the induced Riemannian metric, which we also call the Fisher-Rao Riemannian metric, on the quotient manifold of G by a Lie subgroup under some additional conditions. Below, we focus on the geodesic flow with respect to the Fisher-Rao Riemannian metric.
For simplicity, the Lie group G is now supposed to be a compact semi-simple Lie group, and the Lie algebra $\mathfrak{g}$ is equipped with the negative-definite Killing form κ. Further, we assume that there exists a Lie subgroup H ⊂ G whose Lie algebra $\mathfrak{h}$ gives rise to the orthogonal decomposition $\mathfrak{g} = \mathfrak{h} + \mathfrak{m}$ into the direct sum of two vector subspaces with respect to κ. Here, $\mathfrak{m}$ denotes the orthogonal complement to $\mathfrak{h}$ with respect to κ. If the restriction of the Fisher-Rao semi-definite metric $\langle \cdot, \cdot \rangle$ to $\mathfrak{m}$ is positive-definite, then the quotient space G/H is equipped with a Riemannian metric.
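By a result of Nomizu ([16], Theorem 13.1), the induced Levi-Civita connection on G/H with respect to the Fisher-Rao Riemannian metric is written as

$$\nabla_X Y = \frac{1}{2}[X, Y]_{\mathfrak{m}} + U(X, Y), \quad X, Y \in \mathfrak{m},$$

where the symmetric bilinear map $U : \mathfrak{m} \times \mathfrak{m} \to \mathfrak{m}$ is determined by

$$2\,\langle U(X, Y), Z \rangle = \langle [Z, X]_{\mathfrak{m}}, Y \rangle + \langle X, [Z, Y]_{\mathfrak{m}} \rangle, \quad Z \in \mathfrak{m}.$$

Here, for any $W \in \mathfrak{g} = \mathfrak{h} + \mathfrak{m}$, $W_{\mathfrak{m}}$ denotes its $\mathfrak{m}$-component.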

Dynamical Systems Associated to Two Concrete Statistical Transformation Models
In this section, we describe the geodesic flow equations for two concrete statistical transformation models with respect to the Fisher-Rao metric. The method we use relies on the Euler-Poincaré and Lie-Poisson reduction procedures in geometric mechanics.

A Location-Scale Model for Gaussian Probability Density Functions
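We consider the case where the data manifold is the n-dimensional Euclidean space $M = \mathbb{R}^n$, on which we consider the action of the semi-direct product group $\mathbb{R}_+ \ltimes \mathbb{R}^n$:

$$(\sigma, \mu) \cdot x = \sigma x + \mu, \quad (\sigma, \mu) \in \mathbb{R}_+ \ltimes \mathbb{R}^n, \; x \in \mathbb{R}^n.$$

Recall that, for the semi-direct product $\mathbb{R}_+ \ltimes \mathbb{R}^n$ of the group of positive real numbers $\mathbb{R}_+$ and that of n-dimensional vectors $\mathbb{R}^n$, the group operation is written as

$$(\sigma, \mu) * (\sigma', \mu') = (\sigma\sigma', \sigma\mu' + \mu), \quad (\sigma, \mu), (\sigma', \mu') \in \mathbb{R}_+ \ltimes \mathbb{R}^n.$$

In particular, we have $(\sigma, \mu)^{-1} = (\sigma^{-1}, -\sigma^{-1}\mu)$. Now, we consider the family of Gaussian probability density functions

$$\rho((\sigma, \mu), x) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{|x - \mu|^2}{2\sigma^2}\right)$$

on $\mathbb{R}^n$. Clearly, this family of probability density functions is relatively invariant in the sense of (2), where we set $\chi((\sigma, \mu)) = \sigma^n$, $(\sigma, \mu) \in \mathbb{R}_+ \ltimes \mathbb{R}^n$. We identify the Lie algebra $\mathfrak{g}$ of $\mathbb{R}_+ \ltimes \mathbb{R}^n$ with $\mathbb{R} \times \mathbb{R}^n$ equipped with the Lie bracket

$$[(s, m), (s', m')] = (0, s m' - s' m), \quad (s, m), (s', m') \in \mathbb{R} \times \mathbb{R}^n.$$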
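The group structure and the relative invariance can be checked numerically. The sketch below assumes the action (σ, µ) · x = σx + µ, the isotropic Gaussian density with standard deviation σ and mean µ, and the relative invariance written as ρ(hg, hx) = χ(h)⁻¹ρ(g, x); the function names and the sample values are hypothetical.

```python
import math

def mul(g, h):
    """Group law (s, mu) * (t, nu) = (s t, s nu + mu) on R_+ x R^n."""
    (s, mu), (t, nu) = g, h
    return (s * t, [s * b + a for a, b in zip(mu, nu)])

def inv(g):
    """Inverse (s, mu)^(-1) = (1/s, -mu/s)."""
    s, mu = g
    return (1.0 / s, [-a / s for a in mu])

def act(g, x):
    """Action of g = (s, mu) on a data point x: x -> s x + mu."""
    s, mu = g
    return [s * xi + a for xi, a in zip(x, mu)]

def rho(g, x):
    """Isotropic Gaussian density on R^n with standard deviation s and mean mu."""
    s, mu = g
    n = len(x)
    sq = sum((xi - a) ** 2 for xi, a in zip(x, mu))
    return (2 * math.pi * s ** 2) ** (-n / 2) * math.exp(-sq / (2 * s ** 2))

def chi(g):
    """Character chi((s, mu)) = s^n."""
    return g[0] ** len(g[1])

# Arbitrary sample elements for n = 2.
g = (1.5, [0.2, -1.0])
h = (0.7, [3.0, 0.5])
x = [0.4, 1.1]

lhs = rho(mul(h, g), act(h, x))   # rho(hg, hx)
rhs = rho(g, x) / chi(h)          # chi(h)^(-1) rho(g, x)
identity = mul(g, inv(g))         # should be (1, 0)
```

Here `lhs` and `rhs` should agree up to rounding, which is the relative invariance for this particular pair of elements, and `act(mul(h, g), x)` should coincide with `act(h, act(g, x))`, confirming that the map is a left action.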
As a set of generators for $\mathfrak{g}$, we take X = (1, 0), $Y_i = (0, e_i)$, i = 1, …, n, where $e_i \in \mathbb{R}^n$ is the standard basis vector with the unique non-zero component 1 at the i-th entry. The corresponding left-invariant vector fields are written as

$$X^{(L)} = \sigma \frac{\partial}{\partial \sigma}, \quad Y_i^{(L)} = \sigma \frac{\partial}{\partial \mu_i}, \quad i = 1, \dots, n,$$

where $\mu = \sum_{i} \mu_i e_i$. Note that the non-trivial commutation relations between these generators are given as $[X, Y_i] = Y_i$, i = 1, …, n. According to the definition, the Fisher-Rao metric is calculated as

$$\langle (s, m), (s', m') \rangle = 2n\, s s' + m \cdot m', \quad (s, m), (s', m') \in \mathbb{R} \times \mathbb{R}^n,$$

where the dot · denotes the standard inner product of $\mathbb{R}^n$. Clearly, the Fisher-Rao metric is a Riemannian metric in this case.

Now, using the Lie-Poisson reduction of Hamiltonian systems for left-invariant Riemannian metrics on the cotangent bundles of Lie groups, we can describe the dynamical system for the geodesic flow by the Lie-Poisson equation on the dual to the Lie algebra. See [17][18][19]. On the dual $\mathfrak{g}^*$ to the Lie algebra $\mathfrak{g}$ of the semi-direct product $\mathbb{R}_+ \ltimes \mathbb{R}^n$, which we identify with $\mathbb{R} \times \mathbb{R}^n$ through the inner product $\langle\!\langle (s, m), (s', m') \rangle\!\rangle = s s' + m \cdot m'$ for $(s, m), (s', m') \in \mathbb{R} \times \mathbb{R}^n$, we define the standard Lie-Poisson bracket through

where $F, G \in C^\infty(\mathbb{R} \times \mathbb{R}^n)$, $(s, m) \in \mathbb{R} \times \mathbb{R}^n$. Here, $(\nabla_s F, \nabla_m F)$ denotes the gradient of F with respect to $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ written as a pair of components in $\mathbb{R} \times \mathbb{R}^n$. Note that the functions

As the generic coadjoint orbits are of dimension two, this Hamiltonian system is in fact a completely integrable system in the sense of Liouville. Note that, if n = 1, Equation (4) agrees with the geodesic equation on the Poincaré upper half-plane. See standard textbooks on differential geometry of surfaces such as [20].
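To make the reduced dynamics concrete, the following sketch integrates a candidate form of the Lie-Poisson system numerically. It rests on assumptions that should be checked against the source equations: the Fisher-Rao form is taken as 2n ss' + m · m' (the result of a direct computation with the Gaussian density), so that the Hamiltonian on the dual is H(s, m) = (s²/(2n) + |m|²)/2, and the sign convention of the bracket is chosen so that the equations read ds/dt = −|m|², dm/dt = (s/(2n)) m. With the opposite sign convention, the flow is simply reversed in time. All function names are hypothetical.

```python
import math

def hamiltonian(s, m, n):
    """Assumed kinetic-energy Hamiltonian H = (s^2/(2n) + |m|^2) / 2."""
    return 0.5 * (s * s / (2 * n) + sum(c * c for c in m))

def rhs(state, n):
    """Assumed Lie-Poisson vector field: ds/dt = -|m|^2, dm/dt = (s/(2n)) m."""
    s, m = state[0], state[1:]
    return [-sum(c * c for c in m)] + [s * c / (2 * n) for c in m]

def rk4_step(state, n, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, factor):
        return [b + factor * c for b, c in zip(base, k)]
    k1 = rhs(state, n)
    k2 = rhs(shifted(state, k1, dt / 2), n)
    k3 = rhs(shifted(state, k2, dt / 2), n)
    k4 = rhs(shifted(state, k3, dt), n)
    return [y + dt / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]

n = 3
state = [0.5, 1.0, -2.0, 0.5]           # arbitrary initial point (s, m_1, m_2, m_3)
h_start = hamiltonian(state[0], state[1:], n)
ratio_start = state[1] / state[2]       # direction of m, expected to be conserved

for _ in range(2000):                   # integrate up to t = 2 with dt = 1e-3
    state = rk4_step(state, n, 1e-3)

h_end = hamiltonian(state[0], state[1:], n)
ratio_end = state[1] / state[2]
```

Under these assumptions, the Hamiltonian and the ratios of the components of m stay constant along the flow, so each trajectory remains in a half-plane spanned by the s-axis and a fixed direction of m. This is consistent with the statement that the generic coadjoint orbits are two-dimensional.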

A Model for a Class of Probability Density Functions on Compact Semi-Simple Lie Groups
In this subsection, we consider the special case where the sample manifold M coincides with the compact semi-simple Lie group G. In this case, the Lie algebra $\mathfrak{g}$ of G is equipped with the negative-definite Killing form κ. Now, we take the family of probability density functions ρ given as

$$\rho(g, h) := c \cdot \exp\left(F\left(g^{-1}h\right)\right), \quad g, h \in G.$$

Here, $F(\theta) = -\kappa(Q, \mathrm{Ad}_\theta N)$, θ ∈ G, where Q, N are fixed elements in $\mathfrak{g}$. The Haar measure on G is written as $\mathrm{dvol}_G$. The positive constant c is chosen in such a way that ρ is a probability density function:

$$\int_G \rho(g, h)\,\mathrm{dvol}_G(h) = 1.$$
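As an illustration, the invariance of this family under simultaneous left translation, written here as ρ(kg, kh) = ρ(g, h), can be checked numerically for G = SO(3). In this sketch, the Killing form of so(3) is κ(A, B) = tr(AB), the adjoint action is conjugation by a rotation matrix, and the normalizing constant c is omitted because it requires an integral over the group. The choices of Q and N, the sample rotations, and all function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def hat(v):
    """Skew-symmetric matrix in so(3) associated with a vector in R^3."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def rotation(axis, angle):
    """Rotation matrix from the Rodrigues formula."""
    norm = math.sqrt(sum(c * c for c in axis))
    k = hat([c / norm for c in axis])
    k2 = matmul(k, k)
    return [[(1.0 if i == j else 0.0) + math.sin(angle) * k[i][j]
             + (1.0 - math.cos(angle)) * k2[i][j] for j in range(3)]
            for i in range(3)]

def killing(a, b):
    """Killing form of so(3): kappa(A, B) = tr(AB), negative-definite."""
    return sum(matmul(a, b)[i][i] for i in range(3))

def F(theta, Q, N):
    """F(theta) = -kappa(Q, Ad_theta N) with Ad_theta N = theta N theta^T."""
    return -killing(Q, matmul(matmul(theta, N), transpose(theta)))

def unnormalized_rho(g, h, Q, N):
    """exp(F(g^{-1} h)); the constant c is omitted in this sketch."""
    return math.exp(F(matmul(transpose(g), h), Q, N))

Q = hat([1.0, 2.0, 0.5])
N = hat([0.3, -1.0, 2.0])
g = rotation([1.0, 0.0, 1.0], 0.7)
h = rotation([0.0, 1.0, -1.0], 1.9)
k = rotation([2.0, 1.0, 0.5], -0.4)

before = unnormalized_rho(g, h, Q, N)
after = unnormalized_rho(matmul(k, g), matmul(k, h), Q, N)
```

The two values `before` and `after` should agree up to rounding, since $(kg)^{-1}(kh) = g^{-1}h$; the same argument applies to any compact semi-simple Lie group and does not depend on the particular Q and N.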

Remark 2.
The function F appears in [12], which describes the Toda lattice equations as double bracket equations on the split real forms of complex semi-simple Lie algebras. More precisely, the double bracket equation is associated with the gradient vector field of the cost function F on special orbits.

Remark 3.
The definition of the probability density functions ρ is slightly different from the one in the previous papers [10,11] by the authors. To obtain the left-invariant Fisher-Rao metric, we should use the definition in the present paper rather than the one that appeared in [10,11].
The Fisher-Rao semi-definite bilinear form $\langle \cdot, \cdot \rangle$ on $\mathfrak{g}$ is then calculated as

$$\langle X, Y \rangle = -c \cdot \kappa\left(\left[[X, Q], Y\right], N\right), \quad X, Y \in \mathfrak{g},$$

dynamical system appearing in the statistical transformation models for the Gaussian probability density functions, which was shown to be completely integrable. A detailed analysis of the systems in (4) and (8) will be carried out in future works.