Geometric Structures Induced by Deformations of the Legendre Transform

Pablo A. Morales; Jan Korbel; Fernando E. Rosas

doi:10.3390/e25040678

¹

Research Division, Araya Inc., Tokyo 107-6019, Japan

²

Section for Science of Complex Systems, Center for Medical Data Science, Medical University of Vienna, Spitalgasse, 23, 1090 Vienna, Austria

³

Complexity Science Hub Vienna, Josefstädter Strasse 39, 1080 Vienna, Austria

⁴

Department of Informatics, University of Sussex, Brighton BN1 9RH, UK

Entropy 2023, 25(4), 678; https://doi.org/10.3390/e25040678

This article belongs to the Special Issue Information Geometry and Its Applications

Abstract

The recent link discovered between generalized Legendre transforms and non-dually flat statistical manifolds suggests a fundamental reason behind the ubiquity of Rényi’s divergence and entropy in a wide range of physical phenomena. However, these early findings still provide little intuition on the nature of this relationship and its implications for physical systems. Here we shed new light on the Legendre transform by revealing the consequences of its deformation via symplectic geometry and complexification. These findings reveal a novel common framework that leads to a principled and unified understanding of physical systems that are not well-described by classic information-theoretic quantities.

Keywords:

information theory; entropy measures; information geometry

1. Introduction

The Legendre transform [1] plays a key—albeit perhaps not always transparent—role in many areas of mathematical physics. Specifically, it allows for the identification of dual coordinates and potentials that yield theories in terms of more convenient variables, being instrumental in diverse areas in physics ranging from relativistic field theory to condensed matter physics. Applications of the transform have their roots in classical physics—in analytical mechanics serving as a link between its Lagrangian and Hamiltonian formulations, and in thermodynamics bridging intensive and extensive variables. These notions have led to more general frameworks which, in turn, gave rise to the development of symplectic topology [2].

Far from being a relic, the Legendre transform still plays an important role in contemporary physics. It appears in classical field theory, where the index of pairs of components becomes continuous. It is also used in quantum field theory, where it relates the generator of connected Green functions to the quantum effective action, i.e., the generator of one-particle irreducible Green functions. Furthermore, the relevance of the Legendre transform has led to generalizations in the context of perturbative quantum field theories [3,4]. Overall, the transform continues to be at the core of important developments in current research.

The Legendre transform also plays a fundamental role in information geometry, where it mediates the relationship between primal and dual coordinates within the non-Riemannian geometry induced by dually flat statistical manifolds [5]. This duality gives rise to relationships of orthogonality in these geometries, corresponding to alternative representations of physical systems based on control parameters or expectation values [6]. Interestingly, the generalized Legendre transform naturally arises in curved (i.e., non-Euclidean) statistical manifolds [7,8], which establishes a rigorous and highly non-trivial link with Rényi’s divergence and entropy [9,10,11]. These recent findings suggest the existence of a fundamental reason that could explain why Rényi entropy and divergence naturally appear in a range of physical phenomena of interest. Indeed, recent applications of Rényi measures to physics include quantum systems [12,13], strongly coupled or entangled systems [14,15,16], phase transitions [12,17,18], and multifractal thermodynamics [19,20], among others. However, these early findings on the link between the generalized Legendre transforms and curved geometries still provide little intuition on the nature of this relationship and its meaning and implications for physical systems in general.

The goal of this article is to shed new light on the generalized Legendre transform by investigating its geometric implications. For this purpose, we characterize deformations in the Legendre transform and relate them with generalizations of the Bregman divergence, which are naturally associated with curved statistical manifolds. By leveraging these tools, our contribution focuses on two domains: geometrical aspects related to phase-space flow and manifold complexification. Our results show how the symplectic structure induced by the deformed Legendre transforms leads to a modification of what is understood as a ‘canonical pair,’ which in turn illuminates the nature of the corresponding maximum entropy distributions. Furthermore, our results bring new insights to the relationship between the Kullback–Leibler divergence (related to the Shannon entropy),

α

-divergence (related to Tsallis’ entropy), and the Rényi divergence via manifold complexification and Kähler manifolds. The complex geometry yields new conditions on the possible values of the manifold curvature, which are closely related to holomorphic polarization. Additionally, we report on the thermodynamic aspects related to the deformed Legendre transform in Ref. [21]. Taken together, these results lead to a larger, unified picture that extends standard geometric and thermodynamic relationships associated with classic information-theoretic quantities such as Shannon’s entropy.

The rest of the paper is structured as follows. First, Section 2 provides a brief overview of the standard interpretation of the Legendre transform in mathematical physics. Then, Section 3 explores how the transform naturally arises in information geometry and introduces the intimate relationship that exists between a generalized Legendre transform and the curvature of statistical manifolds. Building on these foundations, Section 4 investigates the consequences of generalized Legendre transforms on the symplectic structure and flows and on the complexification of statistical manifolds. Finally, Section 5 summarizes our main conclusions.

2. Preliminaries

The Legendre transform is, at its core, an exploration of the properties of convex functions. Despite its importance, the transform is unfortunately typically introduced as an obscure algebraic ‘trick’, with no explanation of why it plays such an important role in many different areas of physics. For completeness, this section presents a basic standard interpretation of the Legendre transform in mathematical physics, which is then complemented by a deeper view based on information geometry in Section 3.

The most straightforward interpretation of the Legendre transform comes from the geometry of graphs of functions [22]. In this view, the Legendre transform of a convex function F is another function G that keeps track of the (negative) height at which the tangent to F touches the y-axis, which is usually reparametrized in terms of the slope of F. This view is easy to grasp, but unfortunately makes the construction seem arbitrary while failing to explain why this procedure is so fundamental.

A more principled view comes from an algebraic perspective as follows. If

F (x)

with

x = (x_{1}, \dots, x_{n}) \in R^{n}

is a strictly convex function (i.e., its Hessian is positive-definite), then the partial derivative

y_{i} (x) : = \partial F / \partial x_{i} (x)

is a monotonic function of

x_{1}, \dots, x_{n}

for

i = 1 \dots n

. This means that there exists an isomorphism between x and

y = (y_{1}, \dots, y_{n})

; said differently, there exist mappings

y_{i} (x)

and

x_{k} (y)

that transform one into the other. Using these mappings, it would be natural to consider the possibility of reparametrizing F in terms of y instead of x. However, instead of focusing on such reparametrization, an elegant move is to consider instead the function

G (y) = x (y) \cdot y - F (x (y))

. Interestingly, the resulting pair

F (x)

and

G (y)

exhibit the following symmetry:

\frac{\partial G}{\partial y_{k}} = x_{k}, \frac{\partial F}{\partial x_{i}} = y_{i} .

(1)

Useful properties of this transformation are that it preserves convexity (i.e., the transform of a convex function results in another convex function) and that it is an ‘involution’, that is, the Legendre transform of the transform of a convex function is the function itself. The symmetry of these relationships is graphically represented in the right-hand side of Figure 1.

Figure 1. A graphical representation of the Legendre transform and its deformations. (Left:) While the standard Legendre transform acts on concave duals, the deformed one acts between C-concave functions. Each transform brings elements of one space to the other. Please note that while Section 2 presents the classical view of Legendre transforms acting over convex functions, the rest of this work follows Ref. [9] in focusing on concave functions. (Right:) The symmetry that governs the algebraic relationships between convex dual functions and dual coordinates, which is mediated by the Legendre derivative operator

D_{L}

, which differs from the standard Euclidean gradient when the transform is deformed.

Overall, one can think of the Legendre transform as acting on two inputs, x and F, and providing two outputs: the dual variable y and the convex conjugate G (similarly, the Fourier transform of a time series

F (t)

can be thought of as giving two outputs too: the spectrum of amplitudes

G (s)

(analogous to the conjugate function) and the frequency domain s itself (analogous to the dual variable)). Pairs of convex functions

{F, G}

satisfying Equation (1) are known as convex duals, with

{x, y}

being known as dual variables. Additionally, convex functions and their duals satisfy the Fenchel inequality

F (x) + G (y) \geq x \cdot y

. The multiple useful properties of Legendre duals are leveraged in various areas of mathematics and engineering, particularly in convex optimization [23].
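The relationships above can be checked numerically on a one-dimensional example. The sketch below is only an illustration under stated assumptions: the choice F(x) = exp(x) and all function names are hypothetical and are not taken from the article. It verifies the symmetry of Equation (1), the involution property, and the Fenchel inequality.

```python
import math

def F(x: float) -> float:
    """Illustrative strictly convex function F(x) = exp(x) (an assumption)."""
    return math.exp(x)

def dual_variable(x: float) -> float:
    """y(x) = dF/dx, which is strictly increasing in x."""
    return math.exp(x)

def primal_variable(y: float) -> float:
    """Inverse map x(y) = log(y), defined for y > 0."""
    return math.log(y)

def G(y: float) -> float:
    """Convex conjugate G(y) = x(y) * y - F(x(y)) = y log y - y."""
    x = primal_variable(y)
    return x * y - F(x)

def derivative(f, t: float, h: float = 1e-6) -> float:
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

x0 = 0.7
y0 = dual_variable(x0)

# Equation (1): dG/dy recovers x, and dF/dx recovers y.
dG_dy = derivative(G, y0)
dF_dx = derivative(F, x0)

# Involution: transforming G back gives F at the matching point.
F_recovered = x0 * y0 - G(y0)

# Fenchel inequality F(x) + G(y) >= x * y on a few (mostly mismatched) pairs.
fenchel_gaps = [F(x) + G(y) - x * y
                for x in (-1.0, 0.0, 0.7, 2.0)
                for y in (0.5, 1.0, 2.0, 5.0)]
```

In this example the Fenchel gap vanishes only for pairs with y = exp(x), such as (x, y) = (0, 1), which is the sense in which x and y are dual variables.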

A more general definition of the Legendre transform of a convex function is given by

G (y) = \sup_{x} \{C (x, y) - F (x)\} .

(2)

This definition applies even when F is not everywhere differentiable, and recovers the above procedure for the case where

C (x, y) = x \cdot y

. For other choices of C, this opens the door to so-called “deformed” Legendre transforms, which play an important role in optimal transport theory [24]. Interestingly, dual functions according to these generalized Legendre transforms satisfy relationships analogous to Equation (1), but where the role of the Euclidean gradient is replaced by a ‘Legendre derivative’ operator

D_{L}

, which is formally defined in Section 3.4. The goal of this paper is to explore the implications of such deformations of the Legendre transforms for physical systems.
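As a rough illustration of Equation (2), the supremum can be approximated by a maximum over a finite grid. The following sketch makes several assumptions that are not part of the article: the helper name `generalized_transform`, the grid ranges, and the two test functions are all hypothetical choices. It uses the undeformed case C(x, y) = x y, and includes a function that is not differentiable at the origin.

```python
import numpy as np

def generalized_transform(F, C, xs, ys):
    """Grid approximation of Equation (2): G(y) = sup_x { C(x, y) - F(x) }."""
    X, Y = np.meshgrid(xs, ys, indexing="ij")  # X[i, j] = xs[i], Y[i, j] = ys[j]
    return np.max(C(X, Y) - F(X), axis=0)      # maximize over the x axis

xs = np.linspace(-5.0, 5.0, 2001)   # search grid for x (step 0.005)
ys = np.linspace(-1.0, 1.0, 9)      # points where G is evaluated

def bilinear(x, y):
    """C(x, y) = x * y, i.e., the standard (undeformed) Legendre transform."""
    return x * y

# Smooth case: F(x) = x^2 / 2 is its own conjugate, so G(y) = y^2 / 2.
G_quadratic = generalized_transform(lambda x: 0.5 * x**2, bilinear, xs, ys)

# Non-differentiable case: F(x) = |x| gives G(y) = 0 whenever |y| <= 1.
G_abs = generalized_transform(np.abs, bilinear, xs, ys)
```

Passing a different function as `C` would give a grid approximation of a deformed transform, although the accuracy then depends on whether the supremum is attained inside the chosen grid.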

3. Legendre Transform in Information Geometry

In this section we present the key role of the Legendre transform in statistical manifolds. For this purpose, Section 3.1 first introduces the necessary background about information geometry to the unfamiliar reader. Then, Section 3.2 explains how the standard Legendre transform describes the geometry of dually flat spaces, which are naturally associated with the Kullback–Leibler divergence and the Shannon entropy. Building on this, Section 3.3 then presents how other divergences lead to more general geometries, and Section 3.4 develops how generalized Legendre transforms are a natural way to build and describe them. Please note that hereafter we use Einstein’s summation convention for convenience of the notation.

3.1. The Dual Structure of Statistical Manifolds

Our exposition is focused on statistical manifolds

M

whose elements are probability distributions

p_{ξ} (s)

, with

s \in S

being the possible events accounted for by the probability distribution and

ξ \in O \subset R^{d}

with O an open subset of a set of parameter values. The geometry of such statistical manifolds is determined by two structures: a metric tensor g and a torsion-free affine connection pair

(\nabla, \nabla^{*})

that are dual with respect to g. Intuitively, g establishes norms and angles between tangent vectors and, in turn, establishes curve length and the shortest curves. On the other hand, the affine connection establishes covariant derivatives of vector fields establishing the notion of parallel transportation between neighboring tangent spaces, which defines what is a straight curve.

Traditional Riemannian geometry is built on the assumption that the shortest and the straightest curves locally coincide, which is pivotal to the development of general relativity. This assumption leads to the study of metric-compatible Levi–Civita connections, whose geodesics are locally distance-minimizing and which satisfy

\nabla = \nabla^{*}

and are, hence, completely determined by the metric. However, modern approaches motivated by information geometry [25] and gravitational theories [26,27] consider more general scenarios, where connections may not be derivable from the metric. In such geometries, the parallel transport operator

Π : T_{p} M \to T_{q} M

and its dual

Π^{*}

(the dual transport operator acts on cotangent vectors and is defined by the condition of guaranteeing

g_{q} (Π V, Π^{*} W) = g_{p} (V, W)

for all

V \in T_{p} M

and

W \in T_{p}^{*} M

) induced by ∇ and

\nabla^{*}

, respectively, might differ. The departure of ∇ and

\nabla^{*}

from self-duality can be shown to be proportional to Chentsov’s tensor, which allows for a single degree of freedom traditionally denoted by

α \in R

[25]. Put simply,

α

captures the degree of asymmetry between short and straight curves, with

α = 0

corresponding to metric-compatible connections where

\nabla = \nabla^{*}

.

An important property of the geometry of a statistical manifold (

M, g, \nabla, \nabla^{*}

) is its curvature, which can be of two types: the (Riemann–Christoffel) metric curvature or the curvature associated to the connection. Both quantities capture the distortion induced by parallel transport over closed curves, the former with respect to the Levi–Civita connection and the latter with respect to ∇ and

\nabla^{*}

. In the sequel, we use the term curvature to refer exclusively to the latter type. Statistical manifolds with zero curvature (equivalently, manifolds where it is possible to find a coordinate chart pair under which the connection and its dual vanish at every point of the manifold) are said to be dually flat.

3.2. Dually Flat Geometry, Bregman Divergences, and the Legendre Transform

The geometry of Riemannian manifolds is typically formulated in terms of a single set of local coordinates. However, the fact that non-Riemannian manifolds have two dissimilar affine connections ∇ and

\nabla^{*}

makes it more natural to describe their geometry in terms of two dual coordinates

ξ

and

η

[25]. Specifically, while in Riemannian geometry orthogonality can be assessed between the different dimensions of a single set of coordinates, in statistical manifolds it is more fruitful to consider orthogonality between elements of the primal

ξ

and dual coordinates

η

[6,10]. A standard example of dual coordinates in a statistical manifold is where

ξ

corresponds to the natural parameters of an exponential family distribution and

η

corresponds to the corresponding expectation values. In the sequel, we follow Schouten’s notation in which upper indices are reserved for dual coordinates, i.e.,

\partial_{i} = \frac{\partial}{\partial ξ^{i}} and \partial^{i} = \frac{\partial}{\partial η_{i}} .

(3)

Under this notation,

\partial_{i}

gives rise to a basis for the tangent space

T_{p} M

, while

\partial^{i}

is related to a natural dual basis of the cotangent space

T_{p}^{*} M

.

A Riemannian metric is always “locally flat”, i.e., it can be brought down to its signature (a Kronecker delta) at a given point

p \in M

by choosing an appropriate coordinate chart. It is not guaranteed, however, that such a chart would preserve the delta in a neighborhood of p; finding a chart that satisfies this property globally is the hallmark of a flat geometry. Analogously, affine geometries are also locally flat when considering their dual entry, therefore satisfying

g (\partial_{i}, \partial^{j}) = δ_{i}^{j}

for an appropriate pair of primal and dual coordinate charts

{ξ^{i}, η_{i}}

at some point p. In a similar fashion, this property in general only holds locally; dually flat geometries are characterized by the fact that one can find a pair of coordinates that satisfies this condition of orthogonality on the whole manifold (under these coordinate charts, one can show that both the connection and its dual vanish, hence the term dual flatness).

For an orthogonal pair

{ξ, η}

of a given dually flat manifold, the gradients of the mappings

ξ \mapsto η

and

η \mapsto ξ

are both symmetric. To confirm this, let us first note that

g_{i j} = g (\partial_{i} η_{k} \partial^{k}, \partial_{j}) = \partial_{i} η_{k} g (\partial^{k}, \partial_{j}) = \partial_{i} η_{k} δ_{j}^{k} = \partial_{i} η_{j},

where the first equality follows from the chain rule of derivatives

\partial_{i} = \partial_{i} η_{k} \partial^{k}

. Then, using the fact that Riemannian metrics are always symmetric, one can see that

\partial_{i} η_{j} = g_{i j} = g_{j i} = \partial_{j} η_{i}

. A similar derivation shows that

g^{i j} = \partial^{i} ξ^{j}

, and hence

\partial^{i} ξ^{j} = \partial^{j} ξ^{i}

(note that

g_{i j} = \partial_{i} η_{j}

and

g^{i j} = \partial^{i} ξ^{j}

is consistent with the fact that for orthogonal coordinates

g (\partial^{i}, \partial_{k}) = g^{i j} g_{j k} = δ_{k}^{i}

).

There is an intimate relationship between an orthogonal pair of coordinates in a dually flat manifold and the Legendre transform. To see this, we first note that the symmetry of the Jacobian of

ξ \to η

implies the existence of a closed 1-form

d ω = 0

, and this, via the Poincaré lemma, implies in turn the existence of a scalar potential

ψ \in C^{\infty}

that satisfies

η_{i} = \partial_{i} ψ and g_{i j} = \partial_{i} \partial_{j} ψ .

(4)

Note that the second condition, combined with the fact that

g_{i j}

is positive-semidefinite, implies that

ψ

is convex. By a similar line of reasoning, the symmetry of

g^{i j}

induces a dual convex potential

φ

that satisfies

ξ^{i} = \partial^{i} φ and g^{i j} = \partial^{i} \partial^{j} φ .

(5)

Furthermore, a direct calculation shows that the dual potentials

ψ (ξ^{1}, \dots, ξ^{n})

and

φ (η_{1}, \dots, η_{n})

always satisfy

d {ψ + φ - ξ^{i} η_{i}} = 0

. This implies that, modulo an unimportant constant, the following relationship holds over any dually flat manifold (Equation (6) holds on any manifold but only locally; in contrast, dually flat spaces are a special case in which dual potentials

φ, ψ

that satisfy Equations (4) and (5) can be defined over the whole manifold):

ψ + φ - ξ^{i} η_{i} = 0 .

(6)

Let us now consider the behavior of Equation (6) on dually flat spaces when the coordinates and potentials are evaluated at different points of the manifold. For this, let us denote as

ξ (p)

and

η (q)

the coordinates and dual coordinates of

p, q \in M

, respectively, and define the so-called Bregman divergence

D

as

D (p | | q) : = φ (η (p)) + ψ (ξ (q)) - ξ^{i} (q) η_{i} (p) .

(7)

Then, the differential of the mapping

q \mapsto D (p_{0} | | q)

is

\begin{matrix} d \{D (p_{0} | | q)\} & = (\partial_{i} ψ (ξ (q)) - η_{i} (p_{0})) d ξ^{i} (q) \\ & = (η_{i} (q) - η_{i} (p_{0})) d ξ^{i} (q) . \end{matrix}

(8)

From this, and considering that

D

is by definition the sum of two convex functions minus a term that is linear in each set of coordinates, one can verify that this mapping attains its unique minimum when

q = p_{0}

. Interestingly, at this minimal value one recovers Equation (6), which implies that

D = 0

. This shows that Bregman divergences are non-negative.
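A concrete instance may help fix ideas. The sketch below evaluates Equation (7) for categorical distributions, assuming the usual exponential-family parametrization (natural parameters as log-odds with respect to the last event, log-partition function as ψ, and negative Shannon entropy as φ). These choices and all function names are assumptions made for illustration. Under them, the Bregman divergence coincides with the Kullback–Leibler divergence and vanishes when both arguments are equal, in agreement with Equation (6).

```python
import numpy as np

def natural_params(p):
    """Primal coordinates xi^i = log(p_i / p_n), i = 1..n-1 (assumed parametrization)."""
    p = np.asarray(p, dtype=float)
    return np.log(p[:-1] / p[-1])

def expectation_params(p):
    """Dual coordinates eta_i = p_i, i = 1..n-1."""
    return np.asarray(p, dtype=float)[:-1]

def psi(xi):
    """Log-partition potential psi(xi) = log(1 + sum_i exp(xi^i))."""
    return float(np.log1p(np.sum(np.exp(xi))))

def phi(eta):
    """Dual potential: negative Shannon entropy written in terms of eta."""
    p = np.append(eta, 1.0 - np.sum(eta))
    return float(np.sum(p * np.log(p)))

def bregman(p, q):
    """Equation (7): D(p||q) = phi(eta(p)) + psi(xi(q)) - xi(q) . eta(p)."""
    xi_q, eta_p = natural_params(q), expectation_params(p)
    return phi(eta_p) + psi(xi_q) - float(xi_q @ eta_p)

def kl(p, q):
    """Kullback-Leibler divergence between two strictly positive distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.6, 0.1, 0.3])

d_pq = bregman(p, q)   # agrees with kl(p, q) up to rounding
d_pp = bregman(p, p)   # Equation (6): the divergence vanishes at q = p
```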

These results suggest an alternative definition for

φ

and

ψ

, conceiving them as a maximum of the following maps:

\begin{matrix} φ (η (p)) & = \max_{q \in M} \{ξ^{i} (q) η_{i} (p) - ψ (ξ (q))\}, \end{matrix}

(9)

\begin{matrix} ψ (ξ (p)) & = \max_{q \in M} \{η_{i} (q) ξ^{i} (p) - φ (η (q))\} . \end{matrix}

(10)

This reveals that the orthogonal coordinate pair is always dual in the Legendre sense, or equivalently, that dual flatness implies that the potentials are convex duals. This property generalizes the well-known Legendre duality between the natural and expectation parameters of an exponential family [28], showing that the same holds for any coordinate pair as long as they satisfy local flatness.

3.3. Divergences as a General Tool to Establish Geometries

This subsection explains how divergences, such as the one introduced in Equation (7), can be used as a convenient tool to establish a geometry on a statistical manifold ([29], Section 4). Importantly, this approach does not lack generality, as any geometry can be expressed from an appropriate divergence [30,31,32].

Divergences are a general class of functions that assess the dissimilarity of their arguments. More specifically, a divergence is a smooth, distance-like function

D [x; x^{'}]

that satisfies

D [x; x^{'}] \geq 0

and vanishes only when

x = x^{'}

. Divergences are more general—hence weaker—notions than distances, as they do not need to be symmetric in their arguments and may not respect the triangle inequality. Of the various types of divergences explored in the literature [33], two are particularly important: f-divergences (which are monotonic with respect to coarse-grainings of the domain of events

S

[34]) and Bregman divergences (studied in the previous section).

Let us show how divergences can be used to establish metrics and connections over manifolds. For this, let us use the shorthand notation

D [ξ; ξ^{'}] : = D (p | | q)

when expressing

D

in terms of coordinates

ξ = ξ (p)

and

ξ^{'} = ξ (q)

. Then, the Riemannian metric of the manifold is recovered from the second-order expansion of the divergence as follows:

g_{i j} (ξ) = ⟨\partial_{i}, \partial_{j}⟩ = - \partial_{i} \partial_{j^{'}} D [ξ; ξ^{'}] |_{ξ = ξ^{'}},

(11)

which is positive-definite due to the non-negativity of

D

. This construction leads to the Fisher metric, which is the unique metric that emerges from a broad class of divergences ([29], Th. 5), a result closely related to Chentsov’s theorem [35,36,37,38]. Similarly, connections emerge at the third-order expansion of the divergence as follows:

\begin{matrix} Γ_{i j k} (ξ) & = ⟨\nabla_{\partial_{i}} \partial_{j}, \partial_{k}⟩ = - {\partial_{i, j} \partial_{k^{'}} D [ξ; ξ^{'}]|}_{ξ = ξ^{'}}, \end{matrix}

(12a)

\begin{matrix} Γ_{i j k}^{*} (ξ) & = ⟨\nabla_{\partial_{i}}^{*} \partial_{j}, \partial_{k}⟩ = - {\partial_{k} \partial_{i^{'}, j^{'}} D [ξ; ξ^{'}]|}_{ξ = ξ^{'}} . \end{matrix}

(12b)

In summary, Fisher’s metric is insensitive to the choice of divergence but the resulting connections are not, and therefore the effects of a particular

D

manifest only at the third order.
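The second-order construction of Equation (11) can be illustrated on a one-parameter family. The sketch below assumes a Bernoulli family parametrized by its success probability θ and takes the Kullback–Leibler divergence as D. Both the example and the finite-difference scheme are illustrative choices, not part of the article. The mixed derivative evaluated on the diagonal should then reproduce the known Fisher information 1/(θ(1 − θ)).

```python
import math

def kl_bernoulli(theta: float, theta_prime: float) -> float:
    """D[theta; theta'] = KL(Bernoulli(theta) || Bernoulli(theta'))."""
    return (theta * math.log(theta / theta_prime)
            + (1.0 - theta) * math.log((1.0 - theta) / (1.0 - theta_prime)))

def metric_from_divergence(D, theta: float, h: float = 1e-4) -> float:
    """Equation (11) in one dimension: g = -d/dtheta d/dtheta' D at theta = theta',
    approximated with a central mixed finite difference of step h."""
    mixed = (D(theta + h, theta + h) - D(theta + h, theta - h)
             - D(theta - h, theta + h) + D(theta - h, theta - h)) / (4.0 * h * h)
    return -mixed

theta0 = 0.3
g_numeric = metric_from_divergence(kl_bernoulli, theta0)
g_fisher = 1.0 / (theta0 * (1.0 - theta0))  # Fisher information of a Bernoulli law
```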

Bregman divergences always give rise to flat geometries, as for them,

\partial_{i, j} \partial_{k^{'}} D [ξ; ξ^{'}] = \partial_{k} \partial_{i^{'}, j^{'}} D [ξ; ξ^{'}] = 0

, and therefore other types of divergences are needed in order to establish curved non-Riemannian geometries. As mentioned in Section 3.1, the deviation of a given connection ∇ from its corresponding metric-compatible (i.e., Levi–Civita) counterpart can be measured by

α T

, where T corresponds to the invariant Amari–Chentsov tensor [39,40] and

α \in R

is a free parameter. The invariance of T implies that the value of

α

entirely determines the connection, and the corresponding geometry can be obtained from a divergence of the form [10]

D_{α} (p | | q) = \frac{4}{1 - α^{2}} \int_{S} (1 - p^{\frac{1 - α}{2}} (s) q^{\frac{1 + α}{2}} (s)) d μ (s),

(13)

which is known as the α-divergence. As important particular cases, if

α = 0

then

D_{α}

becomes the square of Hellinger’s distance, and if

α = \pm 1

then it gives the well-known Kullback–Leibler divergence. Furthermore, it can be shown that the Kullback–Leibler divergence is a Bregman divergence, which in turn implies that for those cases the resulting geometry is flat. This illustrates the fact that being Riemannian (i.e.,

α = 0

) and Euclidean (

α = \pm 1

) are independent features of a geometry.
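The particular cases mentioned above can be checked on discrete distributions, replacing the integral in Equation (13) with a sum. The sketch below is an illustration under that assumption, and the function names and test distributions are hypothetical. For α = 0 it compares against 2 Σ (√p − √q)², which is proportional to the squared Hellinger distance (the proportionality constant depends on the convention used for that distance). For α close to ∓1 it compares against the two orderings of the Kullback–Leibler divergence.

```python
import numpy as np

def alpha_divergence(p, q, alpha: float) -> float:
    """Discrete version of Equation (13), with the integral replaced by a sum."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    overlap = np.sum(p ** ((1.0 - alpha) / 2.0) * q ** ((1.0 + alpha) / 2.0))
    return float(4.0 / (1.0 - alpha ** 2) * (1.0 - overlap))

def kl(p, q) -> float:
    """Kullback-Leibler divergence between two strictly positive distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.6, 0.1, 0.3])

# alpha = 0: 4 (1 - sum sqrt(p q)) equals 2 sum (sqrt p - sqrt q)^2.
d_zero = alpha_divergence(p, q, 0.0)
hellinger_like = float(2.0 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# alpha near -1 approaches KL(p||q); alpha near +1 approaches KL(q||p).
d_minus = alpha_divergence(p, q, -1.0 + 1e-6)
d_plus = alpha_divergence(p, q, 1.0 - 1e-6)
```

The values α = ±1 themselves cannot be passed to `alpha_divergence`, since the prefactor 4/(1 − α²) is singular there; the Kullback–Leibler divergence arises only as a limit.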

We finish this subsection by noting that multiple divergences can give rise to the same geometry. A one-to-one relationship between divergences and geometries is obtained when considering conformal-projective equivalence classes of divergences, which are related both via conformal and projective transformations. For a more detailed explanation, we refer the interested reader to Ref. [10], Sec. 2-D.

3.4. Generalized Legendre Transforms as a Natural Way to Describe Curved Manifolds

Section 3.2 and Section 3.3 clarified the intimate relationship that exists between dually flat manifolds, Bregman divergences, and the Legendre transform. Here we explain how these relationships are altered in more complex geometries.

In curved geometries it is impossible to construct dual potentials that satisfy Equation (6) on the whole manifold. This impossibility is a symptom of the fact that the divergence that gives rise to this geometry, e.g., the

α

-divergence given in Equation (13), is not a Bregman divergence, but only an f-divergence [34]. To better understand the nature of the

α

-divergence, let us consider in detail its relationship with Bregman divergences. Bregman divergences, as given in Equation (7), can also be expressed as

\begin{matrix} D_{Φ} [ξ; ξ^{'}] = Φ (ξ^{'}) - Φ (ξ) - D Φ (ξ) \cdot (ξ^{'} - ξ) . \end{matrix}

(14)

Hence,

D_{Φ} [ξ; ξ^{'}]

measures how convex the function

Φ

is at

ξ

in the direction of

ξ^{'} - ξ

(this also explains the asymmetry that exists in the arguments of a Bregman divergence) and exploits the fact that a first-order approximation of a convex function always underestimates its value (i.e., that

Φ (ξ^{'}) \geq Φ (ξ) + D Φ (ξ) \cdot (ξ^{'} - ξ)

, where

D

is the Euclidean gradient). Interestingly, such a first-order approximation can also be built on an intermediate point between

ξ

and

ξ^{'}

, which leads to

\frac{1 - α}{2} Φ (ξ) + \frac{1 + α}{2} Φ (ξ^{'}) \geq Φ (ξ_{α}),

(15)

where

ξ_{α} = \frac{1 - α}{2} ξ + \frac{1 + α}{2} ξ^{'}

, with

α \in (- 1, 1)

being a one-dimensional parameter that regulates how close

ξ_{α}

is to

ξ

and

ξ^{'}

. This inequality leads to a family of divergences [41] indexed by

α

, given by

D_{Φ}^{(α)} [ξ; ξ^{'}] : = \frac{4}{1 - α^{2}} [\frac{1 - α}{2} Φ (ξ) + \frac{1 + α}{2} Φ (ξ^{'}) - Φ (ξ_{α})],

(16)

where the factor

4 / (1 - α^{2})

is introduced so that the limit

\lim_{α \to 1} D_{Φ}^{(α)} = D_{Φ}

gives a Bregman divergence. In particular, if

Φ (ξ) = \sum_{i} e^{ξ_{i}}

then

D_{Φ}^{(α)}

becomes the

α

-divergence. Importantly, divergences of the form of Equation (16) with

α \neq \pm 1

are not Bregman divergences (as they cannot be expressed in terms of convex conjugates as in Equation (7)), and hence they do not lead to flat geometries (see Section 3.3).
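The reduction of Equation (16) to the α-divergence of Equation (13) can be made explicit with a short calculation. The derivation below is a sketch under assumptions that the text does not state explicitly: events are discrete, the coordinates are taken to be the log-probabilities of two normalized distributions p and q, and the integral in Equation (13) is read as a sum.

```latex
% Assumptions (for illustration only): discrete events, \xi_i = \log p_i and
% \xi'_i = \log q_i, with \sum_i p_i = \sum_i q_i = 1.
\begin{align*}
\Phi(\xi) &= \sum_i e^{\xi_i} = \sum_i p_i = 1,
\qquad
\Phi(\xi') = \sum_i q_i = 1, \\
\Phi(\xi_\alpha) &= \sum_i \exp\!\Big(\tfrac{1-\alpha}{2}\log p_i
  + \tfrac{1+\alpha}{2}\log q_i\Big)
  = \sum_i p_i^{\frac{1-\alpha}{2}}\, q_i^{\frac{1+\alpha}{2}}, \\
D_\Phi^{(\alpha)}[\xi;\xi'] &= \frac{4}{1-\alpha^2}
  \Big[\tfrac{1-\alpha}{2} + \tfrac{1+\alpha}{2}
  - \sum_i p_i^{\frac{1-\alpha}{2}}\, q_i^{\frac{1+\alpha}{2}}\Big]
  = \frac{4}{1-\alpha^2}
  \Big(1 - \sum_i p_i^{\frac{1-\alpha}{2}}\, q_i^{\frac{1+\alpha}{2}}\Big).
\end{align*}
```

The last expression is the discrete counterpart of Equation (13). The normalization of p and q is what collapses the two weighted terms of Equation (16) into the constant 1.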

Fortunately, recent results suggest a way to express non-Bregman divergences in terms of generalized Legendre transforms [9]. The generalized Legendre transform is based on a link function (link functions are typically used as cost functions driving optimization problems in the literature focused on optimal transport [24]), which corresponds to a smooth function

C : M \times M \to R

, that connects generalized potentials

φ

and

ψ

via the following relationship:

ψ (ξ) + φ (η) - C (ξ, η) = 0,

(17)

which holds for all

(ξ, η)

pairs belonging to the C-superdifferential of

ψ

. In this manner,

η

can be interpreted as the C-supergradient of

ψ

at

ξ

[42]. Put differently, for a given link function C, a pair of generalized potentials are functions

φ, ψ

, which are related via a generalized Fenchel–Legendre C-transform as follows:

\begin{matrix} φ (ξ (p)) & = \inf_{q \in M} \{C (ξ (p), η (q)) - ψ (η (q))\}, \end{matrix}

(18a)

\begin{matrix} ψ (η (q)) & = \inf_{p \in M} \{C (ξ (p), η (q)) - φ (ξ (p))\} . \end{matrix}

(18b)

Note that these equations use a different sign than Equation (2), which leads to the consideration of concave instead of convex functions. Arguments for adopting this choice are discussed in Ref. [9].

Following the rationale that led to Equation (7), for a given function C and C-conjugate potentials

φ, ψ

, one can define a generalized Bregman divergence (this divergence is known as a C-divergence, recently introduced in the context of optimal transport [42], where C refers to the corresponding cost function; here we use another term to stress its relationship with key geometric notions), given by

D (p | | q) = C (ξ (p), η (q)) - φ (ξ (p)) - ψ (η (q)) .

(19)

Equations (18a) and (18b) imply that

D (p | | q) \geq 0

, with equality if and only if

p = q

. Interestingly, while the metric induced by generalized Bregman divergences is the Fisher metric, Equations (12a) and (12b) imply that the connections are given by

\begin{matrix} Γ_{i j k} (ξ) & = - {\partial_{i, j} \partial_{k^{'}} C (ξ; η (ξ^{'}))|}_{ξ = ξ^{'}}, \end{matrix}

(20a)

\begin{matrix} Γ_{i j k}^{*} (ξ) & = - {\partial_{k} \partial_{i^{'}, j^{'}} C (ξ; η (ξ^{'}))|}_{ξ = ξ^{'}} . \end{matrix}

(20b)

If

C (ξ, η) = ξ \cdot η

then

Γ_{i j k} (ξ) = Γ_{i j k}^{*} (ξ) = 0

, and hence curved geometries in this construction only arise from non-trivial link functions, i.e., from deformations of the Legendre transform.

For the dual geometries that arise from the

α

-divergence, one can identify the corresponding link function following a two-step procedure. First, one applies a monotonic transformation that turns the

α

-divergence into the Rényi divergence [43] of order

γ

(Note that we follow Ref. [44] in adopting a shifted indexing, thereby referring to

γ = n - 1

as the order of Rényi’s entropy, with

n \geq 0

corresponding to the order in the standard definition):

\begin{matrix} D_{γ} (p | | q) = \frac{1}{γ} \log \int_{S} p^{γ + 1} (s) q^{- γ} (s) d μ (s), \end{matrix}

(21)

which is related to the

α

parameter of the divergence in Equation (13) as

α = - 1 + 2 γ

. This step leverages the fact that both divergences generate the same geometry, being part of the same conformal-projective equivalence class ([10], Sec. 2-D). Note that when

γ \to 0

, C tends to

ξ \cdot η

, and the Rényi divergence tends to the Kullback–Leibler divergence. As a second step, one uses the fact that the Rényi divergence can be expressed in terms of generalized convex conjugates ([9], Th. 13), and hence it can be recovered as a generalized Bregman divergence as Equation (19), where the link function is given by

C (ξ, η) = \frac{1}{γ} \log (1 + γ ξ^{k} η_{k}),

(22)

and the corresponding generalized potential is

\begin{matrix} φ_{γ} (ξ) & = \log \int_{S} {(1 + γ ξ \cdot h (s))}^{- \frac{1}{γ}} d μ (s) . \end{matrix}

(23)

Furthermore, it has been shown that this non-trivial logarithmic link function—or, equivalently, the Rényi divergence—gives rise to dual geometries of constant curvature [9]. Therefore, this divergence constitutes a natural first step in the exploration of statistical manifolds of more complex geometry.

To conclude, let us introduce the notion of Legendre derivative (This corresponds to the C-gradient in optimal transport theory (see, e.g., [9])). For given generalized potentials

φ

and

ψ

, the corresponding Legendre derivative is the operator

D_{L}

that satisfies

D_{L} φ (ξ) = η and D_{L} ψ (η) = ξ .

(24)

The functional form for

D_{L}

is determined by the corresponding link function. For example, for the case of

C (ξ, η) = ξ \cdot η

, Equations (4) and (5) show that

D_{L}

is given by the Euclidean gradient. In contrast, for a logarithmic link function as in Equation (22), one can find that the corresponding (non-Euclidean) Legendre derivative acting on a smooth function

φ

is given by

D_{L}^{(γ)} φ = \frac{1}{1 - γ ξ \cdot D φ} D φ,

(25)

with

D

denoting the Euclidean gradient.
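Equation (25) can be checked with a short numerical sketch. It assumes the first-order condition that the Euclidean gradient of the potential equals the ξ-gradient of the link function (22) at the conjugate point, which is how the C-gradient is usually set up in optimal transport; this condition and all names below are assumptions for illustration, not statements taken from the text.

```python
def dot(u, v):
    """Euclidean pairing xi . eta."""
    return sum(a * b for a, b in zip(u, v))


def grad_xi_link(xi, eta, gamma):
    """Gradient in xi of C(xi, eta) = log(1 + gamma xi.eta) / gamma, Eq. (22)."""
    scale = 1.0 + gamma * dot(xi, eta)
    return [e / scale for e in eta]


def legendre_derivative(xi, grad_phi, gamma):
    """Deformed Legendre derivative of Eq. (25) applied to a Euclidean gradient."""
    scale = 1.0 - gamma * dot(xi, grad_phi)
    return [g / scale for g in grad_phi]


# Illustrative point with 1 + gamma xi.eta > 0 (hypothetical values).
xi = [0.3, -0.2, 0.5]
eta = [0.7, 0.1, 0.4]
gamma = 0.5

# Assumed first-order condition: D phi(xi) = d_xi C(xi, eta).
grad_phi = grad_xi_link(xi, eta, gamma)
recovered_eta = legendre_derivative(xi, grad_phi, gamma)
print(recovered_eta)
```

Under these assumptions the recovered vector should coincide with η, and for vanishing γ the operator reduces to the plain Euclidean gradient.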

4. Symplectic and Kähler Structures in Information Geometry

This section studies the realization of symplectic structures in statistical manifolds. This naturally leads towards considering the complexification of statistical manifolds, which opens a new avenue for developing insights about the Legendre transform. Complex manifolds are ‘bigger’ bundles that possess a richer structure and benefit from greater symmetry. These complex structures are quintessential to physics, being related to the quantization of the spin and coherent states [45], entanglement [46], string theory [47], and Kähler oscillators [48].

The reasoning pursued here is that by recasting manifolds as complex structures with a higher degree of symmetry, one can obtain a more detailed understanding of their geometry and their relationship with the deformed Legendre transform. To develop this idea, we first establish a parallel between statistical manifolds and phase spaces. In doing this, it is important to note that while in statistical manifolds the dual coordinates

ξ

and

η

usually refer to the same point, in phase spaces they typically refer to canonical pairs (e.g., position and momentum) and hence correspond to different dimensions. This naturally leads to the consideration of product manifolds of two times the dimensionality of the original one.

4.1. Establishing Dynamics on Phase Space

In analytical mechanics, the Legendre transform enables the derivation of the Hamiltonian formalism from the Lagrangian, a smooth function of n generalized coordinates q, velocity

\dot{q}

, and time t. By doing this, one trades n second-order equations of motion for

2 n

first-order differential equations of the form

\frac{\partial H}{\partial p_{j}} = {\dot{q}}^{j}, \frac{\partial H}{\partial q^{k}} = - {\dot{p}}_{k} .

(26)

Notice that the transformation

(q, p) \mapsto (p, - q)

preserves the form of the above equations. This symmetry is a reflection of a rich mathematical structure that provides the foundations of classical mechanics, which we introduce in the rest of this subsection.

We start by reviewing the standard method to establish dynamics over a manifold based on the Hamiltonian formulation of classical mechanics, as described, for instance, in Refs. [2,49,50]. For this, let us consider a phase space

M

that describes the possible configurations of a system of interest. More specifically, each point in

M

has the form

z = (q^{1}, . . ., q^{n}, p_{1}, . . ., p_{n})

, with

(q^{1}, . . ., q^{n}) \in R^{n}

corresponding to a configuration manifold Q, and

(p_{1}, . . ., p_{n}) \in R^{n}

corresponding to its generalized conjugate momenta. Dynamics over the phase space

M

are established by a Hamiltonian

H : M \to R

via the following equations of motion:

\dot{z} = X_{H},

(27)

where the Hamiltonian vector field is given by

X_{H} = J D^{(0)} H (z), with J : = (\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix})

(28)

and

D^{(0)}

denotes the standard gradient (see Equation (25)). In this way, dynamics are established by following the integral curves of

X_{H}

. At any point

z \in M

there is a trajectory governed by the dynamics induced by the Hamiltonian, which is unique because the equations of motion are first-order with a smooth vector field.
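The construction in Equations (27) and (28) can be sketched for a single degree of freedom. The harmonic-oscillator Hamiltonian and the finite-difference gradient below are illustrative assumptions, chosen only to show that applying J to the gradient reproduces Hamilton's equations (26).

```python
def hamiltonian(q, p):
    """Harmonic oscillator H = (q^2 + p^2) / 2 (an illustrative choice)."""
    return 0.5 * (q * q + p * p)


def gradient(func, q, p, h=1e-6):
    """Central-difference gradient (dH/dq, dH/dp)."""
    dq = (func(q + h, p) - func(q - h, p)) / (2 * h)
    dp = (func(q, p + h) - func(q, p - h)) / (2 * h)
    return dq, dp


def hamiltonian_vector_field(func, q, p):
    """Apply J = [[0, 1], [-1, 0]] to the gradient, as in Eq. (28)."""
    dq, dp = gradient(func, q, p)
    return dp, -dq


q0, p0 = 0.4, -1.3
q_dot, p_dot = hamiltonian_vector_field(hamiltonian, q0, p0)

# H is constant along the flow: dH(X_H) = 0 by antisymmetry of J.
dH_dq, dH_dp = gradient(hamiltonian, q0, p0)
energy_rate = dH_dq * q_dot + dH_dp * p_dot
print(q_dot, p_dot, energy_rate)
```

For this Hamiltonian the vector field is (p, -q), so the integral curves are circles in phase space along which H is conserved.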

Above, the role of Equation (28)—which turns the Hamiltonian into a vector field—can be re-framed in a more principled manner via symplectic geometry [51] as follows. A symplectic form

ω

is a 2-form on

M

that is closed (

d ω = 0

) and non-degenerate (

\forall v \neq 0 \exists u : ω (v, u) \neq 0

). On a symplectic manifold (i.e., a manifold equipped with a symplectic form), the flow of the Hamiltonian H can be defined as the vector field

X_{H}

that satisfies the following relationship:

- d H = ι_{X_{H}} ω,

(29)

where

ι_{X} ω = ω (X, \cdot)

is the 1-form that results from the interior contraction of

ω

. Above,

d H

is the differential of H and the sign corresponds to a convention in the definition of the symplectic form. The fact that

ω

is non-degenerate guarantees that one can always find a unique

X_{H}

that satisfies Equation (29). Additionally, the closure of the symplectic form implies locally, by the Poincaré lemma, the existence of a tautological 1-form

θ

(also known as the canonical 1-form or symplectic potential), which satisfies the condition

ω = d θ

. This coordinate-invariant expression for

ω

emphasizes its topological nature.

Symplectic manifolds belong to equivalence classes established via symplectomorphisms (i.e., diffeomorphisms that preserve the symplectic form), which correspond to canonical transformations in the context of analytical mechanics. The symplectic form allows us to determine a vector field from a smooth function up to diffeomorphisms that preserve the symplectic form, i.e.,

L_{X_{H}} ω = 0

. Furthermore, the geometry of the phase space gives an account of important properties of the underlying system. Indeed, while an unconstrained system may be described by a phase space of the form

M = R^{2 n}

, more complicated systems are usually reflected by more convoluted geometries. As a simple example, a pendulum is described by a phase space in the form of a cylinder, which has a flat intrinsic geometry but a non-trivial topology. The next subsections explore the implications of phase spaces with non-zero curvature.

4.2. Symplectic Structure under the Deformed Legendre Transform

Section 3.3 shows that, from an information-geometric perspective, divergences can be used to determine the metric and connections of a manifold. In this subsection, we show how divergences also generate a symplectic 2-form, from which many of the insights from Hamiltonian mechanics can be inherited. This, in turn, allows us to study probability distributions in phase space and discuss the flow induced by divergences. Our results will show that the symplectic 2-form induced by the divergence on the phase space and the induced Hamiltonian dynamics are different from the ones induced on the product manifold when the geometry is curved, or equivalently, when the Legendre transform has been deformed.

To start, let us introduce some terminology. We will contrast structures on the cotangent bundle of statistical manifolds with structures in the product manifold

M \times M

made of pairs of the form

(p, q)

. The product manifold is often parameterized using dual coordinates as

(ξ, η) : = (ξ (p), η (q))

(as a consequence, in this section

ξ

and

η

refer to different points in the manifold, unless it is explicitly specified to be otherwise). In addition, let us use the projection operators over the left and right elements,

π_{l} (p, q) = p

and

π_{r} (p, q) = q

, to define the sub-manifolds

M_{q} : = π_{r}^{- 1} (q) = M \times {q} ≃ M

and

M_{p} : = π_{l}^{- 1} (p) = {p} \times M ≃ M

. The diagonal of the product manifold will be denoted as

Δ \subset M \times M

, being made by pairs of the form

(p, p)

.

Divergences are smooth functions mapping

M \times M

into

R

, and we are interested in the geometrical structure that such mappings induce. To investigate this, let us consider the canonical symplectic form

ω_{p}

on

T^{*} M_{p}

, which can be expressed in terms of a local chart

(U, ξ^{k}, ν_{k})

as

ω_{p} : = d ξ^{j} \land d ν_{j},

(30)

with

ν_{k}

being the conjugate coordinate to

ξ^{k}

. Note that, thanks to Darboux’s theorem [2], such canonical pairs are guaranteed to always exist locally. Let us then recast the map presented in Equation (8) as the symplectomorphism

L_{D} : M \times M \to T^{*} M_{p}

given by

L_{D} : (ξ, η) \mapsto (ξ, ν) = (ξ, \partial_{i} D (ξ, η) d ξ^{i}) .

(31)

As shown in [52,53], this map induces—via the pull-back

L_{D}^{*} ω_{p} = ω_{D}

—the following symplectic form on

M \times M

:

\begin{matrix} L_{D}^{*} ω_{p} & = L_{D}^{*} [d ξ^{i} \land d ν_{i}] \end{matrix}

(32a)

\begin{matrix} = d ξ^{i} \land d {\partial_{i} D (ξ, η)} \end{matrix}

(32b)

\begin{matrix} = d ξ^{i} \land (\partial_{i, k} D (ξ, η) d ξ^{k} + \partial_{i}^{k^{'}} D (ξ, η) d η_{k}) \end{matrix}

(32c)

\begin{matrix} = \partial_{i}^{k^{'}} D (ξ, η) d ξ^{i} \land d η_{k}, \end{matrix}

(32d)

where the first term in (32c) vanishes because the second derivatives of the divergence commute while the wedge product is antisymmetric. Note that

\partial_{i}^{k^{'}} D (ξ, η)

reduces to the Fisher metric when evaluated on

Δ

(i.e., when

ξ

and

η

are evaluated at the same element p), but is different otherwise. Importantly, the same symplectic form on

M \times M

is obtained by pulling back the canonical symplectic form

ω_{q} : = d η_{k} \land d λ^{k}

on

T^{*} M_{q}

(where

(η, λ)

form a canonical pair) in an analogous fashion, using here the symplectomorphism

R_{D} : M \times M \to T^{*} M_{q}

given by

R_{D} : (ξ, η) \mapsto (η, λ) = (η, \partial^{k} D (ξ, η) d η_{k}) .

(33)

Now that the symplectic form given by Equation (32d) has been identified as the natural one on

M \times M

, our next step is to investigate how it is influenced by the manifold’s curvature. For this, note first that if the divergence

D

is a generalized Bregman divergence, then its associated symplectic form depends solely on the link function. In effect, a direct calculation shows that for this case

ω_{D} = \partial_{i}^{k} C (ξ, η) d ξ^{i} \land d η_{k} .

(34)

This clarifies how, although identical on the cotangent bundle

T^{*} M

, the symplectic structure induced by different divergences may differ on

M \times M

.

Rényi’s Symplectic 2-Form and Flow

While the dually flat geometry established by Bregman divergences leads to a symplectic form given by

ω_{D} = d ξ^{i} \land d η_{i}

, for

γ

-curved geometry the Rényi divergence induces the following symplectic form:

ω_{D} = \frac{1}{1 + γ ξ^{i} η_{i}} (δ_{l}^{k} - \frac{γ ξ^{k} η_{l}}{1 + γ ξ^{i} η_{i}}) d η_{k} \land d ξ^{l} .

(35)

The coefficients of this symplectic form coincide with the metric tensor in Ref. [9] (Proposition 4), this time on the product manifold

M \times M

.
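The coefficients in Equation (35) can be cross-checked against Equation (34) by differentiating the logarithmic link function (22) numerically. This is only a sketch under stated assumptions: the evaluation point, the step size, and the helper names are hypothetical, and the comparison concerns the coefficient matrix alone, not the ordering or sign convention of the wedge product.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def link(xi, eta, gamma):
    """Logarithmic link function of Eq. (22)."""
    return math.log(1.0 + gamma * dot(xi, eta)) / gamma


def mixed_derivative(xi, eta, gamma, l, k, h=1e-4):
    """Central finite difference for d^2 C / (d xi^l d eta_k)."""
    def shifted(dx, de):
        x, e = list(xi), list(eta)
        x[l] += dx
        e[k] += de
        return link(x, e, gamma)
    return (shifted(h, h) - shifted(h, -h)
            - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)


def closed_form(xi, eta, gamma, l, k):
    """Coefficient multiplying d eta_k ^ d xi^l in Eq. (35)."""
    scale = 1.0 + gamma * dot(xi, eta)
    delta = 1.0 if l == k else 0.0
    return (delta - gamma * xi[k] * eta[l] / scale) / scale


# Illustrative point (hypothetical values) with 1 + gamma xi.eta > 0.
xi = [0.3, -0.2, 0.5]
eta = [0.7, 0.1, 0.4]
gamma = 0.5

max_error = max(
    abs(mixed_derivative(xi, eta, gamma, l, k) - closed_form(xi, eta, gamma, l, k))
    for l in range(3) for k in range(3)
)
print(max_error)
```

The discrepancy should be at the level of the finite-difference error, and for very small γ the closed form approaches the identity matrix, recovering the dually flat case.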

The symplectic form exhibited in Equation (35) is closed, as can be confirmed by a direct calculation leading to

d ω_{D} = 0

. This, in turn, implies the local existence of a corresponding tautological 1-form via the Poincaré lemma, as explained in the previous subsection. Similar to the derivation that led to Equation (35), we define the canonical 1-form

θ_{p} = ν_{i} d ξ^{i}

on

T^{*} M_{p}

and evaluate its pull-back onto

M \times M

, yielding

θ = \frac{1}{2} \frac{η_{i} d ξ^{i} - ξ^{i} d η_{i}}{1 + γ ξ^{k} η_{k}} .

(36)

This expression, hence, characterizes the 1-form emerging from connections that describe the projective-flat geometry induced by Rényi’s divergence.

As a last step, let us leverage the symplectic form

ω_{D}

to evaluate the action of the smooth function

D_{γ}

on the product manifold

M \times M

. This function is of particular interest as it generates integral curves of constant

D

, and hence the induced flow is closed within the diagonal

Δ ≃ M

. For this purpose, let us denote as

X_{γ} = X_{γ}^{i} \partial_{ξ^{i}} + X_{γ j} \partial_{η_{j}}

the vector field generated by the observable

D_{γ}

and the corresponding symplectic form. We are interested in the vector fields that preserve the symplectic form

ω_{D}

, i.e., the vector field

X_{γ}

that satisfies

L_{X_{γ}} ω_{D} = 0

, where

L_{X_{γ}} ω_{D}

denotes the Lie derivative of

ω_{D}

in the direction of

X_{γ}

. Then, using Cartan’s magic formula one can find that

L_{X_{γ}} ω_{D} = ι_{X_{γ}} d ω_{D} + d (ι_{X_{γ}} ω_{D}) = d (ι_{X_{γ}} ω_{D}),

(37)

where the last equality is a consequence of the fact that

ω_{D}

is closed. Therefore,

L_{X_{γ}} ω_{D}

vanishes only if

X_{γ}

is Hamiltonian (29), i.e., if

X_{γ}

satisfies

ι_{X_{γ}} ω_{D} + d D_{γ} = 0

. One can then determine the Rényi vector field via explicit evaluation of the interior product as follows:

\begin{matrix} - d D_{γ} & = (ι_{X_{γ}} g_{k}^{l} d η_{l}) \land d ξ^{k} - g_{k}^{l} d η_{l} \land (ι_{X_{γ}} d ξ^{k}) \end{matrix}

(38a)

\begin{matrix} = g_{k}^{l} (X_{γ l} d ξ^{k} - X_{γ}^{k} d η_{l}), \end{matrix}

(38b)

which results in a Hamiltonian flow generated by Rényi’s divergences of the form

X_{γ k} = - g_{k}^{a} \partial_{ξ^{a}} D_{γ}, X_{γ}^{k} = g_{a}^{k} \partial_{η^{a}} D_{γ} .

(39)

Then, the corresponding Rényi vector field can be found to be equal to

\begin{matrix} X_{γ} & = g_{a}^{k} \partial_{η^{a}} D_{γ} \partial_{ξ^{k}} - g_{k}^{a} \partial_{ξ^{a}} D_{γ} \partial_{η_{k}} \end{matrix}

(40a)

\begin{matrix} = {[η (p) - D_{L}^{(γ)} ψ (q)]}_{k} \partial_{ξ^{k}} - {[ξ (p) - D_{L}^{(γ)} φ (q)]}^{k} \partial_{η_{k}}, \end{matrix}

(40b)

where

D_{L}^{(γ)}

is the Legendre derivative operator introduced in Section 3.4.

As mentioned above, this Rényi flow is closed within the diagonal

Δ

. Moreover, the above result implies that the flows on the diagonal follow the geodesic with respect to the primal and dual connections, which naturally satisfy Equation (24). In this way, we gain a new understanding of what deforming the exponential family implies. The square brackets in Equation (40) imply that the set of points flowing along the integral curves of

X_{γ}

correspond to enforcing the dual coordinate pair as the Legendre derivative of the potential at the diagonal. Hence, the Bregman limit (i.e.,

γ \to 0

) leads to the dual parameterization of exponential families from regular Legendre transformation, whereas finite

γ \neq 0

leads to the deformed family of distributions obtained from Rényi’s divergence ([9], Section 4), which would describe the sets of points flowing along the integral curves of

X_{γ}

and external points diverging away from it.

4.3. Complexification of Statistical Manifolds

This subsection discusses some fundamental aspects of complex geometry, followed by the complexification of statistical manifolds. Then, the next subsection focuses on the complex structure induced by the Rényi divergence. For a more extensive treatment of the properties of complex manifolds, we refer the reader to Refs. [54,55,56].

A complex manifold can be depicted as a topological space that locally looks like

C^{n}

. One way to try building a complex manifold would be to consider a

2 n

-dimensional real manifold, and then arrange a set of coordinates

{x_{c}^{k}}

into complex combinations such as

x_{c}^{2 k - 1} + i x_{c}^{2 k}

. Unfortunately, such an arrangement is not only arbitrary, but also, more importantly, it is coordinate-dependent. In effect, additional structure on the manifold is required for it to be ‘complexifiable’.

One way to build a complex manifold is via a tensor field

J_{a}^{b}

of real components satisfying

J^{2} = - 1

, which provides a linear endomorphism

J : T_{p} M \to T_{p} M

. Notably, the diagonalization of such a tensor cannot be accomplished in a vector space of real values; hence, the coefficients of vectors in

T_{p} M

must be allowed to be complex-valued (i.e.,

T_{p}^{C} M = T_{p} M \otimes C

). By arranging

2 n

-local coordinates into complex coordinates

x^{k} + i y^{k}

, e.g., via

x^{k} = x_{c}^{2 k - 1}, y^{k} = x_{c}^{2 k}

, one can express J in complex coordinates as

J = i d z^{a} \otimes \frac{\partial}{\partial z^{a}} - i d z^{\bar{a}} \otimes \frac{\partial}{\partial z^{\bar{a}}} .

(41)

Hereon, a and

\bar{a}

are indices within

{1, \dots, n}

, with the bar being used to distinguish between holomorphic and anti-holomorphic components. The manifold

M

together with the tensor J is known as an almost complex manifold, with J being its almost complex structure. With the aid of J, the complexified

T_{p} M

can now be decomposed into holomorphic and anti-holomorphic parts via projection operators given by

{[P_{(\pm)}]}_{a}^{b} = \frac{1}{2} (δ_{a}^{b} \mp i J_{a}^{b})

. These projection operators can be used to decompose any k-form into

(p, q)

-forms with

p + q = k

.
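The decomposition into holomorphic and anti-holomorphic parts can be illustrated in the smallest case of two real dimensions. The sketch below assumes the convention in which the projectors are one half of the identity minus or plus i times J, so that they are idempotent and select the eigenspaces of J with eigenvalues +i and -i; the matrix representation of J and the helper names are illustrative choices.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two square matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
J = [[0, 1], [-1, 0]]  # real tensor with J^2 = -1


def projector(sign):
    """Assumed convention: P_(+/-) = (1 -/+ i J) / 2."""
    return [[0.5 * (IDENTITY[i][j] - sign * 1j * J[i][j]) for j in range(2)]
            for i in range(2)]


P_plus, P_minus = projector(+1), projector(-1)

print(close(matmul(J, J), [[-1, 0], [0, -1]]))     # J squares to minus identity
print(close(matmul(P_plus, P_plus), P_plus))       # idempotent
print(close(matmul(P_plus, P_minus), ZERO))        # mutually orthogonal
```

Because J has no real eigenvectors, the projectors necessarily have complex entries, which is the reason the tangent space has to be complexified before it can be split.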

As suggested above, every complex manifold is also a real manifold but the converse does not always hold. A necessary and sufficient condition on J to allow a real manifold to be a complex one is given by

N_{a b}^{c} = 0

, where

N_{a b}^{c}

stands for the Nijenhuis tensor given by (note that the connections appearing from the covariant derivatives cancel out, which is why it is often found written in terms of partial derivatives in spite of being a tensor)

N_{a b}^{c} : = 2 (J_{a}^{d} \nabla_{[d} J_{b]}^{c} - J_{b}^{d} \nabla_{[d} J_{a]}^{c}),

(42)

with square brackets denoting the antisymmetrization of indices.

Up to this point, the complex manifold

(M, J)

has not been equipped with a metric; in fact, a metric satisfying both compatibility conditions below (i.e., a Kähler metric) may not exist (e.g., on Hopf manifolds). When such a metric does exist, it satisfies the following compatibility conditions:

g_{μ ν} J_{ρ}^{μ} J_{σ}^{ν} = g_{ρ σ} and \nabla_{μ} J_{σ}^{ν} = 0 .

(43)

The first condition above implies that the pure holomorphic and anti-holomorphic components of the metric vanish; hence,

d s^{2} = g_{μ ν} d x^{μ} \otimes d x^{ν} = g_{a \bar{b}} d z^{a} \otimes d {\bar{z}}^{\bar{b}}

is Hermitian. The second condition enforces the vanishing of the Nijenhuis tensor (42), not only guaranteeing complexification but also implying that the Kähler 2-form given by

k = \frac{1}{2} g_{μ ν} J_{ρ}^{μ} d x^{ρ} \land d x^{ν} = i g_{a \bar{b}} d z^{a} \land d {\bar{z}}^{\bar{b}}

(44)

is closed and thus serves as the manifold’s symplectic form. In components, the closure of the form in Equation (44) means that

\partial_{a} g_{b \bar{c}} = \partial_{b} g_{a \bar{c}}

and

\partial_{\bar{b}} g_{a \bar{c}} = \partial_{\bar{c}} g_{a \bar{b}}

. Analogously to Equation (5), these expressions can be locally integrated, revealing the metric

g_{a \bar{b}} = \partial_{a} \partial_{\bar{b}} K (z, \bar{z}),

(45)

with

K

being a real-valued smooth function known as the Kähler potential. This potential is not unique, as it is only determined up to the addition of a holomorphic and an anti-holomorphic function:

K (z, \bar{z}) \to K (z, \bar{z}) + U (z) + \bar{U} (\bar{z}) .

(46)

Furthermore,

K

may not be globally defined (on a compact manifold, if it were, the form

ω

would be exact, so by Stokes’ theorem the integral of the induced volume form would vanish, contradicting the non-degeneracy of the metric). In this way, both the Riemannian metric and the symplectic form are determined by

K

, as

ω = k = \frac{i}{2} \partial \bar{\partial} K

with

{\partial, \bar{\partial}}

denoting the Dolbeault operators

\partial = d z^{a} \land \partial_{a}

and

\bar{\partial} = d {\bar{z}}^{\bar{a}} \land \partial_{\bar{a}}

. The similarities between these Kähler expressions and the ones in Section 3.4 are no coincidence, as

K

itself must be convex. These similarities have, in fact, been the catalyst for the investigation of more intimate relations between the space of Kähler metrics and convexity [57] and various applications in the context of optimal transport [58].

In statistical manifolds the fundamental object is its divergence

D

, and therefore the constraints on the metric are ultimately enforced on

D

. Hence, the conditions for complexification of a manifold translate into two conditions over the corresponding divergence [59]:

(1): $\partial_{i, j^{'}} D = \partial_{j, i^{'}} D$ on $M \times M$ ;
(2): $\partial_{i, j} D + \partial_{i^{'}, j^{'}} D = κ \partial_{i, j^{'}} D$ for some $κ \in R$ .

Above, the primed indices denote differentiation with respect to

y \in M_{q}

(as opposed to regular derivatives with respect to

x \in M_{p}

). Although the first condition above is trivially satisfied when evaluated at the diagonal (as shown in Equation (11)), it is not automatic for it to hold on the whole

M \times M

manifold. Both conditions arise from the construction of an invariant arc element

d s^{2}

from the symmetric and antisymmetric parts, given by

\begin{matrix} d s^{2} & = g_{D} - i ω_{D} \end{matrix}

(47)

\begin{matrix} = \partial_{i, j^{'}} D [x; y] (d x^{i} \otimes d x^{j} + d y^{i} \otimes d y^{j}) + i \partial_{i, j^{'}} D [x; y] (d x^{i} \otimes d y^{j} - d y^{i} \otimes d x^{j}), \end{matrix}

(48)

where

g_{D}

and

ω_{D}

denote the metric and symplectic form induced by the divergence

D

on

M \times M

. Note that

ω_{D}

is equivalent to the one derived in Equation (32d), while the components of

g_{D}

are expressed in Equation (45). The second condition for the complexification of a statistical manifold is motivated by the fact that, if one is interested in expressing

d s^{2}

as

\partial \bar{\partial} D

, then the condition (2) should be satisfied on

M \times M

for

κ \in R

.

Importantly, divergences that can be expressed as in Equation (16) for a given convex function

Φ

satisfy the conditions discussed above, and hence the geometries they induce are compatible with a complex structure [58,59]. These divergences induce a geometry of constant scalar curvature given by

κ = α_{-} α_{+}

with

α_{+} = - γ

and

α_{-} = 1 + γ

. Furthermore,

Φ (α_{+} x + α_{-} y)

serves as the local Kähler potential of the manifold. It is worth noting that

γ \to 0

results in a vanishing

K

, in which case the Kähler structure cannot be defined. Indeed,

γ = 0

is an excluded value for these expressions, and its limit should be worked out prior to complexification, as discussed in Ref. [59].

4.4. Complex Rényi Geometry under the Deformed Legendre Transform

Let us now exploit the general results presented in the previous section to deepen our understanding of the geometry induced by the Rényi divergence on statistical manifolds. The Rényi divergence

D_{γ}

belongs to the family of divergences that can be expressed as in Equation (16) using

Φ (x)

as given by

Φ (x) = log \sum_{s \in S} e^{x (s)}, with x (s) = : log p (s) .

(49)

This means that the geometry that arises from the Rényi divergence is amenable to complexification. Furthermore, when evaluated on arguments that correspond to probability distributions (i.e.,

x^{a} = log p^{a}

and

y^{a} = log q^{a}

) then the first two terms in Equation (16) vanish, and therefore the Rényi divergence itself serves as the Kähler potential.

Let us now show that the two conditions for complexification discussed in the previous subsection are satisfied by product manifolds

M \times M

endowed with a geometry induced by Rényi’s divergence. For this, we adopt complex coordinates

w^{a} = x^{a} + i y^{a} \in C

with

x^{a} = log p^{a}

and

y^{a} = log q^{a}

for

p, q \in M

. Using these coordinates, one finds that

\begin{matrix} - \frac{1}{κ} D_{γ} (x, y) & = log \sum_{a = 1}^{n} exp (\bar{Γ} w^{a} + Γ {\bar{w}}^{a}) \end{matrix}

(50a)

\begin{matrix} = log z_{a} {\bar{z}}^{a}, \end{matrix}

(50b)

where we are using the shorthand notations

Γ = \frac{1}{2} (α_{-} + i α_{+})

and

z^{a} = exp (\bar{Γ} w^{a})

. In this manner,

Φ (α_{+} x + α_{-} y)

(or, equivalently,

D_{γ} (x, y)

) can be identified as the Kähler potential for the product manifold.

The resemblance of the induced symplectic form in Equation (35) and the 1-form in Equation (36) of the previous subsection to the well-known Fubini–Study metric and its connection is suggestive of the complex-projective spaces

C P^{n}

(for an overview on

C P^{n}

spaces, please refer to Refs. [54,55,56]). Unfortunately, complexification of the local charts does not preserve the functional form of the symplectic form given by Equation (35), nor the canonical 1-form given by Equation (36). Nevertheless, special circumstances—such as

γ = 1

and a restriction to the diagonal

Δ

—do lead to

C P^{n}

upon complexification. Disregarding the pure holomorphic and anti-holomorphic functions of the divergence, the link function of the deformed Legendre transform can be directly read as the Kähler potential as follows:

K (z, \bar{z}) = C (z, \bar{z}) = log (1 + z_{a} {\bar{z}}^{a}),

(51)

hence generating the Fubini–Study metric given by

g_{a \bar{b}} = \frac{1}{1 + z_{c} {\bar{z}}^{c}} (δ_{a \bar{b}} - \frac{{\bar{z}}_{a} z_{b}}{1 + z_{c} {\bar{z}}^{c}}) .

(52)
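For one complex dimension, the step from the Kähler potential (51) to the metric (52) via Equation (45) can be verified numerically. The sketch below uses the identity that the mixed derivative with respect to z and its conjugate equals one quarter of the Laplacian in the real coordinates x and y; the evaluation point and step size are arbitrary illustrative choices.

```python
import math


def kahler_potential(x, y):
    """K = log(1 + |z|^2) with z = x + i y, Eq. (51) for n = 1."""
    return math.log(1.0 + x * x + y * y)


def metric_from_potential(x, y, h=1e-3):
    """d_z d_zbar K computed as one quarter of the finite-difference Laplacian."""
    laplacian = (kahler_potential(x + h, y) + kahler_potential(x - h, y)
                 + kahler_potential(x, y + h) + kahler_potential(x, y - h)
                 - 4.0 * kahler_potential(x, y)) / (h * h)
    return laplacian / 4.0


def fubini_study(x, y):
    """Closed form of Eq. (52) for n = 1, equal to 1 / (1 + |z|^2)^2."""
    scale = 1.0 + x * x + y * y
    return (1.0 - (x * x + y * y) / scale) / scale


x0, y0 = 0.3, -0.8  # hypothetical evaluation point
numeric = metric_from_potential(x0, y0)
exact = fubini_study(x0, y0)
print(numeric, exact)
```

The two values should agree up to the discretization error, and the metric decays like the inverse fourth power of the modulus far from the origin, as expected for a sphere seen through stereographic projection.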

The case of complex dimension

n = {dim}_{C} M = 1

(two real dimensions), that is,

M = C P^{1} \subset C^{2}

, is of particular interest to physical systems. Indeed, from a group-theoretic perspective, this manifold corresponding to the coset space

S U (2) / U (1)

(isomorphic to the Riemann sphere

S^{2} ≃ C P^{1}

) is crucial for the formulation of spin coherent states [45,60] and the geometric quantization of the spin [49]. In addition,

C P^{1}

describes pure quantum states whose direct product enables a nice geometric formulation of many phenomena of interest, including entangled systems [46].

The connection on this manifold corresponds to the canonical 1-form, which is now determined by its Kähler potential

A = \frac{i}{2} (\bar{\partial} - \partial) K = \frac{i}{2} \frac{z_{a} d {\bar{z}}^{a} - {\bar{z}}_{a} d z^{a}}{1 + z_{a} {\bar{z}}^{a}}

(53)

via the Dolbeault operators (here the index takes only one entry

a = 1

, with trivial generalization to

C P^{n}

). Note that this gauge field is consistent with the expression obtained for the 1-form in Equation (36).

Let us now show how a quantization of the 2-sphere restricts the allowed values for the Rényi parameter

γ

. As the Poincaré lemma tells us, every closed form is locally exact, and hence the existence of closed forms failing to be exact reflects some non-trivial aspect of the topology of the manifold. This feature is captured by cohomology classes

H^{k} (M, R)

, whose members are closed yet globally not exact k-forms. In this sense, the Kähler form belongs to

H^{2} (M, R)

. The single-valuedness of points on the manifold would require the

ω_{D}

to belong to an integral cohomology class

H^{2} (M, Z)

. Therefore, its symplectic two-form must be an integer multiple of

ω_{D}

. Hence, the covariant derivative is

\nabla_{a} = \partial_{a} - i k A_{a}

with

k \in Z

(not to be confused with the manifold’s complex dimension n), and the same holds for its anti-holomorphic counterpart. The holomorphic polarization (see Appendix A) imposes the condition

\nabla_{\bar{a}} ψ = 0

for the wave function

ψ

, a function whose squared modulus gives the probability density, closely resembling wave functions in quantum mechanics. This results in

(\partial_{\bar{z}} + \frac{k}{2} \frac{z^{a}}{1 + z_{a} {\bar{z}}^{a}}) ψ = 0 .

(54)

This implicit equation is solved by physical solutions

ψ_{phys}

of the form

ψ_{phys} = exp (- \frac{k}{2} log (1 + z_{a} {\bar{z}}^{a})) f (z),

(55)

with

f (z)

being a holomorphic function. The resulting probability density

| ψ_{phys} |^{2}

is given by

P (z) = \frac{{| f (z) |}^{2}}{{(1 + z_{a} {\bar{z}}^{a})}^{k}} .

(56)

The holomorphic function

f (z)

can be expanded on the basis

{1, z, z^{2}, . . ., z^{k}}

, as higher powers would cause

P (z)

to diverge; hence, the physical states

ψ_{phys}

span a finite-dimensional Hilbert space (of dimension k + 1) over the 2-sphere.
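The truncation of the basis can be illustrated by evaluating the density of Equation (56) for monomials at a large modulus. This is a minimal sketch: the choice k = 3 and the sample radius are hypothetical, and boundedness of P at large modulus is used here as a simple proxy for the admissibility criterion stated in the text.

```python
def density(radius, power, k):
    """P(z) of Eq. (56) for f(z) = z**power, evaluated at |z| = radius."""
    return radius ** (2 * power) / (1.0 + radius * radius) ** k


k = 3            # illustrative integer level
far_out = 1.0e3  # sample point far from the origin

for power in range(k + 2):
    print(power, density(far_out, power, k))
```

Powers up to k give values that stay bounded (the highest admissible power tends to one), whereas the first excluded power already grows with the square of the radius.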

Just as holomorphic polarization for

γ = 0

results in exponential family distributions (Appendix A), one recovers the Rényi maximum entropy distributions as a polarization of the manifold for other values of

γ

. Moreover, by identifying

γ = \frac{1}{k}

, one realizes (keeping the sign of

γ

) that

k \in Z_{+}

introduces the restriction

γ \in (0, 1]

, which corresponds to

α \in (- 1, 1]

and reflects a positive curvature, as discussed in Ref. [10]. Although ruled out by the polarization, it is interesting to note that considering

γ \notin (0, 1]

would result in the manifold having hyperbolic topology and becoming non-compact, hence not being susceptible to complexification. These results establish

γ \in (0, 1]

as values of special physical significance:

γ = 1

for spin coherent states [45], worldline formalism [61], Kähler oscillators [48], and entanglement [46], and other values in

γ \in (0, 1]

for systems described through the geometric quantization framework. Notably, this range does not include

γ = 0

, which corresponds to conventional dually flat geometry and the Shannon entropy.
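The parameter ranges quoted above follow directly from the identification of γ with the inverse of k together with the relation between α and γ given after Equation (21). A small tabulation, using exact fractions and a hypothetical helper name, makes this explicit.

```python
from fractions import Fraction


def renyi_parameters(k):
    """Return (gamma, alpha) for a positive integer k, using gamma = 1/k
    and alpha = -1 + 2*gamma as stated in the text."""
    gamma = Fraction(1, k)
    alpha = -1 + 2 * gamma
    return gamma, alpha


table = [renyi_parameters(k) for k in range(1, 6)]
for k, (gamma, alpha) in enumerate(table, start=1):
    print(f"k={k}: gamma={gamma}, alpha={alpha}")
```

The case k = 1 sits at the upper end of both intervals, while larger k approach, but never reach, the dually flat value γ = 0.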

5. Conclusions

The Legendre transform, a fundamental piece of classic and contemporary physics, has a direct but non-trivial correspondence with the dually flat geometry of statistical manifolds induced by Shannon’s entropy and the Kullback–Leibler divergence. This paper explores how deformations of the Legendre transform induce a departure from this regime and have multiple consequences for symplectic geometry and complexification. Taken together, these results provide some first steps towards a novel, rigorous, and encompassing understanding of physical systems that are not well-described by classic information-theoretic quantities.

The role of the Legendre transform in analytical mechanics differs from that in information geometry; in the latter, dual coordinates refer to different descriptions of the same point, whereas in the former, they refer to an isomorphism between the tangent and cotangent bundles. In flat geometry the symplectic form of the cotangent bundle is equivalent to a canonical area form at the product manifold. In contrast, our results show that this equivalence is broken if the manifold is curved. Interestingly, this implies that a deformation of the regular Legendre transform results in the failure of the natural coordinates to form a canonical pair. Furthermore, an analysis of the deformed symplectic form and flow that arises in curved manifolds reveals a new understanding of the family of maximum Rényi entropy distributions, which are found to form sets of points flowing along the integral curves of the flow.

The departure of the symplectic form of the product manifold from the cotangent bundle provides a promising lead to study coupled physical systems, with non-canonical coordinates—like the pair induced by the Rényi geometry—being subjects of special interest. For instance, there have been studies on the consequences of deformations in the symplectic form in field theory [62] and in

C P^{n}

Kähler oscillators, where deformations to the symplectic structure via a magnetic field are explored [48]. Other related phenomena have been studied in Fermi liquids under an external magnetic field, where the magnetic field couples to Berry’s curvature, deforming the symplectic form. Such deformations have been shown to have strong consequences for observables, as the invariant phase volume is modified via a topological invariant [63,64]. An interesting avenue for future research is to investigate if there are divergences that can recapitulate these deformations, providing a mathematical scaffolding for the study of such systems.

In this work we have established a broad range of nonzero

γ

values relevant from more than just a mathematical perspective. Both symplectic topology and Kähler manifolds are sensitive to the topology rather than local changes in geometry. Furthermore, they are sensitive to the physical systems to which they now connect. In particular, our results show that

γ = 1

corresponds to a special case that is associated with the

C P^{1}

manifolds relevant across various fields such as coherent states [45], worldline formalism [61], Kähler oscillators [48] and entanglement [46], to name a few. Via geometric quantization methods, our results show that holomorphic polarization leads to

γ \in (0, 1]

. This reveals a further array of values of interest beyond the value

γ = 0

that characterizes the conventional dually flat Shannon systems.

The results presented here establish a first step in uncovering the consequences that the relationship between generalized Legendre transforms and curved statistical manifolds has for physical systems. We hope that this investigation may foster future work on these important implications, which may reveal other hidden threads connecting seemingly dissimilar approaches, such as the one revealed here relating non-Shannon entropies and non-canonical coordinates. Such investigations may lead towards a principled and unified understanding of physical systems that are not well-described by traditional approaches, providing solid foundations to support and guide some of today’s effective but ad hoc procedures of analysis.

Author Contributions

Conceptualization, P.A.M. and F.E.R.; Investigation, P.A.M. and J.K.; Writing—original draft, P.A.M., J.K. and F.E.R.; Writing—review & editing, F.E.R. All authors have read and agreed to the published version of the manuscript.

Funding

Open Access Funding by the Austrian Science Fund (FWF).

Acknowledgments

P.A.M. acknowledges support by JSPS KAKENHI Grant Number 23K168550001. J.K. acknowledges support by the Austrian Science Fund (FWF) project No. P 34994. F.R. was supported by the Fellowship Programme of the Institute of Cultural and Creative Industries of the University of Kent.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Complex Polarizations

This appendix illustrates the method of holomorphic polarization, which establishes an intimate relation between link functions and natural families. For a given Kähler manifold one can choose a polarization. A holomorphic polarization has the consequence that physical states are represented as holomorphic functions, thereby generalizing the Bargmann–Segal (Fock) spaces that are relevant to coherent states. The complex polarization is a condition determined by

$$\nabla_{\bar{a}} \psi = \left(\partial_{\bar{a}} + \frac{1}{2} \partial_{\bar{a}} K(z, \bar{z})\right) \psi = 0, \tag{A1}$$

where the connection is determined by the Kähler potential over the manifold. This polarization implies that the commutator $[\nabla_{\bar{a}}, \nabla_{\bar{b}}] = 0$, and hence the system described in (A1) is integrable. Its general solution is given by

$$\psi_{phys} = \exp\left[-\frac{1}{2} K(z, \bar{z})\right] \phi(z). \tag{A2}$$
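As a quick consistency check, which is not part of the original text, one can verify directly that (A2) solves (A1) and that the commutator vanishes. The sketch below assumes only that $K$ is smooth, so that its mixed partial derivatives commute, and that $\phi$ is holomorphic, i.e., $\partial_{\bar{a}} \phi = 0$.

```latex
\begin{align*}
\nabla_{\bar{a}} \psi_{phys}
  &= \partial_{\bar{a}}\!\left(e^{-K/2}\,\phi\right)
     + \tfrac{1}{2}\,(\partial_{\bar{a}} K)\, e^{-K/2}\,\phi \\
  &= -\tfrac{1}{2}\,(\partial_{\bar{a}} K)\, e^{-K/2}\,\phi
     + e^{-K/2}\,\partial_{\bar{a}}\phi
     + \tfrac{1}{2}\,(\partial_{\bar{a}} K)\, e^{-K/2}\,\phi \\
  &= e^{-K/2}\,\partial_{\bar{a}}\phi = 0, \\[4pt]
[\nabla_{\bar{a}}, \nabla_{\bar{b}}]\,\psi
  &= \tfrac{1}{2}\left(\partial_{\bar{a}}\partial_{\bar{b}} K
     - \partial_{\bar{b}}\partial_{\bar{a}} K\right)\psi = 0 .
\end{align*}
```

The first computation shows that the two terms involving $\partial_{\bar{a}} K$ cancel, so the polarization condition reduces to holomorphicity of $\phi$. The second shows that integrability follows from the symmetry of the second derivatives of $K$.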

In the context of statistical manifolds, $K(z, \bar{z})$ corresponds to a link function $C(z, \bar{z})$. Therefore, Equation (A2) corresponds to a natural family of distributions, e.g., the flat geometry $\mathbb{C}^{n}$ is described by $C(z, \bar{z}) = z^{a} \bar{z}_{\bar{a}}$, which leads to the exponential family, whereas a link function of the form of Equation (51) yields Rényi’s natural family. The resulting physical Hilbert space is

$$H_{phys} = \left\{ \phi(z) \;\middle|\; \int_{M} |\phi|^{2} \, e^{-C(z, \bar{z})} \, \omega^{n} < \infty \right\}, \tag{A3}$$

where $\omega^{n}$ denotes the manifold’s volume form. In other words, one considers square-integrable global sections that are covariantly constant along $\nabla_{\bar{a}}$.
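To make (A3) concrete, the following worked example (an illustration added here, not taken from the original text) evaluates the norm of monomials in the flat one-dimensional case $C(z, \bar{z}) = z \bar{z}$. It assumes the volume form is normalized as $\omega = \frac{i}{2}\, dz \wedge d\bar{z} = dx \wedge dy$; a different normalization only rescales the result by a constant.

```latex
\begin{align*}
\int_{\mathbb{C}} |z^{n}|^{2}\, e^{-|z|^{2}}\, dx\, dy
  &= \int_{0}^{2\pi}\! d\theta \int_{0}^{\infty} r^{2n}\, e^{-r^{2}}\, r\, dr
   && (z = r e^{i\theta}) \\
  &= 2\pi \cdot \tfrac{1}{2} \int_{0}^{\infty} u^{n}\, e^{-u}\, du
   && (u = r^{2}) \\
  &= \pi\, n! \;<\; \infty .
\end{align*}
```

Under this assumption every monomial $z^{n}$ belongs to $H_{phys}$, and the angular integral of $z^{n} \bar{z}^{m}$ vanishes for $n \neq m$, so the functions $z^{n} / \sqrt{\pi\, n!}$ are orthonormal. This is the familiar Gaussian-weighted space of holomorphic functions that the appendix refers to as the Bargmann–Segal (Fock) space.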

References

1. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1997; Volume 11.
2. McDuff, D.; Salamon, D. Introduction to Symplectic Topology; Oxford University Press: Oxford, UK, 2017; Volume 27.
3. Jackson, D.M.; Kempf, A.; Morales, A.H. A robust generalization of the Legendre transform for QFT. J. Phys. A Math. Theor. 2017, 50, 225201.
4. Krupková, O.; Smetanová, D. Legendre transformation for regularizable Lagrangians in field theory. Lett. Math. Phys. 2001, 58, 189–204.
5. Amari, S.I. Information Geometry and Its Applications; Springer: Berlin/Heidelberg, Germany, 2016; Volume 194.
6. Amari, S.I. Information geometry on hierarchy of probability distributions. IEEE Trans. Inf. Theory 2001, 47, 1701–1711.
7. Ohara, A. Geometric study for the Legendre duality of generalized entropies and its application to the porous medium equation. Eur. Phys. J. B 2009, 70, 15–28.
8. Scarfone, A.M.; Matsuzoe, H.; Wada, T. Information geometry of κ-exponential families: Dually-flat, Hessian and Legendre structures. Entropy 2018, 20, 436.
9. Wong, T.K.L. Logarithmic divergences from optimal transport and Rényi geometry. Inf. Geom. 2018, 1, 39–78.
10. Morales, P.A.; Rosas, F.E. Generalization of the maximum entropy principle for curved statistical manifolds. Phys. Rev. Res. 2021, 3, 033216.
11. Wong, T.K.L.; Zhang, J. Tsallis and Rényi Deformations Linked via a New λ-Duality. IEEE Trans. Inf. Theory 2022, 68, 5353–5373.
12. Stéphan, J.M.; Inglis, S.; Fendley, P.; Melko, R.G. Geometric mutual information at classical critical points. Phys. Rev. Lett. 2014, 112, 127204.
13. Stéphan, J.M. Shannon and Rényi mutual information in quantum critical spin chains. Phys. Rev. B 2014, 90, 045424.
14. Dong, X. The Gravity Dual of Renyi Entropy. Nat. Commun. 2016, 7, 12472.
15. Barrella, T.; Dong, X.; Hartnoll, S.A.; Martin, V.L. Holographic entanglement beyond classical gravity. J. High Energy Phys. 2013, 9, 109.
16. Jizba, P.; Korbel, J. Maximum entropy principle in statistical inference: Case for non-Shannonian entropies. Phys. Rev. Lett. 2019, 122, 120601.
17. Iaconis, J.; Inglis, S.; Kallin, A.B.; Melko, R.G. Detecting classical phase transitions with Renyi mutual information. Phys. Rev. B 2013, 87, 195134.
18. Zaletel, M.P.; Bardarson, J.H.; Moore, J.E. Logarithmic Terms in Entanglement Entropies of 2D Quantum Critical Points and Shannon Entropies of Spin Chains. Phys. Rev. Lett. 2011, 107, 020402.
19. Jizba, P.; Arimitsu, T. The world according to Rényi: Thermodynamics of multifractal systems. Ann. Phys. 2004, 312, 17–59.
20. Jizba, P.; Arimitsu, T. Observability of Rényi’s Entropy. Phys. Rev. E 2004, 69, 026128.
21. Morales, P.A.; Korbel, J.; Rosas, F.E. Thermodynamics of exponential Kolmogorov-Nagumo averages. arXiv 2023, arXiv:2302.06959.
22. Zia, R.K.; Redish, E.F.; McKay, S.R. Making sense of the Legendre transform. Am. J. Phys. 2009, 77, 614–622.
23. Boyd, S.P.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
24. Villani, C. Optimal Transport: Old and New; Springer: Berlin/Heidelberg, Germany, 2009; Volume 338.
25. Amari, S.I. Information geometry. Jpn. J. Math. 2021, 16, 1–48.
26. Vitagliano, V.; Sotiriou, T.P.; Liberati, S. The dynamics of metric-affine gravity. Ann. Phys. 2011, 326, 1259–1273; Erratum in Ann. Phys. 2013, 329, 186–187.
27. Vitagliano, V. The role of nonmetricity in metric-affine theories of gravity. Class. Quant. Grav. 2014, 31, 045006.
28. Amari, S.I.; Ikeda, S.; Shimokawa, H. Information geometry of α-projection in mean-field approximation. In Recent Developments of Mean Field Approximation; Opper, M., Saad, D., Eds.; MIT Press: Cambridge, MA, USA, 2000.
29. Amari, S.I.; Cichocki, A. Information geometry of divergence functions. Bull. Pol. Acad. Sci. Tech. Sci. 2010, 58, 183–195.
30. Eguchi, S. Second order efficiency of minimum contrast estimators in a curved exponential family. Ann. Stat. 1983, 11, 793–803.
31. Matumoto, T. Any statistical manifold has a contrast function—On the C3-functions taking the minimum at the diagonal of the product manifold. Hiroshima Math. J. 1993, 23, 327–332.
32. Ay, N.; Amari, S.I. A novel approach to canonical divergences within information geometry. Entropy 2015, 17, 8111–8129.
33. Liese, F.; Vajda, I. On Divergences and Informations in Statistics and Information Theory. IEEE Trans. Inf. Theory 2006, 52, 4394–4412.
34. Amari, S.I. α-Divergence is Unique, Belonging to Both f-Divergence and Bregman Divergence Classes. IEEE Trans. Inf. Theory 2009, 55, 4925–4931.
35. Chentsov, N. Statistical Decision Rules and Optimal Inference; Translations of Mathematical Monographs; American Mathematical Society: Providence, RI, USA, 1982.
36. Ay, N.; Jost, J.; Vân Lê, H.; Schwachhöfer, L. Information geometry and sufficient statistics. Probab. Theory Relat. Fields 2015, 162, 327–364.
37. Vân Lê, H. The uniqueness of the Fisher metric as information metric. Ann. Inst. Stat. Math. 2017, 69, 879–896.
38. Dowty, J.G. Chentsov’s theorem for exponential families. Inf. Geom. 2018, 1, 117–135.
39. Cencov, N.N. Statistical Decision Rules and Optimal Inference; Number 53; American Mathematical Soc.: Providence, RI, USA, 2000.
40. Amari, S.I. Differential geometry of curved exponential families-curvatures and information loss. Ann. Stat. 1982, 10, 357–385.
41. Zhang, J. Divergence function, duality, and convex analysis. Neural Comput. 2004, 16, 159–195.
42. Pal, S.; Wong, T.K.L. Exponentially concave functions and a new information geometry. Ann. Probab. 2018, 46, 1070–1113.
43. Rényi, A. Selected Papers of Alfréd Rényi; Number 2 in Selected Papers of Alfréd Rényi, Akadémiai Kiadó; JSTOR: New York, NJ, USA, 1976.
44. Valverde-Albacete, F.; Peláez-Moreno, C. The Case for Shifting the Rényi Entropy. Entropy 2019, 21, 46.
45. Kochetov, E. SU (2) coherent-state path integral. J. Math. Phys. 1995, 36, 4667–4679.
46. Brody, D.C.; Hughston, L.P. Geometric quantum mechanics. J. Geom. Phys. 2001, 38, 19–53.
47. Gawȩdzki, K. Non-compact WZW conformal field theories. In New Symmetry Principles in Quantum Field Theory; Springer: Berlin/Heidelberg, Germany, 1992; pp. 247–274.
48. Bellucci, S.; Nersessian, A. (Super) oscillator on CP^N and a constant magnetic field. Phys. Rev. D 2005, 71, 089901; Erratum in Phys. Rev. D 2003, 67, 065013.
49. Woodhouse, N.M.J. Geometric Quantization; Oxford University Press: Oxford, UK, 1997.
50. Bates, S.; Weinstein, A. Lectures on the Geometry of Quantization; American Mathematical Soc.: Providence, RI, USA, 1997; Volume 8.
51. Arnold, V.I. Mathematical Methods of Classical Mechanics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 60.
52. Zhang, J.; Li, F. Symplectic and Kähler structures on statistical manifolds induced from divergence functions. In Proceedings of the International Conference on Geometric Science of Information, Paris, France, 28–30 August 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 595–603.
53. Leok, M.; Zhang, J. Connecting information geometry and geometric mechanics. Entropy 2017, 19, 518.
54. Candelas, P. Lectures on complex manifolds. In Superstrings and Grand Unification; 1988; Available online: https://inis.iaea.org/search/search.aspx?orig_q=RN:23060635 (accessed on 8 April 2023).
55. Bouchard, V. Lectures on complex geometry, Calabi-Yau manifolds and toric geometry. arXiv 2007, arXiv:0702063.
56. Nakahara, M. Geometry, Topology and Physics; CRC Press: Boca Raton, FL, USA, 2018.
57. Berndtsson, B. Convexity on the space of Kähler metrics. Ann. Fac. Des Sci. Toulouse Math. 2013, 22, 713–746.
58. Khan, G.; Zhang, J. The Kähler geometry of certain optimal transport problems. Pure Appl. Anal. 2020, 2, 397–426.
59. Zhang, J. Divergence functions and geometric structures they induce on a manifold. In Geometric Theory of Information; Springer: Berlin/Heidelberg, Germany, 2014; pp. 1–30.
60. Zhang, W.M.; Feng, D.H.; Gilmore, R. Coherent states: Theory and some applications. Rev. Mod. Phys. 1990, 62, 867–927.
61. Copinger, P.; Morales, P. Schwinger pair production in SL(2, C) topologically nontrivial fields via non-Abelian worldline instantons. Phys. Rev. D 2021, 103, 036004.
62. Kazinski, P.O. Stochastic deformation of a thermodynamic symplectic structure. Phys. Rev. E 2009, 79, 011105.
63. Duval, C.; Horvath, Z.; Horvathy, P.A.; Martina, L.; Stichel, P. Berry phase correction to electron density in solids and ‘exotic’ dynamics. Mod. Phys. Lett. B 2006, 20, 373–378.
64. Son, D.T.; Yamamoto, N. Berry Curvature, Triangle Anomalies, and the Chiral Magnetic Effect in Fermi Liquids. Phys. Rev. Lett. 2012, 109, 181602.

Figure 1. A graphical representation of the Legendre transform and its deformations. (Left) While the standard Legendre transform acts on concave duals, the deformed one acts between C-concave functions. Each transform brings elements of one space to the other. Please note that while Section 2 presents the classical view of Legendre transforms acting over convex functions, the rest of this work follows Ref. [9] in focusing on concave functions. (Right) The symmetry that governs the algebraic relationships between convex dual functions and dual coordinates. This symmetry is mediated by the Legendre derivative operator $D_{L}$, which differs from the standard Euclidean gradient when the transform is deformed.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
