Entropy 2014, 16(5), 2472-2487; doi:10.3390/e16052472

Article
F-Geometry and Amari’s α−Geometry on a Statistical Manifold
K. V. Harsha * and K S Subrahamanian Moosath *
Indian Institute of Space Science and Technology, Department of Space, Government of India, Valiamala P.O, Thiruvananthapuram-695547, Kerala, India
* Authors to whom correspondence should be addressed; E-Mails: harsha.11@iist.ac.in (K.V.H.); smoosath@iist.ac.in (K.S.S.M.); Tel.: +91-95-6736-0425 (K.V.H.); +91-94-9574-3148 (K.S.S.M.).
Received: 13 December 2013; in revised form: 21 April 2014 / Accepted: 25 April 2014 /
Published: 6 May 2014

Abstract: In this paper, we introduce a geometry called F-geometry on a statistical manifold S using an embedding F of S into the space ℝX of random variables. Amari’s α-geometry is a special case of F-geometry. Then, using the embedding F and a positive smooth function G, we introduce the (F, G)-metric and (F, G)-connections, which enable one to consider weighted Fisher information metrics and weighted connections. The necessary and sufficient condition for two (F, G)-connections to be dual with respect to the (F, G)-metric is obtained. We then show that Amari’s 0-connection is the only self-dual F-connection with respect to the Fisher information metric. Invariance properties of the geometric structures are discussed, and it is proved that Amari’s α-connections are the only F-connections that are invariant under smooth one-to-one transformations of the random variables.
Keywords:
embedding; Amari’s α−connections; F−metric; F−connections; (F, G)−metric; (F, G)−connections; invariance

1. Introduction

Geometric study of statistical estimation has opened up an interesting new area called information geometry. Information geometry has achieved remarkable progress through the works of Amari [1,2] and his colleagues [3,4]. In the last few years, many authors have contributed considerably to this area [5–9]. Information geometry has a wide variety of applications in other areas of engineering and science, such as neural networks, machine learning, biology, mathematical finance, control system theory, quantum systems and statistical mechanics.

A statistical manifold of probability distributions is equipped with a Riemannian metric and a pair of dual affine connections [2,4,9]. It was Rao [10] who introduced the idea of using Fisher information as a Riemannian metric in the manifold of probability distributions. Chentsov [11] introduced a family of affine connections on a statistical manifold defined on finite sets. Amari [2] introduced a family of affine connections called α−connections using a one parameter family of functions, the α−embeddings. These α−connections are equivalent to those defined by Chentsov. The Fisher information metric and these affine connections are characterized by invariance with respect to the sufficient statistic [4,12] and play a vital role in the theory of statistical estimation. Zhang [13] generalized Amari’s α−representation and using this general representation together with a convex function he defined a family of divergence functions from the point of view of representational and referential duality. The Riemannian metric and dual connections are defined using these divergence functions.

In this paper, Amari’s idea of using α-embeddings to define geometric structures is extended to a general embedding. This paper is organized as follows. In Section 2, we define an affine connection called the F-connection and a Riemannian metric called the F-metric using a general embedding F of a statistical manifold S into the space of random variables. We show that the F-metric is the Fisher information metric and that Amari’s α-geometry is a special case of F-geometry. Further, we introduce the (F, G)-metric and (F, G)-connections using the embedding F and a positive smooth function G.

In Section 3, a necessary and sufficient condition for two (F, G)-connections to be dual with respect to the (F, G)-metric is derived, and we prove that Amari’s 0-connection is the only self-dual F-connection with respect to the Fisher information metric. Then we prove that the set of all positive finite measures on X, for a finite X, has an F-affine manifold structure for any embedding F. In Section 4, invariance properties of the geometric structures are discussed. We prove that the Fisher information metric and Amari’s α-connections are invariant under both the transformation of the parameter and the transformation of the random variable. Further, we show that Amari’s α-connections are the only F-connections that are invariant under both the transformation of the parameter and the transformation of the random variable.

Let (X, B) be a measurable space, where X is a non-empty subset of ℝ and B is a σ-field of subsets of X. Let ℝX be the space of all real-valued measurable functions defined on (X, B). Consider an n-dimensional statistical manifold S = {p(x; ξ) | ξ = [ξ1, …, ξn] ∈ E ⊆ ℝn}, with coordinates ξ = [ξ1, …, ξn], defined on X. S is a subset of P(X), the set of all probability measures on X, given by

$$ \mathcal{P}(X) := \left\{ p : X \to \mathbb{R} \,\middle|\, p(x) > 0 \ (\forall x \in X);\ \int_X p(x)\,dx = 1 \right\}. $$

The tangent space to S at a point pξ is given by

$$ T_\xi(S) = \left\{ \sum_{i=1}^n \alpha^i \partial_i \,\middle|\, \alpha^i \in \mathbb{R} \right\}, \quad \text{where } \partial_i = \frac{\partial}{\partial \xi^i}. $$

Define ℓ(x; ξ) = log p(x; ξ) and consider the partial derivatives {∂iℓ = ∂ℓ/∂ξi ; i = 1, …, n}, which are called scores. For the statistical manifold S, the ∂iℓ’s are linearly independent functions in x for a fixed ξ. Let T1ξ(S) be the n-dimensional vector space spanned by the n functions {∂iℓ ; i = 1, …, n} in x. So

$$ T^1_\xi(S) = \left\{ \sum_{i=1}^n A^i \partial_i \ell \,\middle|\, A^i \in \mathbb{R} \right\}. $$

Then there is a natural isomorphism between these two vector spaces Tξ(S) and T ξ 1 ( S ) given by

$$ \partial_i \in T_\xi(S) \ \longleftrightarrow\ \partial_i \ell(x;\xi) \in T^1_\xi(S). $$

Obviously, a tangent vector A = Σi Ai∂i ∈ Tξ(S) corresponds to a random variable A(x) = Σi Ai∂iℓ(x; ξ) ∈ T1ξ(S) having the same components Ai. Note that Tξ(S) is the differentiation operator representation of the tangent space, while T1ξ(S) is the random variable representation of the same tangent space. The space T1ξ(S) is called the 1-representation of the tangent space. Let A and B be two tangent vectors in Tξ(S), and let A(x) and B(x) be the 1-representations of A and B, respectively. We can define an inner product on each tangent space Tξ(S) by

$$ g_\xi(A, B) = \langle A, B \rangle_\xi = E_\xi[A(x)B(x)] = \int A(x)\,B(x)\, p(x;\xi)\, dx. $$

In particular, the inner product of the basis vectors ∂i and ∂j is

$$ g_{ij}(\xi) = \langle \partial_i, \partial_j \rangle_\xi = E_\xi[\partial_i \ell\, \partial_j \ell] = \int \partial_i \ell(x;\xi)\, \partial_j \ell(x;\xi)\, p(x;\xi)\, dx. $$
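As a concrete illustration (our addition, not part of the paper), the formula g_ij(ξ) = E_ξ[∂iℓ ∂jℓ] can be evaluated symbolically for the Bernoulli family, whose Fisher information is the classical 1/(θ(1−θ)):

```python
import sympy as sp

# Sketch (our illustrative model choice): evaluate g(θ) = E_θ[(∂ℓ)²]
# for the Bernoulli family on X = {0, 1}.
theta = sp.symbols('theta', positive=True)
p = {0: 1 - theta, 1: theta}                       # pmf values p(x; θ)
score = {x: sp.diff(sp.log(p[x]), theta) for x in (0, 1)}
g = sp.simplify(sum(score[x]**2 * p[x] for x in (0, 1)))  # E_θ[(∂ℓ)²]
```

Simplifying `g` recovers 1/(θ(1−θ)), the Fisher information of the Bernoulli model.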

Note that g = <, > defines a Riemannian metric on S called the Fisher information metric. On the Riemannian manifold (S, g =<, >), define n3 functions Γijk by

$$ \Gamma_{ijk}(\xi) = E_\xi\big[(\partial_i \partial_j \ell(x;\xi))\,(\partial_k \ell(x;\xi))\big]. $$

These functions Γijk uniquely determine an affine connection ∇ on S by

$$ \Gamma_{ijk}(\xi) = \langle \nabla_{\partial_i} \partial_j, \partial_k \rangle_\xi. $$

∇ is called the 1−connection or the exponential connection.

Amari [2] defined a one parameter family of functions called the α-embeddings, given by

$$ L_\alpha(p) = \begin{cases} \dfrac{2}{1-\alpha}\, p^{\frac{1-\alpha}{2}}, & \alpha \neq 1, \\[4pt] \log p, & \alpha = 1. \end{cases} $$

Using these, we can define n3 functions Γαijk by

$$ \Gamma^{\alpha}_{ijk} = \int \partial_i \partial_j L_\alpha(p(x;\xi))\; \partial_k L_{-\alpha}(p(x;\xi))\; dx. $$

These Γ i j k α uniquely determine affine connections ∇α on the statistical manifold S by

$$ \Gamma^{\alpha}_{ijk} = \langle \nabla^{\alpha}_{\partial_i} \partial_j, \partial_k \rangle, $$

which are called αconnections.
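For a finite sample space the integral defining Γαijk becomes a sum, and it can be compared symbolically with the equivalent score form E_ξ[(∂i∂jℓ + ((1−α)/2) ∂iℓ ∂jℓ) ∂kℓ]. The Bernoulli model and the value α = 1/3 below are our illustrative choices, not from the paper:

```python
import sympy as sp

theta = sp.symbols('theta', positive=True)
a = sp.Rational(1, 3)                              # sample value of α (α ≠ ±1)
L = lambda al, u: 2/(1 - al) * u**((1 - al)/2)     # α-embedding L_α
p = {0: 1 - theta, 1: theta}                       # Bernoulli pmf on X = {0, 1}
# integral form: ∫ ∂i∂j L_α(p) ∂k L_{-α}(p) dx (a sum on a finite X)
lhs = sum(sp.diff(L(a, p[x]), theta, 2) * sp.diff(L(-a, p[x]), theta)
          for x in (0, 1))
# score form: E[(∂²ℓ + (1-α)/2 (∂ℓ)²) ∂ℓ]
l = {x: sp.log(p[x]) for x in (0, 1)}
rhs = sum((sp.diff(l[x], theta, 2) + (1 - a)/2 * sp.diff(l[x], theta)**2)
          * sp.diff(l[x], theta) * p[x] for x in (0, 1))
```

The two expressions agree identically in θ, which is the content of the equivalence between the two definitions.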

2. F−Geometry of a Statistical Manifold

On a statistical manifold S, the Fisher information metric and exponential connection are defined using the log embedding. In a similar way, α−connections are defined using a one parameter family of functions, the α−embeddings. In general, we can give other geometric structures on S using different embeddings of the manifold S into the space of random variables ℝX.

Let F: (0, ∞) → ℝ be an injective function that is at least twice differentiable, with F′(u) ≠ 0 for all u ∈ (0, ∞). Then F is an embedding of S into ℝX that takes each p(x; ξ) to F(p(x; ξ)). Denote F(p(x; ξ)) by F(x; ξ); then ∂iF can be written as

$$ \partial_i F(x;\xi) = p(x;\xi)\, F'(p(x;\xi))\, \partial_i \ell(x;\xi). $$

It is clear that the ∂iF(x; ξ), i = 1, …, n, are linearly independent functions in x for fixed ξ, since the ∂iℓ(x; ξ), i = 1, …, n, are linearly independent. Let T_{F(pξ)}F(S) be the n-dimensional vector space spanned by the n functions ∂iF, i = 1, …, n, in x for fixed ξ. So

$$ T_{F(p_\xi)}F(S) = \left\{ \sum_{i=1}^n A^i \partial_i F \,\middle|\, A^i \in \mathbb{R} \right\}. $$

Let the tangent space T_{F(pξ)}(F(S)) to F(S) at the point F(pξ) be denoted by TFξ(S). There is a natural isomorphism between the two vector spaces Tξ(S) and TFξ(S) given by

$$ \partial_i \in T_\xi(S) \ \longleftrightarrow\ \partial_i F(x;\xi) \in T^F_\xi(S). $$

TFξ(S) is called the F-representation of the tangent space Tξ(S).

For any A = Σi Ai∂i ∈ Tξ(S), the corresponding A(x) = Σi Ai∂iF ∈ TFξ(S) is called the F-representation of the tangent vector A and is denoted by AF(x). Note that TFξ(S) ⊆ T_{F(pξ)}(ℝX). Since ℝX is a vector space, its tangent space T_{F(pξ)}(ℝX) can be identified with ℝX itself, so TFξ(S) ⊆ ℝX.

Definition 2.1

F−expectation of a random variable f with respect to the distribution p(x; ξ) is defined as

$$ E^F_\xi(f) = \int f(x)\, \frac{1}{p\,(F'(p))^2}\; dx. $$
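A quick symbolic sanity check (our addition): for F = log, the weight 1/(p(F′(p))²) collapses to p, so the F-expectation reduces to the ordinary expectation E_ξ:

```python
import sympy as sp

u = sp.symbols('u', positive=True)
# weight appearing in the F-expectation, specialized to F = log
w = sp.simplify(1 / (u * sp.diff(sp.log(u), u)**2))
# w reduces to u, i.e. the density itself, giving back the usual expectation
```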

We can use this F−expectation to define an inner product in ℝX by

$$ \langle f, g \rangle^F_\xi = E^F_\xi[f(x)\, g(x)], $$

which induces an inner product on Tξ(S) by

$$ \langle A, B \rangle^F_\xi = E^F_\xi[A^F(x)\, B^F(x)]; \qquad A, B \in T_\xi(S). $$

Proposition 2.2

The induced metric <, >F on S is the Fisher information metric g =<, > on S.

Proof

For any basis vectors ∂i, ∂j ∈ Tξ(S),

$$ \langle \partial_i, \partial_j \rangle^F_\xi = E^F_\xi[\partial_i F\, \partial_j F] = \int \big(p F'(p)\, \partial_i \ell\big)\big(p F'(p)\, \partial_j \ell\big)\, \frac{1}{p\,(F'(p))^2}\; dx = \int \partial_i \ell\, \partial_j \ell\; p(x;\xi)\; dx = g_{ij}(\xi). $$

So the metric <, >F on S induced by the embedding F of S into ℝX is the Fisher information metric g =<, > on S.
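The cancellation in this proof can be replayed symbolically with the embedding F left completely unspecified; the Bernoulli family below is our illustrative choice:

```python
import sympy as sp

theta, u = sp.symbols('theta u', positive=True)
F = sp.Function('F')
Fp = sp.diff(F(u), u)                              # F'(u), left unspecified
p = {0: 1 - theta, 1: theta}                       # Bernoulli pmf on X = {0, 1}
# ∂_θ F(p) = F'(p) ∂_θ p by the chain rule; F-expectation weight is 1/(p F'(p)²)
dF = {x: Fp.subs(u, p[x]) * sp.diff(p[x], theta) for x in (0, 1)}
weight = {x: 1 / (p[x] * Fp.subs(u, p[x])**2) for x in (0, 1)}
gF = sp.simplify(sum(dF[x]**2 * weight[x] for x in (0, 1)))
g_fisher = sp.simplify(sum(sp.diff(sp.log(p[x]), theta)**2 * p[x] for x in (0, 1)))
```

The F′ factors cancel exactly, so `gF` equals the Fisher information for every admissible F.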

We can induce a connection on S using the embedding F.

Let π|FpF_ξ : ℝX → TFξ(S) be the projection map.

Definition 2.3

The connection induced by the embedding F on S, the F-connection, is defined as

$$ \nabla^F_{\partial_i} \partial_j := \sum_{n} \Big( \sum_{m} E^F_\xi\big[(\partial_i \partial_j F)(\partial_m F)\big]\; g^{mn}(\xi) \Big)\, \partial_n, $$

where [g^{mn}(ξ)] is the inverse of the Fisher information matrix G(ξ) = [g_{mn}(ξ)]. Note that the F-connections are symmetric.

Lemma 2.4

The F −connection and its components can be written in terms of scores as

$$ \nabla^F_{\partial_i} \partial_j = \sum_n \Big( \sum_m g^{mn}\, E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big) (\partial_m \ell) \Big] \Big)\, \partial_n $$

and

$$ \Gamma^F_{ijk}(\xi) = E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big) (\partial_k \ell) \Big]. $$

Proof

From Equation (12), we have

$$ \partial_i \partial_j F = p F'(p)\, \partial_i \partial_j \ell + \big[ p F'(p) + p^2 F''(p) \big]\, \partial_i \ell\, \partial_j \ell. $$

Therefore

$$ E^F_\xi\big[(\partial_i \partial_j F)(\partial_m F)\big] = \int (\partial_i \partial_j F)(\partial_m F)\, \frac{1}{p\,(F'(p))^2}\; dx = E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big)(\partial_m \ell) \Big]. $$

Hence we can write

$$ \nabla^F_{\partial_i} \partial_j = \sum_n \Big( \sum_m g^{mn}\, E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big)(\partial_m \ell) \Big] \Big)\, \partial_n. $$

Then we have the Christoffel symbols of the F−connection

$$ (\Gamma^F)_{ij}^{\,n} = \sum_m g^{mn}\, E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big)(\partial_m \ell) \Big] $$

and components of the F−connection are given by

$$ \Gamma^F_{ijk}(\xi) = \langle \nabla^F_{\partial_i} \partial_j, \partial_k \rangle_\xi = E_\xi\Big[ \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big)(\partial_k \ell) \Big]. $$

Theorem 2.5

Amari’s α−geometry is a special case of the F−geometry.

Proof

Let F(p) = Lα(p), where Lα(p) is the α-embedding of Amari.

The components Γ i j k α of the α−connection are given by

$$ \Gamma^{\alpha}_{ijk}(\xi) = E_\xi\Big[ \Big( \partial_i \partial_j \ell + \frac{1-\alpha}{2}\, \partial_i \ell\, \partial_j \ell \Big)(\partial_k \ell) \Big]. $$

From Equation (26), when F(p) = Lα(p), we have

$$ F'(p) = L'_\alpha(p) = p^{-\frac{1+\alpha}{2}}, \qquad F''(p) = L''_\alpha(p) = -\frac{1+\alpha}{2}\, p^{-\frac{3+\alpha}{2}}. $$

Then we get

$$ 1 + \frac{p F''(p)}{F'(p)} = 1 + \frac{p L''_\alpha(p)}{L'_\alpha(p)} = 1 - \frac{1+\alpha}{2} = \frac{1-\alpha}{2}. $$

Hence

$$ \Gamma^F_{ijk}(\xi) = E_\xi\Big[ \Big( \partial_i \partial_j \ell + \frac{1-\alpha}{2}\, \partial_i \ell\, \partial_j \ell \Big)(\partial_k \ell) \Big] = \Gamma^{\alpha}_{ijk}(\xi), $$

which are the components of the α-connection. Hence the F-connection reduces to the α-connection, and α-geometry is a special case of F-geometry. □
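The computation 1 + pL″α/L′α = (1−α)/2 used above can be verified symbolically:

```python
import sympy as sp

p, alpha = sp.symbols('p alpha', positive=True)
L = 2/(1 - alpha) * p**((1 - alpha)/2)    # Amari's α-embedding, α ≠ 1
# the coefficient appearing in the F-connection, specialized to F = L_α
expr = sp.simplify(1 + p * sp.diff(L, p, 2) / sp.diff(L, p))
# expr reduces to (1 - α)/2, the coefficient of the α-connection
```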

Remark 2.6

Burbea [14] introduced the concept of a weighted Fisher information metric using a positive continuous function. We use this idea to define a weighted F-metric and weighted F-connections. Let G: (0, ∞) → ℝ be a positive smooth function and let F be an embedding. Define the (F, G)-expectation of a random variable f with respect to the distribution pξ as

$$ E^{F,G}_\xi(f) = \int f(x)\, \frac{G(p)}{p\,(F'(p))^2}\; dx. $$

Define the (F, G)-metric ⟨,⟩F,Gξ on Tpξ(S) by

$$ \langle \partial_i, \partial_j \rangle^{F,G}_\xi = E^{F,G}_\xi[\partial_i F\; \partial_j F] = \int \partial_i \ell\, \partial_j \ell\; G(p)\, p\; dx. $$

Define the (F, G)-connection through its components

$$ \Gamma^{F,G}_{ijk}(\xi) = E^{F,G}_\xi\big[(\partial_i \partial_j F)(\partial_k F)\big] = \int \Big( \partial_i \partial_j \ell + \Big(1 + \frac{p F''(p)}{F'(p)}\Big) \partial_i \ell\, \partial_j \ell \Big)\, \partial_k \ell\;\, G(p)\, p\; dx. $$

When G(p) = 1, (F, G)−connection reduces to the F−connection and the metric <, >F,G reduces to the Fisher information metric. This is a more general way of defining Riemannian metrics and affine connections on a statistical manifold.

3. Dual Affine Connections

Definition 3.1

Let M be a Riemannian manifold with a Riemannian metric g. Two affine connections ∇ and ∇* on the tangent bundle are said to be dual connections with respect to the metric g if

$$ Z\, g(X, Y) = g(\nabla_Z X, Y) + g(X, \nabla^*_Z Y) $$

holds for any vector fields X, Y, Z on M.

Theorem 3.2

Let F, H be two embeddings of the statistical manifold S into the space ℝX of random variables, and let G be a positive smooth function on (0, ∞). Then the (F, G)-connection ∇F,G and the (H, G)-connection ∇H,G are dual connections with respect to the (F, G)-metric iff the functions F and H satisfy

$$ H'(p) = \frac{G(p)}{p\, F'(p)}. $$

We call such an embedding H a G-dual embedding of F.

The components of the dual connectionH,G can be written as

$$ \Gamma^{H,G}_{ijk} = \int \Big( \partial_i \partial_j \ell + \Big( \frac{p G'(p)}{G(p)} - \frac{p F''(p)}{F'(p)} \Big)\, \partial_i \ell\, \partial_j \ell \Big)\, \partial_k \ell\;\, G(p)\, p\; dx. $$

Proof

∇F,G and ∇H,G being dual connections with respect to the (F, G)-metric means

$$ \partial_k \langle \partial_i, \partial_j \rangle^{F,G} = \langle \nabla^{F,G}_{\partial_k} \partial_i, \partial_j \rangle^{F,G} + \langle \partial_i, \nabla^{H,G}_{\partial_k} \partial_j \rangle^{F,G} $$

for any basis vectors ∂i, ∂j, ∂k ∈ Tξ(S).

$$ \partial_k \langle \partial_i, \partial_j \rangle^{F,G} = \int \big( \partial_k \partial_i \ell\; \partial_j \ell + \partial_i \ell\; \partial_k \partial_j \ell \big)\, p\, G(p)\; dx + \int \Big[ 1 + \frac{p G'(p)}{G(p)} \Big]\, \partial_i \ell\, \partial_j \ell\, \partial_k \ell\;\, p\, G(p)\; dx $$
$$ \langle \nabla^{F,G}_{\partial_k} \partial_i, \partial_j \rangle^{F,G} + \langle \partial_i, \nabla^{H,G}_{\partial_k} \partial_j \rangle^{F,G} = \int \big( \partial_k \partial_i \ell\; \partial_j \ell + \partial_i \ell\; \partial_k \partial_j \ell \big)\, p\, G(p)\; dx + \int \Big[ 2 + \frac{p F''(p)}{F'(p)} + \frac{p H''(p)}{H'(p)} \Big]\, \partial_i \ell\, \partial_j \ell\, \partial_k \ell\;\, p\, G(p)\; dx $$

Then the condition (38) holds iff

$$ \Big[ 2 + \frac{p F''(p)}{F'(p)} + \frac{p H''(p)}{H'(p)} \Big] \int \partial_i \ell\, \partial_j \ell\, \partial_k \ell\;\, p\, G(p)\; dx = \Big[ 1 + \frac{p G'(p)}{G(p)} \Big] \int \partial_i \ell\, \partial_j \ell\, \partial_k \ell\;\, p\, G(p)\; dx $$
$$ \Longleftrightarrow\quad 2 + \frac{p F''(p)}{F'(p)} + \frac{p H''(p)}{H'(p)} = 1 + \frac{p G'(p)}{G(p)} $$
$$ \Longleftrightarrow\quad 1 + \frac{p H''(p)}{H'(p)} = \frac{p G'(p)}{G(p)} - \frac{p F''(p)}{F'(p)} $$
$$ \Longleftrightarrow\quad \frac{H''(p)}{H'(p)} = \frac{G'(p)}{G(p)} - \frac{F''(p)}{F'(p)} - \frac{1}{p} \quad\Longleftrightarrow\quad H'(p) = \frac{G(p)}{p\, F'(p)}. $$

Hence ∇F,G and ∇H,G are dual connections with respect to the (F, G)−metric iff Equation (36) holds. From Equation (43), we can rewrite the components of dual connection ∇H,G as

$$ \Gamma^{H,G}_{ijk} = \int \Big( \partial_i \partial_j \ell + \Big( \frac{p G'(p)}{G(p)} - \frac{p F''(p)}{F'(p)} \Big)\, \partial_i \ell\, \partial_j \ell \Big)\, \partial_k \ell\;\, G(p)\, p\; dx. $$

Corollary 3.3

Amari’s 0−connection is the only self dual F−connection with respect to the Fisher information metric.

Proof

From Theorem 3.2, for G(p) = 1, the F-connection ∇F and the H-connection ∇H are dual connections with respect to the Fisher information metric iff the functions F and H satisfy

$$ H'(p) = \frac{1}{p\, F'(p)}. $$

Thus the F−connection ∇F is self dual iff the embedding F satisfies the condition

$$ F'(p) = \frac{1}{p\, F'(p)} \;\Longleftrightarrow\; (F'(p))^2 = \frac{1}{p} \;\Longleftrightarrow\; F'(p) = p^{-\frac{1}{2}} \;\Longrightarrow\; F(p) = 2\, p^{\frac{1}{2}} = L_0(p). $$

That is, Amari’s 0–connection is the only self dual F–connection with respect to the Fisher information metric.
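A symbolic sketch of this duality (our addition): with G ≡ 1 and F = Lα, the dual embedding H determined by H′(p) = 1/(pF′(p)) satisfies H′ = L′₋α, so the dual of the α-embedding is the (−α)-embedding, and self-duality forces α = 0:

```python
import sympy as sp

p = sp.symbols('p', positive=True)
alpha = sp.symbols('alpha', real=True)
Lprime = lambda a: p**(-(1 + a)/2)        # L_α'(p) = p^{-(1+α)/2}
Hprime = 1 / (p * Lprime(alpha))          # duality condition with G ≡ 1
# Hprime coincides with L_{-α}'(p); at α = 0 it equals L_0'(p) = p^{-1/2}
```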

So far, we have considered the statistical manifold S as a subset of P(X), the set of all probability measures on X. Now we relax the condition ∫p(x)dx = 1, and consider S as a subset of P ˜ ( X ), which is defined by

$$ \tilde{\mathcal{P}}(X) := \left\{ p : X \to \mathbb{R} \,\middle|\, p(x) > 0\ (\forall x \in X);\ \int_X p(x)\,dx < \infty \right\}. $$

Definition 3.4

Let M be a Riemannian manifold with a Riemannian metric g, and let ∇ be an affine connection on M. If there exists a coordinate system [θi] of M such that $\nabla_{\partial_i} \partial_j = 0$, then we say that ∇ is flat, or alternatively that M is flat with respect to ∇, and we call such a coordinate system [θi] an affine coordinate system for ∇.

Definition 3.5

Let S = {p(x; ξ) | ξ = [ξ1, …, ξn] ∈ E ⊆ ℝn} be an n-dimensional statistical manifold. If for some coordinate system [θi], i = 1, …, n,

$$ \partial_i \partial_j F(p(x;\theta)) = 0, $$

then we can see from Equation (19) that [θi] is an F-affine coordinate system and that S = {pθ} is F-flat. We call such an S an F-affine manifold.

The condition (49) is equivalent to the existence of the functions C, F1, …, Fn on X such that

$$ F(p(x;\theta)) = C(x) + \sum_{i=1}^n \theta^i F_i(x). $$

Theorem 3.6

For any embedding F, P ˜ ( X ) is an F−affine manifold for finite X.

Proof

Let X = {x1, …, xn} be a finite set with n elements, and let Fi: X → ℝ be the functions defined by Fi(xj) = δij for i, j = 1, …, n. Define n coordinates [θi] by

$$ \theta^i = F(p(x_i)). $$

Then we get F(p(x)) = Σi θiFi(x). Therefore P̃(X) is an F-affine manifold for any embedding F. □
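A small numerical sketch of this construction (the 3-point measure and the choice F = log are our examples; the theorem holds for any embedding F):

```python
import math

# On finite X = {x1, x2, x3}, with F_i(x_j) = δ_ij, the coordinates
# θ^i = F(p(x_i)) satisfy F(p(x_j)) = Σ_i θ^i F_i(x_j).
F = math.log
p = [0.2, 0.5, 1.3]                       # a positive finite measure (need not sum to 1)
theta = [F(pi) for pi in p]               # F-affine coordinates
delta = lambda i, j: 1.0 if i == j else 0.0
recon = [sum(theta[i] * delta(i, j) for i in range(3)) for j in range(3)]
# recon[j] reproduces F(p(x_j)) exactly
```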

Remark 3.7

Zhang [13] introduced the ρ-representation, which is a generalization of the α-representation of Amari. Zhang’s geometry is defined using this ρ-representation together with a convex function. Zhang also defined the ρ-affine family of density functions and discussed its dually flat structure. The F-geometry defined using a general F-representation is different from Zhang’s geometry: the metric defined in the F-embedding approach is the Fisher information metric, whereas the Riemannian metric defined using the ρ-representation is in general different from the Fisher information metric. Moreover, the F-connections are not in general dually flat and are different from the dual connections defined by Zhang.

Remark 3.8

On a statistical manifold S, we introduced a dualistic structure (g,F,H), where g is the Fisher information metric andF,H are the dual connections with respect to the Fisher information metric. Since F-connections are symmetric, the manifold S is flat with respect toF iff S is flat with respect toH. Thus if S is flat with respect toF, then (S, g,F,H) is a dually flat space. The dually flat spaces are important in statistical estimation [4].

4. Invariance of the Geometric Structures

For the statistical manifold S = {p(x; ξ) | ξ ∈ E ⊆ ℝn}, the parameters are merely labels attached to each point p ∈ S; hence the intrinsic geometric properties should be independent of these labels. Consequently, it is natural to consider the invariance properties of the geometric structures under suitable transformations of the variables in a statistical manifold. Here we consider two kinds of invariance of the geometric structures: covariance under re-parametrization of the parameter of the manifold, and invariance under transformations of the random variable [15]. Now let us investigate the invariance properties of the F-geometric structures defined in Section 2.

4.1. Covariance under Re-Parametrization

Let [θi] and [ηj] be two coordinate systems on S, which are related by an invertible transformation η = η(θ). Let us denote $\partial_i = \partial/\partial\theta^i$ and $\tilde\partial_j = \partial/\partial\eta^j$. Let the coordinate expressions of the metric g be given by $g_{ij} = \langle \partial_i, \partial_j \rangle$ and $\tilde g_{ij} = \langle \tilde\partial_i, \tilde\partial_j \rangle$, and let the components of the connection ∇ with respect to the coordinates [θi] and [ηj] be $\Gamma_{ijk}$ and $\tilde\Gamma_{ijk}$, respectively.

Then the covariance of the metric g and the connection ∇ under the re-parametrization means,

$$ \tilde{g}_{ij} = \sum_{m,n} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\; g_{mn} $$
$$ \tilde{\Gamma}_{ijk} = \sum_{m,n,h} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\, \frac{\partial \theta^h}{\partial \eta^k}\; \Gamma_{mnh} + \sum_{m,h} \frac{\partial \theta^h}{\partial \eta^k}\, \frac{\partial^2 \theta^m}{\partial \eta^i\, \partial \eta^j}\; g_{mh}. $$

Lemma 4.1

The Fisher information metric g is covariant under re-parametrization.

Proof

The components of the Fisher information metric with respect to the coordinate system [θi] are given by

$$ g_{ij}(\theta) = \langle \partial_i, \partial_j \rangle_\theta = \int \partial_i p(x;\theta)\; \partial_j p(x;\theta)\; \frac{1}{p(x;\theta)}\; dx. $$

Let p̃(x; η) = p(x; θ(η)). Then the components of the Fisher information metric with respect to the coordinate system [ηj] are given by

$$ \tilde{g}_{ij}(\eta) = \langle \tilde\partial_i, \tilde\partial_j \rangle_\eta = \int \tilde\partial_i\, \tilde{p}(x;\eta)\; \tilde\partial_j\, \tilde{p}(x;\eta)\; \frac{1}{\tilde{p}(x;\eta)}\; dx. $$

Since

$$ \tilde\partial_i\, \tilde{p}(x;\eta) = \sum_m \frac{\partial \theta^m}{\partial \eta^i}\; \frac{\partial p(x;\theta(\eta))}{\partial \theta^m}, $$

we can write

$$ \tilde{g}_{ij}(\eta) = \sum_{m,n} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j} \int \frac{\partial p(x;\theta)}{\partial \theta^m}\, \frac{\partial p(x;\theta)}{\partial \theta^n}\, \frac{1}{p(x;\theta)}\; dx = \sum_{m,n} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\; g_{mn}(\theta). $$

Lemma 4.2

The F−connectionF is covariant under re-parametrization.

Proof

Let the components of ∇F with respect to the coordinates [θi] and [ηj] be given by Γijk, Γ ˜ i j k respectively.

Let p̃(x; η) = p(x; θ(η)). Let us denote log p(x; θ) by ℓ(x; θ) and log p̃(x; η) by ℓ̃(x; η).

The components of the F−connection ∇F with respect to the coordinate system [θi] are given by

$$ \Gamma_{ijk} = \int \Big( \partial_i \partial_j \ell(x;\theta) + \Big(1 + \frac{p F''(p)}{F'(p)}\Big)\, \partial_i \ell(x;\theta)\, \partial_j \ell(x;\theta) \Big)\, \partial_k \ell(x;\theta)\; p(x;\theta)\; dx $$

The components of ∇F with respect to the coordinate system [ηj] are given by

$$ \tilde{\Gamma}_{ijk} = \int \Big( \tilde\partial_i \tilde\partial_j \tilde{\ell}(x;\eta) + \Big(1 + \frac{\tilde{p}\, F''(\tilde{p})}{F'(\tilde{p})}\Big)\, \tilde\partial_i \tilde{\ell}(x;\eta)\, \tilde\partial_j \tilde{\ell}(x;\eta) \Big)\, \tilde\partial_k \tilde{\ell}(x;\eta)\; \tilde{p}(x;\eta)\; dx $$

We can write

$$ \tilde\partial_i \tilde{\ell}(x;\eta) = \sum_m \frac{\partial \theta^m}{\partial \eta^i}\; \frac{\partial \ell(x;\theta(\eta))}{\partial \theta^m}. $$

Then

$$ \tilde\partial_i \tilde\partial_j \tilde{\ell}(x;\eta) = \sum_{m,n} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\, \frac{\partial^2 \ell(x;\theta(\eta))}{\partial \theta^m\, \partial \theta^n} + \sum_m \frac{\partial^2 \theta^m}{\partial \eta^i\, \partial \eta^j}\, \frac{\partial \ell(x;\theta(\eta))}{\partial \theta^m} $$
$$ \tilde\partial_i \tilde{\ell}(x;\eta)\; \tilde\partial_j \tilde{\ell}(x;\eta) = \sum_{m,n} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\, \frac{\partial \ell(x;\theta(\eta))}{\partial \theta^m}\, \frac{\partial \ell(x;\theta(\eta))}{\partial \theta^n} $$
$$ \tilde\partial_k \tilde{\ell}(x;\eta) = \sum_h \frac{\partial \theta^h}{\partial \eta^k}\, \frac{\partial \ell(x;\theta(\eta))}{\partial \theta^h}. $$

Hence we get

$$ \tilde{\Gamma}_{ijk} = \sum_{m,n,h} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\, \frac{\partial \theta^h}{\partial \eta^k}\; \Gamma_{mnh} + \sum_{m,h} \frac{\partial \theta^h}{\partial \eta^k}\, \frac{\partial^2 \theta^m}{\partial \eta^i\, \partial \eta^j} \int \frac{\partial \ell}{\partial \theta^m}\, \frac{\partial \ell}{\partial \theta^h}\; p(x;\theta)\; dx $$
$$ \phantom{\tilde{\Gamma}_{ijk}} = \sum_{m,n,h} \frac{\partial \theta^m}{\partial \eta^i}\, \frac{\partial \theta^n}{\partial \eta^j}\, \frac{\partial \theta^h}{\partial \eta^k}\; \Gamma_{mnh} + \sum_{m,h} \frac{\partial \theta^h}{\partial \eta^k}\, \frac{\partial^2 \theta^m}{\partial \eta^i\, \partial \eta^j}\; g_{mh}. $$

Hence we have shown that F-connections are covariant under re-parametrization of the parameter. Covariance under re-parametrization means that the metric and connections are coordinate independent; hence the F-geometry is coordinate independent.
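Numerically, covariance is just the tensor transformation rule g̃ = JᵀgJ. As an illustration outside the paper, take the normal family N(μ, σ²), whose Fisher metric in θ = (μ, σ) is the well-known diag(1/σ², 2/σ²), and re-parametrize by η = (μ, s) with σ = e^s:

```python
import numpy as np

s = 0.7
sigma = np.exp(s)
g_theta = np.diag([1/sigma**2, 2/sigma**2])   # Fisher metric of N(μ, σ²) in (μ, σ)
# Jacobian J[m, i] = ∂θ^m/∂η^i for θ = (μ, e^s), η = (μ, s)
J = np.array([[1.0, 0.0],
              [0.0, sigma]])
g_eta = J.T @ g_theta @ J                     # covariance rule g̃_ij = Σ J J g_mn
# g_eta is diag(1/σ², 2): the (s, s) entry becomes constant, as expected
```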

4.2. Invariance Under the Transformation of the Random Variable

Amari and Nagaoka [4] defined the invariance of Riemannian metric and connections on a statistical manifold under a transformation of the random variable as follows,

Definition 4.3

Let S = {p(x; ξ) | ξ ∈ E ⊆ ℝn} be a statistical manifold defined on a sample space X. Let x, y be random variables defined on sample spaces X, Y, respectively, and let ϕ be a transformation of x to y. Assume that this transformation induces a model S′ = {q(y; ξ) | ξ ∈ E ⊆ ℝn} on Y.

Let λ: S → S′ be a diffeomorphism defined as

$$ \lambda(p_\xi) = q_\xi. $$

Let g = ⟨,⟩ and g′ = ⟨,⟩′ be two Riemannian metrics defined on S and S′, respectively, and let ∇, ∇′ be two affine connections on S and S′, respectively. Then the invariance properties are given by

$$ \langle X, Y \rangle_p = \langle \lambda_*(X), \lambda_*(Y) \rangle'_{\lambda(p)} \qquad \forall\, X, Y \in T_p(S) $$
$$ \lambda_*(\nabla_X Y) = \nabla'_{\lambda_*(X)}\, \lambda_*(Y), $$

where λ∗ is the push-forward map associated with the map λ, which is defined by

λ * ( X ) λ ( p ) = ( d λ ) p ( X )

Now we discuss the invariance properties of the F-geometry under suitable transformations of the random variable. Let us restrict ourselves to smooth one-to-one transformations of the random variable, which are in fact statistically interesting. Amari and Nagaoka [4] mentioned a transformation, the sufficient statistic of the parameter of the statistical model, which is widely used in statistical estimation. In fact, one-to-one transformations of the random variable are trivial examples of sufficient statistics.

Consider a statistical manifold S = {p(x; ξ) | ξ E ⊆ ℝn} defined on a sample space X. Let ϕ be a smooth one-to-one transformation of the random variable x to y. Then the density function q(y; ξ) of the induced model S′ takes the form

$$ q(y;\xi) = p(w(y);\xi)\; w'(y), $$

where w is the inverse of ϕ, i.e., x = w(y), so that ϕ′(x) = 1/w′(ϕ(x)).

Let us denote log q(y; ξ) by ℓ(qy) and log p(x; ξ) by ℓ(px).

Lemma 4.4

The Fisher information metric and Amari’s α-connections are invariant under smooth one-to-one transformations of the random variable.

Proof

Let ϕ be a smooth one-to-one transformation of the random variable x to y. From Equation (69)

$$ p(x;\xi) = q(\phi(x);\xi)\; \phi'(x) $$
$$ \partial_i \ell(q_y) = \partial_i \ell(p_{w(y)}) $$
$$ \partial_i \ell(q_{\phi(x)}) = \partial_i \ell(p_x). $$

The Fisher information metric g′ on the induced manifold S′ is given by

$$ g'_{ij}(\xi) = \int \partial_i \ell(q_y)\, \partial_j \ell(q_y)\; q(y;\xi)\; dy = \int \partial_i \ell(p_x)\, \partial_j \ell(p_x)\; p(x;\xi)\; dx = g_{ij}(\xi), $$

which is the Fisher information metric on S.

The components of Amari’s α−connections on the induced manifold S′ are given by

$$ \Gamma'^{\,\alpha}_{ijk}(\xi) = \int \Big( \partial_i \partial_j \ell(q_y) + \frac{1-\alpha}{2}\, \partial_i \ell(q_y)\, \partial_j \ell(q_y) \Big)\, \partial_k \ell(q_y)\; q(y;\xi)\; dy = \int \Big( \partial_i \partial_j \ell(p_x) + \frac{1-\alpha}{2}\, \partial_i \ell(p_x)\, \partial_j \ell(p_x) \Big)\, \partial_k \ell(p_x)\; p(x;\xi)\; dx, $$

which are the components of Amari’s α−connections on the manifold S. Thus we obtained that the Fisher information metric and Amari’s α-connections are invariant under smooth one-to-one transformations of the random variable.
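This invariance can be checked numerically (the model and transformation are our choices): the exponential model p(x; ξ) = ξe^{−ξx} on (0, ∞) has Fisher information 1/ξ², and the one-to-one map y = e^x induces the Pareto-type model q(y; ξ) = ξy^{−ξ−1} on (1, ∞) with the same Fisher information:

```python
import numpy as np
from scipy.integrate import quad

xi = 1.3
# Fisher information of the exponential model p(x; ξ) = ξ e^{-ξx}; score = 1/ξ - x
g_p, _ = quad(lambda x: (1/xi - x)**2 * xi*np.exp(-xi*x), 0, np.inf)
# model induced by the one-to-one map y = e^x: q(y; ξ) = ξ y^{-ξ-1} on (1, ∞)
g_q, _ = quad(lambda y: (1/xi - np.log(y))**2 * xi*y**(-xi - 1), 1, np.inf)
# both integrals equal 1/ξ² up to quadrature error
```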

Now we prove that α-connections are the only F−connections that are invariant under smooth one-to-one transformations of the random variable.

Theorem 4.5

Amari’s α-connections are the only F−connections that are invariant under smooth one-to-one transformations of the random variable.

Proof

Let ϕ be a smooth one-to-one transformation of the random variable x to y. The components of the F −connection of the induced manifold S′ are

$$ \Gamma^F_{ijk}(q_\xi) = \int \Big( \partial_i \partial_j \ell(q_y) + \Big(1 + \frac{q\, F''(q)}{F'(q)}\Big)\, \partial_i \ell(q_y)\, \partial_j \ell(q_y) \Big)\, \partial_k \ell(q_y)\; q(y;\xi)\; dy $$

and the components of the F−connection of the manifold S are

$$ \Gamma^F_{ijk}(p_\xi) = \int \Big( \partial_i \partial_j \ell(p_x) + \Big(1 + \frac{p\, F''(p)}{F'(p)}\Big)\, \partial_i \ell(p_x)\, \partial_j \ell(p_x) \Big)\, \partial_k \ell(p_x)\; p(x;\xi)\; dx $$

Then by equating the components Γ i j k F ( q ξ ) , Γ i j k F ( p ξ ) of the F–connection, we get

$$ \int \Big( \frac{q\, F''(q)}{F'(q)} - \frac{p\, F''(p)}{F'(p)} \Big)\, \partial_i \ell(p_x)\, \partial_j \ell(p_x)\, \partial_k \ell(p_x)\; p(x;\xi)\; dx = 0, \qquad q = q(\phi(x);\xi) = \frac{p(x;\xi)}{\phi'(x)}. $$

It follows that the condition for the F-connection to be invariant under all such transformations ϕ is

$$ \frac{p\, F''(p)}{F'(p)} = k, $$

where k is a real constant.

Hence it follows from Euler’s homogeneous function theorem that the function F′ is a positive homogeneous function in p of degree k. So

$$ F'(\lambda p) = \lambda^k\, F'(p) \quad \text{for } \lambda > 0. $$

Since F′ is a positive homogeneous function in the single variable p, without loss of generality we can take

$$ F'(p) = p^k. $$

Therefore

$$ F(p) = \begin{cases} \dfrac{p^{k+1}}{k+1}, & k \neq -1, \\[4pt] \log p, & k = -1. \end{cases} $$

Letting

$$ k = -\frac{1+\alpha}{2}, \qquad \alpha \in \mathbb{R}, $$

we get

$$ F(p) = \begin{cases} \dfrac{2}{1-\alpha}\, p^{\frac{1-\alpha}{2}}, & \alpha \neq 1, \\[4pt] \log p, & \alpha = 1, \end{cases} $$

which is nothing but Amari’s α-embedding Lα(p). Hence we obtain that Amari’s α-connections are the only F-connections that are invariant under smooth one-to-one transformations of the random variable.
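The final integration step can be traced symbolically: integrating F′(p) = p^k with k = −(1+α)/2 recovers the α-embedding (we take α = 1/2, i.e., k = −3/4, as a sample value):

```python
import sympy as sp

p = sp.symbols('p', positive=True)
k = sp.Rational(-3, 4)                      # k = -(1+α)/2 with α = 1/2
F = sp.integrate(p**k, p)                   # F(p) = p^{k+1}/(k+1)
alpha = -2*k - 1                            # recover α from k
L = 2/(1 - alpha) * p**((1 - alpha)/2)      # Amari's α-embedding L_α
# F and L coincide (both equal 4 p^{1/4} for this sample value)
```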

Remark 4.6

In Section 2, we defined (F, G)-connections using a general embedding function F and a positive smooth function G. We can show that (F, G)-connection is invariant under smooth one-to-one transformation of the random variable when G(p) = c, where c is a real constant and F (p) = Lα(p) (proof is similar to that of Theorem 4.5). The notion of (F, G)−metric and (F, G)−connection provides a more general way of introducing geometric structures on a manifold. We were able to show that the Fisher information metric (up to a constant) and Amari’s α−connections are the only metric and connections belonging to this class that are invariant under both the transformation of the parameter and the one-to-one transformation of the random variable.

5. Conclusions

The Fisher information metric and Amari’s α−connections are widely used in the theory of information geometry and have an important role in the theory of statistical estimation. Amari’s α−connections are defined using a one parameter family of functions, the α−embeddings. We generalized this idea to introduce geometric structures on a statistical manifold S. We considered a general embedding function F of S into ℝX and obtained a geometric structure on S called the F−geometry. Amari’s α−geometry is a special case of F −geometry. A more general way of defining Riemannian metrics and affine connections on a statistical manifold S is given using a positive continuous function G and the embedding F.

Amari’s α−geometry is the only F−geometry that is invariant under both the transformation of the parameter and the random variable or equivalently under the sufficient statistic. We can relax the condition of invariance under the sufficient statistic and can consider other statistically significant transformations as well, which then gives an F−geometry other than α−geometry that is invariant under these statistically significant transformations. We believe that the idea of F −geometry can be used in the further development of the geometric theory of q-exponential families. We look forward to studying these problems in detail later.

Acknowledgments

We are extremely thankful to Shun-ichi Amari for reading this article and encouraging our learning process. We would like to thank the reviewer who mentioned the references [13,16], which are of great importance in our future work.

Author Contributions

The authors contributed equally to the presented mathematical framework and the writing of the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Amari, S. Differential geometry of curved exponential families-curvature and information loss. Ann. Statist 1982, 10, 357–385.
  2. Amari, S. Differential-Geometrical Methods in Statistics; Lecture Notes in Statistics; Springer-Verlag: New York, NY, USA, 1985; Volume 28.
  3. Amari, S.; Kumon, M. Differential geometry of Edgeworth expansions in curved exponential family. Ann. Inst. Statist. Math 1983, 35, 1–24.
  4. Amari, S.; Nagaoka, H. Methods of Information Geometry, Translations of Mathematical Monographs; Oxford University Press: Oxford, UK, 2000.
  5. Barndorff-Nielsen, O.E.; Cox, D.R.; Reid, N. The role of differential geometry in statistical theory. Internat. Statist. Rev 1986, 54, 83–96.
  6. Dawid, A.P. A Discussion to Efron’s paper. Ann. Statist 1975, 3, 1231–1234.
  7. Efron, B. Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Statist 1975, 3, 1189–1242.
  8. Efron, B. The geometry of exponential families. Ann. Statist 1978, 6, 362–376.
  9. Murray, M.K.; Rice, R.W. Differential Geometry and Statistics; Chapman & Hall: London, UK, 1995.
  10. Rao, C.R. Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta. Math. Soc 1945, 37, 81–91.
  11. Chentsov, N.N. Statistical Decision Rules and Optimal Inference; Translated in English, Translations of Mathematical Monographs; American Mathematical Society: Providence, RI, USA, 1982.
  12. Corcuera, J.M.; Giummole, F. A characterization of monotone and regular divergences. Ann. Inst. Statist. Math 1998, 50, 433–450.
  13. Zhang, J. Divergence function, duality and convex analysis. Neur. Comput 2004, 16, 159–195.
  14. Burbea, J. Informative geometry of probability spaces. Expo Math 1986, 4, 347–378.
  15. Wagenaar, D.A. Information Geometry for Neural Networks, Available online: http://www.danielwagenaar.net/res/papers/98-Wage2.pdf (accessed on 13 December 2013).
  16. Amari, S.; Ohara, A.; Matsuzoe, H. Geometry of deformed exponential families: Invariant, dually flat and conformal geometries. Physica A 2012, 391, 4308–4319.