*Entropy* **2014**, *16*(5), 2472–2487; doi:10.3390/e16052472

## Abstract

In this paper, we introduce a geometry called F−geometry on a statistical manifold S using an embedding F of S into the space ℝ_{X} of random variables. Amari's α−geometry is a special case of F−geometry. Then, using the embedding F and a positive smooth function G, we introduce the (F, G)−metric and (F, G)−connections, which enable one to consider weighted Fisher information metrics and weighted connections. The necessary and sufficient condition for two (F, G)−connections to be dual with respect to the (F, G)−metric is obtained. We then show that Amari's 0−connection is the only self dual F−connection with respect to the Fisher information metric. Invariance properties of the geometric structures are discussed, and it is proved that Amari's α−connections are the only F−connections that are invariant under smooth one-to-one transformations of the random variables.

## 1. Introduction

Geometric study of statistical estimation has opened up an interesting new area called Information Geometry. Information geometry has made remarkable progress through the works of Amari [1,2] and his colleagues [3,4]. In the last few years, many authors have contributed considerably to this area [5–9]. Information geometry has a wide variety of applications in other areas of engineering and science, such as neural networks, machine learning, biology, mathematical finance, control system theory, quantum systems and statistical mechanics.

A statistical manifold of probability distributions is equipped with a Riemannian metric and a pair of dual affine connections [2,4,9]. It was Rao [10] who introduced the idea of using Fisher information as a Riemannian metric in the manifold of probability distributions. Chentsov [11] introduced a family of affine connections on a statistical manifold defined on finite sets. Amari [2] introduced a family of affine connections called α−connections using a one parameter family of functions, the α−embeddings. These α−connections are equivalent to those defined by Chentsov. The Fisher information metric and these affine connections are characterized by invariance with respect to the sufficient statistic [4,12] and play a vital role in the theory of statistical estimation. Zhang [13] generalized Amari’s α−representation and using this general representation together with a convex function he defined a family of divergence functions from the point of view of representational and referential duality. The Riemannian metric and dual connections are defined using these divergence functions.

In this paper, Amari’s idea of using α−embeddings to define geometric structures is extended to a general embedding. This paper is organized as follows. In Section 2, we define an affine connection called F −connection and a Riemannian metric called F −metric using a general embedding F of a statistical manifold S into the space of random variables. We show that F −metric is the Fisher information metric and Amari’s α−geometry is a special case of F −geometry. Further, we introduce (F, G)−metric and (F, G)−connections using the embedding F and a positive smooth function G.

In Section 3, a necessary and sufficient condition for two (F, G)−connections to be dual with respect to the (F, G)−metric is derived and we prove that Amari’s 0−connection is the only self dual F −connection with respect to the Fisher information metric. Then we prove that the set of all positive finite measures on X, for a finite X, has an F −affine manifold structure for any embedding F. In Section 4, invariance properties of the geometric structures are discussed. We prove that the Fisher information metric and Amari’s α−connections are invariant under both the transformation of the parameter and the transformation of the random variable. Further we show that Amari’s α−connections are the only F −connections that are invariant under both the transformation of the parameter and the transformation of the random variable.

Let (X, B) be a measurable space, where X is a non-empty subset of ℝ and B is the σ-field of subsets of X. Let ℝ_{X} be the space of all real valued measurable functions defined on (X, B). Consider an n−dimensional statistical manifold S = {p(x; ξ) | ξ = [ξ^{1}, …, ξ^{n}] ∈ $\mathbb{E}$ ⊆ ℝ^{n}}, with coordinates ξ = [ξ^{1}, …, ξ^{n}], defined on X. S is a subset of P(X), the set of all probability measures on X, given by

$$P(X) = \left\{ p : X \to \mathbb{R} \,\middle|\, p(x) > 0\ (\forall x \in X);\ \int_X p(x)\,dx = 1 \right\}$$
The tangent space to S at a point p_{ξ} is given by

$$T_{\xi}(S) = \left\{ \sum_{i=1}^{n} A^{i}\,\partial_{i} \,\middle|\, A^{i} \in \mathbb{R} \right\}, \qquad \partial_{i} = \frac{\partial}{\partial \xi^{i}}$$
Define ℓ(x; ξ) = log p(x; ξ) and consider the partial derivatives $\{\frac{\partial \ell}{\partial {\xi}^{i}}={\partial}_{i}\ell ; i=1,\dots ,n\}$, which are called scores. For the statistical manifold S, the ∂_{i}ℓ's are linearly independent functions in x for a fixed ξ. Let ${T}_{\xi}^{1}(S)$ be the n-dimensional vector space spanned by the n functions {∂_{i}ℓ; i = 1, …, n} in x. So

$${T}_{\xi}^{1}(S) = \left\{ \sum_{i=1}^{n} A^{i}\,\partial_{i}\ell(x;\xi) \,\middle|\, A^{i} \in \mathbb{R} \right\}$$
Then there is a natural isomorphism between these two vector spaces T_{ξ}(S) and ${T}_{\xi}^{1}(S)$ given by

$$\partial_{i} \in T_{\xi}(S) \;\longleftrightarrow\; \partial_{i}\ell(x;\xi) \in {T}_{\xi}^{1}(S)$$
Obviously, a tangent vector
$A={\displaystyle {\sum}_{i=1}^{n}{A}^{i}{\partial}_{i}\in {T}_{\xi}(S)}$ corresponds to a random variable
$A(x)={\displaystyle {\sum}_{i=1}^{n}{A}^{i}{\partial}_{i}\ell (x;\xi )\in {T}_{\xi}^{1}(S)}$ having the same components A^{i}. Note that T_{ξ}(S) is the differentiation operator representation of the tangent space, while
${T}_{\xi}^{1}(S)$ is the random variable representation of the same tangent space. The space
${T}_{\xi}^{1}(S)$ is called the 1-representation of the tangent space. Let A and B be two tangent vectors in T_{ξ}(S), and let A(x) and B(x) be the 1-representations of A and B, respectively. We can define an inner product on each tangent space T_{ξ}(S) by

$$<A, B{>}_{\xi} = E_{\xi}[A(x)\,B(x)] = \int A(x)\,B(x)\,p(x;\xi)\,dx$$
In particular, the inner product of the basis vectors ∂_{i} and ∂_{j} is

$${g}_{ij}(\xi ) = <{\partial}_{i},{\partial}_{j}{>}_{\xi} = E_{\xi}[\partial_{i}\ell\,\partial_{j}\ell] = \int \partial_{i}\ell(x;\xi)\,\partial_{j}\ell(x;\xi)\,p(x;\xi)\,dx$$
Note that g = <, > defines a Riemannian metric on S called the **Fisher information metric**. On the Riemannian manifold (S, g = <, >), define n^{3} functions Γ_{ijk} by

$${\mathrm{\Gamma}}_{ijk}(\xi ) = E_{\xi}\left[\partial_{i}\partial_{j}\ell\;\partial_{k}\ell\right]$$
These functions Γ_{ijk} uniquely determine an affine connection ∇ on S by

$$<{\nabla}_{{\partial}_{i}}{\partial}_{j},\,{\partial}_{k}{>}_{\xi} = {\mathrm{\Gamma}}_{ijk}(\xi )$$
∇ is called the 1−connection or the exponential connection.
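As a concrete numerical illustration (ours, not part of the paper), the Fisher information and the exponential-connection component can be computed directly from the defining expectations for the Bernoulli family p(x; ξ) = ξ^x(1−ξ)^{1−x}; the function name below is hypothetical.

```python
# Fisher metric g(xi) = E[(dl)^2] and exponential-connection component
# Gamma_111(xi) = E[(d2l)(dl)] for the Bernoulli family, obtained by
# summing over the two-point sample space X = {0, 1}.
def bernoulli_g_and_gamma(xi):
    g = gamma = 0.0
    for x in (0, 1):
        p = xi if x == 1 else 1.0 - xi            # p(x; xi)
        dl = x / xi - (1 - x) / (1 - xi)          # score d log p / d xi
        d2l = -x / xi**2 - (1 - x) / (1 - xi)**2  # d^2 log p / d xi^2
        g += p * dl * dl
        gamma += p * d2l * dl
    return g, gamma

g, gamma = bernoulli_g_and_gamma(0.3)
# closed forms: g = 1/(xi(1-xi)),  Gamma_111 = -1/xi^2 + 1/(1-xi)^2
assert abs(g - 1 / 0.21) < 1e-9
assert abs(gamma - (-1 / 0.09 + 1 / 0.49)) < 1e-9
```

The sum over X plays the role of the integrals in the definitions above; for continuous families the same check can be done by numerical integration.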

Amari [2] defined a one parameter family of functions called the α−embeddings, given by

$$L_{\alpha}(p) = \begin{cases} \dfrac{2}{1-\alpha}\,p^{\frac{1-\alpha}{2}} & \alpha \neq 1 \\ \mathrm{log}\,p & \alpha = 1 \end{cases}$$
Using these, we can define n^{3} functions ${\mathrm{\Gamma}}_{ijk}^{\alpha}$ by

$${\mathrm{\Gamma}}_{ijk}^{\alpha}(\xi ) = E_{\xi}\left[\left(\partial_{i}\partial_{j}\ell + \frac{1-\alpha}{2}\,\partial_{i}\ell\,\partial_{j}\ell\right)\partial_{k}\ell\right]$$
These ${\mathrm{\Gamma}}_{ijk}^{\alpha}$ uniquely determine affine connections ∇^{α} on the statistical manifold S by

$$<{\nabla}_{{\partial}_{i}}^{\alpha}{\partial}_{j},\,{\partial}_{k}{>}_{\xi} = {\mathrm{\Gamma}}_{ijk}^{\alpha}(\xi )$$
which are called α−**connections**.
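A minimal sketch (ours, on the same hypothetical Bernoulli model) evaluates the standard α-connection expectation E[(∂²ℓ + (1−α)/2 (∂ℓ)²)∂ℓ] and checks the two classical endpoints: α = 1 gives the exponential connection, and α = −1 gives the mixture connection, which vanishes here because p is linear in ξ.

```python
# alpha-connection component Gamma^a_111 for the Bernoulli family
# p(x; xi) = xi^x (1 - xi)^(1 - x), summed over X = {0, 1}.
def gamma_alpha(xi, a):
    total = 0.0
    for x in (0, 1):
        p = xi if x == 1 else 1.0 - xi
        dl = x / xi - (1 - x) / (1 - xi)
        d2l = -x / xi**2 - (1 - x) / (1 - xi)**2
        total += p * (d2l + 0.5 * (1 - a) * dl * dl) * dl
    return total

xi = 0.3
# a = 1: exponential connection E[d2l dl]
assert abs(gamma_alpha(xi, 1.0) - (-1 / xi**2 + 1 / (1 - xi)**2)) < 1e-9
# a = -1: mixture connection; xi is mixture-affine for Bernoulli, so it vanishes
assert abs(gamma_alpha(xi, -1.0)) < 1e-9
```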

## 2. F−Geometry of a Statistical Manifold

On a statistical manifold S, the Fisher information metric and exponential connection are defined using the log embedding. In a similar way, α−connections are defined using a one parameter family of functions, the α−embeddings. In general, we can give other geometric structures on S using different embeddings of the manifold S into the space of random variables ℝ_{X}.

Let F: (0, ∞) → ℝ be an injective function that is at least twice differentiable, with F′(u) ≠ 0 for all u ∈ (0, ∞). Then F is an embedding of S into ℝ_{X} that takes each p(x; ξ) to F(p(x; ξ)). Denote F(p(x; ξ)) by F(x; ξ); then ∂_{i}F can be written as

$$\partial_{i}F(x;\xi) = F^{\prime}(p(x;\xi))\,\partial_{i}p(x;\xi) = p(x;\xi)\,F^{\prime}(p(x;\xi))\,\partial_{i}\ell(x;\xi)$$
It is clear that ∂_{i}F(x; ξ); i = 1, …, n are linearly independent functions in x for fixed ξ, since ∂_{i}ℓ(x; ξ); i = 1, …, n are linearly independent. Let ${T}_{F({p}_{\xi})}$F(S) be the n-dimensional vector space spanned by the n functions ∂_{i}F; i = 1, …, n in x for fixed ξ. So

$${T}_{F({p}_{\xi})}F(S) = \left\{ \sum_{i=1}^{n} A^{i}\,\partial_{i}F \,\middle|\, A^{i} \in \mathbb{R} \right\}$$
Let the tangent space ${T}_{F({p}_{\xi})}(F(S))$ to F(S) at the point F(p_{ξ}) be denoted by ${T}_{\xi}^{F}(S)$. There is a natural isomorphism between the two vector spaces T_{ξ}(S) and ${T}_{\xi}^{F}(S)$ given by

$$\partial_{i} \in T_{\xi}(S) \;\longleftrightarrow\; \partial_{i}F(x;\xi) \in {T}_{\xi}^{F}(S)$$
${T}_{\xi}^{F}(S)$ is called the F−representation of the tangent space T_{ξ}(S).

For any
$A={\displaystyle {\sum}_{i=1}^{n}{A}^{i}{\partial}_{i}\in {T}_{\xi}(S)}$, the corresponding
$A(x)={\displaystyle {\sum}_{i=1}^{n}{A}^{i}{\partial}_{i}F\in {T}_{\xi}^{F}(S)}$ is called the F–representation of the tangent vector A and is denoted by A^{F}(x). Note that
${T}_{\xi}^{F}(S)\subseteq {T}_{F({p}_{\xi})}({\mathbb{R}}_{X})$. Since ℝ_{X} is a vector space, its tangent space
${T}_{F({p}_{\xi})}({\mathbb{R}}_{X})$ can be identified with ℝ_{X}. So
${T}_{\xi}^{F}(S)\subseteq {\mathbb{R}}_{X}$.

#### Definition 2.1

F−expectation of a random variable f with respect to the distribution p(x; ξ) is defined as

$$E_{\xi}^{F}(f) = \int f(x)\,\frac{1}{p(x;\xi)\,{\left(F^{\prime}(p(x;\xi))\right)}^{2}}\,dx$$

We can use this F−expectation to define an inner product in ℝ_{X} by

$$<f,\,g{>}_{\xi}^{F} = E_{\xi}^{F}(f\,g)$$

which induces an inner product on T_{ξ}(S) by

$$<{\partial}_{i},{\partial}_{j}{>}_{\xi}^{F} = E_{\xi}^{F}\left(\partial_{i}F\;\partial_{j}F\right)$$
#### Proposition 2.2

The induced metric <, >^{F} on S is the Fisher information metric g =<, > on S.

#### Proof

For any basis vectors ∂_{i}, ∂_{j} ∈ T_{ξ}(S),

$$<{\partial}_{i},{\partial}_{j}{>}_{\xi}^{F} = E_{\xi}^{F}\left(\partial_{i}F\;\partial_{j}F\right) = \int \frac{{\left(p\,F^{\prime}(p)\right)}^{2}\,\partial_{i}\ell\,\partial_{j}\ell}{p\,{\left(F^{\prime}(p)\right)}^{2}}\,dx = \int \partial_{i}\ell\,\partial_{j}\ell\;p(x;\xi)\,dx = {g}_{ij}(\xi )$$

So the metric <, >^{F} on S induced by the embedding F of S into ℝ_{X} is the Fisher information metric g = <, > on S. □
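A quick numerical sketch of this proposition (ours; it assumes the F-expectation weights by 1/(p F′(p)²), which is what makes the computation above close): on a Bernoulli model, the induced metric is independent of the embedding and equals the Fisher information.

```python
# Induced metric <d_iF, d_jF>^F on a Bernoulli model, under the assumed
# F-expectation weight 1/(p F'(p)^2). The weight cancels the embedding,
# leaving sum_x p (dl)^2, i.e. the Fisher information.
def induced_metric(xi, Fprime):
    total = 0.0
    for x in (0, 1):
        p = xi if x == 1 else 1.0 - xi
        dl = x / xi - (1 - x) / (1 - xi)
        dF = p * Fprime(p) * dl                   # d_i F = p F'(p) d_i l
        total += dF * dF / (p * Fprime(p) ** 2)   # F-expectation weight
    return total

xi = 0.3
fisher = 1 / (xi * (1 - xi))
assert abs(induced_metric(xi, lambda p: p ** -0.5) - fisher) < 1e-9  # F = 2 sqrt(p)
assert abs(induced_metric(xi, lambda p: 1 / p) - fisher) < 1e-9      # F = log p
```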

We can induce a connection on S using the embedding F.

Let ${\pi}_{|{p}_{\xi}}^{F}:{\mathbb{R}}_{X}\to {T}_{\xi}^{F}(S)$ be the projection map.

#### Definition 2.3

The connection induced by the embedding F on S, the F−connection, is defined as

$${\nabla}_{{\partial}_{i}}^{F}{\partial}_{j} = \sum_{m,n} E_{\xi}^{F}\left(\partial_{i}\partial_{j}F\;\partial_{m}F\right)\,g^{mn}(\xi)\;{\partial}_{n}$$

where [g^{mn}(ξ)] is the inverse of the Fisher information matrix G(ξ) = [g_{mn}(ξ)]. Note that the F−connections are symmetric.

#### Lemma 2.4

The F−connection and its components can be written in terms of scores as

$${\nabla}_{{\partial}_{i}}^{F}{\partial}_{j} = \sum_{m,n} E_{\xi}\left[\left(\partial_{i}\partial_{j}\ell + \left(1 + \frac{p\,F^{\prime\prime}(p)}{F^{\prime}(p)}\right)\partial_{i}\ell\,\partial_{j}\ell\right)\partial_{m}\ell\right] g^{mn}(\xi)\;{\partial}_{n}$$

and

$${\mathrm{\Gamma}}_{ijk}^{F}(\xi ) = E_{\xi}\left[\left(\partial_{i}\partial_{j}\ell + \left(1 + \frac{p\,F^{\prime\prime}(p)}{F^{\prime}(p)}\right)\partial_{i}\ell\,\partial_{j}\ell\right)\partial_{k}\ell\right]$$
#### Proof

From Equation (12), we have

Therefore

Hence we can write

Then we have the Christoffel symbols of the F−connection

and components of the F−connection are given by

#### Theorem 2.5

Amari’s α−geometry is a special case of the F−geometry.

#### Proof

Let F(p) = L_{α}(p), where L_{α}(p) is Amari's α−embedding.

The components ${\mathrm{\Gamma}}_{ijk}^{\alpha}$ of the α−connection are given by

$${\mathrm{\Gamma}}_{ijk}^{\alpha}(\xi ) = E_{\xi}\left[\left(\partial_{i}\partial_{j}\ell + \frac{1-\alpha}{2}\,\partial_{i}\ell\,\partial_{j}\ell\right)\partial_{k}\ell\right]$$
From Equation (26), when F(p) = L_{α}(p) we have

$$F^{\prime}(p) = {p}^{-\frac{1+\alpha}{2}}, \qquad 1 + \frac{p\,F^{\prime\prime}(p)}{F^{\prime}(p)} = 1 - \frac{1+\alpha}{2} = \frac{1-\alpha}{2}$$

Then we get

$${\mathrm{\Gamma}}_{ijk}^{F}(\xi ) = E_{\xi}\left[\left(\partial_{i}\partial_{j}\ell + \frac{1-\alpha}{2}\,\partial_{i}\ell\,\partial_{j}\ell\right)\partial_{k}\ell\right] = {\mathrm{\Gamma}}_{ijk}^{\alpha}(\xi )$$

which are the components of the α−connection. Hence the F−connection reduces to the α−connection, and thus α−geometry is a special case of F−geometry. □
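A small numerical sketch of this reduction (ours, with a hypothetical helper): writing the F-connection coefficient as c(p) = 1 + p F″(p)/F′(p), a form consistent with the score expression used above, the choice F′(p) = p^{−(1+α)/2} (the derivative of L_α) gives the constant α-coefficient (1−α)/2.

```python
# Evaluate c(p) = 1 + p F''(p)/F'(p) with F'' from a central difference,
# and check that for F' = p^(-(1+alpha)/2) it equals (1 - alpha)/2.
def c_of_p(p, Fprime, h=1e-6):
    Fpp = (Fprime(p + h) - Fprime(p - h)) / (2 * h)  # numerical F''(p)
    return 1 + p * Fpp / Fprime(p)

alpha = 0.5
Fprime = lambda p: p ** (-(1 + alpha) / 2)           # derivative of L_alpha
for p in (0.1, 0.3, 0.7):
    assert abs(c_of_p(p, Fprime) - (1 - alpha) / 2) < 1e-5
```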

#### Remark 2.6

Burbea [14] introduced the concept of a weighted Fisher information metric using a positive continuous function. We use this idea to define a weighted F−metric and weighted F−connections. Let G: (0, ∞) → ℝ be a positive smooth function and let F be an embedding; define the (F, G)−expectation of a random variable f with respect to the distribution p_{ξ} as

$$E_{\xi}^{F,G}(f) = \int f(x)\,\frac{G(p(x;\xi))}{p(x;\xi)\,{\left(F^{\prime}(p(x;\xi))\right)}^{2}}\,dx$$

Define the (F, G)−metric $<,{>}_{\xi}^{F,G}$ in ${T}_{{p}_{\xi}}(S)$ by

$$<{\partial}_{i},{\partial}_{j}{>}_{\xi}^{F,G} = E_{\xi}^{F,G}\left(\partial_{i}F\;\partial_{j}F\right) = \int G(p(x;\xi))\,\partial_{i}\ell\,\partial_{j}\ell\;p(x;\xi)\,dx$$

Define the (F, G)−connection by its components

$${\mathrm{\Gamma}}_{ijk}^{F,G}(\xi ) = E_{\xi}^{F,G}\left(\partial_{i}\partial_{j}F\;\partial_{k}F\right)$$
When G(p) = 1, (F, G)−connection reduces to the F−connection and the metric <, >^{F,G} reduces to the Fisher information metric. This is a more general way of defining Riemannian metrics and affine connections on a statistical manifold.
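A short illustration of the weighted construction (ours; it assumes the (F, G)-expectation weights by G(p)/(p F′(p)²), so the induced metric is the G-weighted Fisher form):

```python
# (F, G)-metric on a Bernoulli model under the assumed weight
# G(p) / (p F'(p)^2); the embedding cancels, leaving sum_x G(p) p (dl)^2.
def fg_metric(xi, Fprime, G):
    total = 0.0
    for x in (0, 1):
        p = xi if x == 1 else 1.0 - xi
        dl = x / xi - (1 - x) / (1 - xi)
        dF = p * Fprime(p) * dl
        total += dF * dF * G(p) / (p * Fprime(p) ** 2)
    return total

xi = 0.3
fisher = 1 / (xi * (1 - xi))
# G = 1 recovers the Fisher information, for any embedding F
assert abs(fg_metric(xi, lambda p: p ** -0.5, lambda p: 1.0) - fisher) < 1e-9
assert abs(fg_metric(xi, lambda p: 1 / p, lambda p: 1.0) - fisher) < 1e-9
```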

## 3. Dual Affine Connections

#### Definition 3.1

Let M be a Riemannian manifold with a Riemannian metric g. Two affine connections ∇ and ∇^{∗} on the tangent bundle are said to be **dual connections** with respect to the metric g if

$$Z\,g(X,Y) = g({\nabla}_{Z}X,\,Y) + g(X,\,{\nabla}_{Z}^{\ast}Y)$$

holds for any vector fields X, Y, Z on M.
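As a sanity check of this duality (ours, not from the paper), on a one-parameter Bernoulli model the condition reduces to dg/dξ = Γ^{α}_{111} + Γ^{−α}_{111}, which the α- and (−α)-connections satisfy:

```python
# Duality of the alpha- and (-alpha)-connections with respect to the Fisher
# metric on the Bernoulli model: in one dimension the defining condition
# reads dg/dxi = Gamma^a_111 + Gamma^(-a)_111.
def gamma_alpha(xi, a):
    total = 0.0
    for x in (0, 1):
        p = xi if x == 1 else 1.0 - xi
        dl = x / xi - (1 - x) / (1 - xi)
        d2l = -x / xi**2 - (1 - x) / (1 - xi)**2
        total += p * (d2l + 0.5 * (1 - a) * dl * dl) * dl
    return total

xi = 0.3
dg = (2 * xi - 1) / (xi**2 * (1 - xi) ** 2)   # derivative of g = 1/(xi(1-xi))
for a in (0.0, 0.5, 1.0):
    assert abs(gamma_alpha(xi, a) + gamma_alpha(xi, -a) - dg) < 1e-9
```

The a = 0 case also illustrates Corollary 3.3 below: the 0-connection pairs with itself.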

#### Theorem 3.2

Let F and H be two embeddings of the statistical manifold S into the space ℝ_{X} of random variables, and let G be a positive smooth function on (0, ∞). Then the (F, G)−connection ∇^{F,G} and the (H, G)−connection ∇^{H,G} are dual connections with respect to the (F, G)−metric iff the functions F and H satisfy

$$p\,F^{\prime}(p)\,H^{\prime}(p) = c\,G(p)$$

for some constant c. We call such an embedding H a G−dual embedding of F.

The components of the dual connection ∇^{H,G} can be written as

$${\mathrm{\Gamma}}_{ijk}^{H,G}(\xi ) = {\partial}_{i}\,{g}_{jk}^{F,G}(\xi ) - {\mathrm{\Gamma}}_{ikj}^{F,G}(\xi )$$
#### Proof

∇^{F,G} and ∇^{H,G} being dual connections with respect to the (F, G)−metric means

$${\partial}_{i}\,{g}_{jk}^{F,G}(\xi ) = {\mathrm{\Gamma}}_{ijk}^{F,G}(\xi ) + {\mathrm{\Gamma}}_{ikj}^{H,G}(\xi )$$

for any basis vectors ∂_{i}, ∂_{j}, ∂_{k} ∈ T_{ξ}(S).

Then the condition (38) holds iff

Hence ∇^{F,G} and ∇^{H,G} are dual connections with respect to the (F, G)−metric iff Equation (36) holds. From Equation (43), we can rewrite the components of dual connection ∇^{H,G} as

#### Corollary 3.3

Amari’s 0−connection is the only self dual F−connection with respect to the Fisher information metric.

#### Proof

From Theorem 3.2, for G(p) = 1 the F−connection ∇^{F} and the H−connection ∇^{H} are dual connections with respect to the Fisher information metric iff the functions F and H satisfy

$$p\,F^{\prime}(p)\,H^{\prime}(p) = c$$

for some constant c. Thus the F−connection ∇^{F} is self dual iff the embedding F satisfies the condition

$$p\,{\left(F^{\prime}(p)\right)}^{2} = c, \quad \text{that is,} \quad F^{\prime}(p) = \sqrt{c}\;{p}^{-\frac{1}{2}}$$

so that F(p) is an affine transformation of Amari's 0−embedding L_{0}(p) = 2√p, which determines the same connection. That is, Amari's 0−connection is the only self dual F−connection with respect to the Fisher information metric. □

So far, we have considered the statistical manifold S as a subset of P(X), the set of all probability measures on X. Now we relax the condition ∫ p(x) dx = 1 and consider S as a subset of $\tilde{P}(X)$, which is defined by

$$\tilde{P}(X) = \left\{ p : X \to \mathbb{R} \,\middle|\, p(x) > 0\ (\forall x \in X) \right\}$$
#### Definition 3.4

Let M be a Riemannian manifold with a Riemannian metric g. Let ∇ be an affine connection on M. If there exists a coordinate system [θ^{i}] of M such that ∇_{∂i}∂_{j} = 0 then we say that ∇ is flat, or alternatively M is flat with respect to ∇, and we call such a coordinate system [θ^{i}] an affine coordinate system for ∇.

#### Definition 3.5

Let S = {p(x; ξ) | ξ = [ξ^{1}, …, ξ^{n}] ∈ $\mathbb{E}$ ⊆ ℝ^{n}} be an n−dimensional statistical manifold. If for some coordinate system [θ^{i}]; i = 1, …, n

$$\partial_{i}\partial_{j}F(x;\theta ) = 0; \quad i, j = 1, \dots, n$$

then we can see from Equation (19) that [θ^{i}] is an F−affine coordinate system and that S = {p_{θ}} is F−flat. We call such an S an F−affine manifold.

The condition (49) is equivalent to the existence of functions C, F_{1}, …, F_{n} on X such that

$$F(p(x;\theta )) = C(x) + \sum_{i=1}^{n} {\theta}^{i}\,{F}_{i}(x)$$
#### Theorem 3.6

For any embedding F, $\tilde{P}(X)$ is an F−affine manifold for finite X.

#### Proof

Let X = {x_{1}, …, x_{n}} be a finite set of n elements. Let F_{i}: X → ℝ be the functions defined by F_{i}(x_{j}) = δ_{ij} for i, j = 1, …, n. Let us define n coordinates [θ^{i}] by

$${\theta}^{i} = F(p({x}_{i})); \quad i = 1, \dots, n$$
Then we get $F(p(x))={\displaystyle {\sum}_{i=1}^{n}{\theta}^{i}{F}_{i}(x)}$. Therefore $\tilde{P}(X)$ is an F–affine manifold for any embedding F(p).
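A tiny illustration of this construction (ours, with an arbitrary three-point space and the 0-embedding as one admissible choice of F):

```python
import numpy as np

# Theorem 3.6 on X = {x1, x2, x3}: with F_i(x_j) = delta_ij, the coordinates
# theta^i = F(p(x_i)) make F(p(.)) linear in theta:
# F(p(x_j)) = sum_i theta^i F_i(x_j).
F = lambda p: 2.0 * np.sqrt(p)        # F = 2 sqrt(p), one choice of embedding
p = np.array([0.2, 0.3, 0.5])         # a positive measure on the 3-point space
theta = F(p)                          # F-affine coordinates
Fi = np.eye(3)                        # F_i(x_j) = delta_ij
assert np.allclose(Fi.T @ theta, F(p))
```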

#### Remark 3.7

Zhang [13] introduced the ρ-representation, which is a generalization of Amari's α-representation. Zhang's geometry is defined using this ρ-representation together with a convex function. Zhang also defined the ρ-affine family of density functions and discussed its dually flat structure. The F−geometry defined using a general F-representation is different from Zhang's geometry: the metric defined in the F-embedding approach is the Fisher information metric, whereas the Riemannian metric defined using the ρ-representation differs from it. The F-connections are not in general dually flat and differ from the dual connections defined by Zhang.

#### Remark 3.8

On a statistical manifold S, we introduced a dualistic structure (g, ∇^{F}, ∇^{H}), where g is the Fisher information metric and ∇^{F}, ∇^{H} are the dual connections with respect to the Fisher information metric. Since F-connections are symmetric, the manifold S is flat with respect to ∇^{F} iff S is flat with respect to ∇^{H}. Thus if S is flat with respect to ∇^{F}, then (S, g, ∇^{F}, ∇^{H}) is a dually flat space. The dually flat spaces are important in statistical estimation [4].

## 4. Invariance of the Geometric Structures

For the statistical manifold S = {p(x; ξ) | ξ ∈
$\mathbb{E}$ ⊆ ℝ^{n}}, the parameters are merely labels attached to each point p ∈ S, hence the intrinsic geometric properties should be independent of these labels. Consequently, it is natural to consider the invariance properties of the geometric structures under suitable transformations of the variables in a statistical manifold. Here we can consider two kinds of invariance of the geometric structures; covariance under re-parametrization of the parameter of the manifold and invariance under the transformations of the random variable [15]. Now let us investigate the invariance properties of the F-geometric structures defined in Section 2.

#### 4.1. Covariance under Re-Parametrization

Let [θ^{i}] and [η_{j}] be two coordinate systems on S, which are related by an invertible transformation η = η(θ). Let us denote
${\partial}_{i}=\frac{\partial}{\partial {\theta}^{i}}$ and
${\partial}^{j}=\frac{\partial}{\partial {\eta}_{j}}$. Let the coordinate expressions of the metric g be given by g_{ij} =<∂_{i}, ∂_{j}> and
${\tilde{g}}_{ij}=<{\partial}^{i},{\partial}^{j}>$. Let the components of the connection ∇ with respect to the coordinates [θ^{i}] and [η_{j}] be given by Γ_{ijk},
${\tilde{\mathrm{\Gamma}}}_{ijk}$ respectively.

Then the covariance of the metric g and the connection ∇ under the re-parametrization means

$${\tilde{g}}_{ij}(\eta ) = \sum_{k,l} \frac{\partial {\theta}^{k}}{\partial {\eta}_{i}}\,\frac{\partial {\theta}^{l}}{\partial {\eta}_{j}}\;{g}_{kl}(\theta )$$

$${\tilde{\mathrm{\Gamma}}}_{ijk}(\eta ) = \sum_{l,m,r} \frac{\partial {\theta}^{l}}{\partial {\eta}_{i}}\,\frac{\partial {\theta}^{m}}{\partial {\eta}_{j}}\,\frac{\partial {\theta}^{r}}{\partial {\eta}_{k}}\;{\mathrm{\Gamma}}_{lmr}(\theta ) + \sum_{l,r} \frac{{\partial}^{2}{\theta}^{l}}{\partial {\eta}_{i}\,\partial {\eta}_{j}}\,\frac{\partial {\theta}^{r}}{\partial {\eta}_{k}}\;{g}_{lr}(\theta )$$
#### Lemma 4.1

The Fisher information metric g is covariant under re-parametrization.

#### Proof

The components of the Fisher information metric with respect to the coordinate system [θ^{i}] are given by

$${g}_{ij}(\theta ) = E_{\theta}\left[{\partial}_{i}\ell\;{\partial}_{j}\ell\right]$$

Let $\tilde{p}(x;\eta )=p(x;\theta (\eta ))$. Then the components of the Fisher information metric with respect to the coordinate system [η_{j}] are given by

$${\tilde{g}}_{ij}(\eta ) = E_{\eta}\left[{\partial}^{i}\tilde{\ell}\;{\partial}^{j}\tilde{\ell}\right]$$

Since

$${\partial}^{i}\tilde{\ell}(x;\eta ) = \sum_{k} \frac{\partial {\theta}^{k}}{\partial {\eta}_{i}}\;{\partial}_{k}\ell (x;\theta )$$

we can write

$${\tilde{g}}_{ij}(\eta ) = \sum_{k,l} \frac{\partial {\theta}^{k}}{\partial {\eta}_{i}}\,\frac{\partial {\theta}^{l}}{\partial {\eta}_{j}}\;{g}_{kl}(\theta )$$
#### Lemma 4.2

The F−connection ∇^{F} is covariant under re-parametrization.

#### Proof

Let the components of ∇^{F} with respect to the coordinates [θ^{i}] and [η_{j}] be given by Γ_{ijk},
${\tilde{\mathrm{\Gamma}}}_{ijk}$ respectively.

Let $\tilde{p}(x;\eta )=p(x;\theta (\eta ))$. Let us denote log p(x; θ) by ℓ(x; θ) and $\mathrm{log}\phantom{\rule{0.2em}{0ex}}\tilde{p}(x;\eta )$ by $\tilde{\ell}(x;\eta )$.

The components of the F−connection ∇^{F} with respect to the coordinate system [θ^{i}] are given by

The components of ∇^{F} with respect to the coordinate system [η_{j}] are given by

We can write

Then

Hence we get

Thus F−connections are covariant under re-parametrization of the parameter. Covariance under re-parametrization means precisely that the metric and connections are coordinate independent; hence the F−geometry is coordinate independent.
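A concrete numerical check of this covariance (ours, not from the paper), on the Bernoulli model with the re-parametrization η = log(ξ/(1−ξ)):

```python
# Covariance of the Fisher metric under re-parametrization: passing from xi
# to the natural parameter eta = logit(xi), the metric transforms with the
# square of d xi / d eta. In the natural parameter the Fisher information of
# the Bernoulli model is its variance xi(1 - xi).
xi = 0.3
g_xi = 1.0 / (xi * (1 - xi))          # Fisher information in the xi-coordinate
dxi_deta = xi * (1 - xi)              # derivative of the inverse logit map
g_eta = dxi_deta ** 2 * g_xi          # transformation law for the metric
assert abs(g_eta - xi * (1 - xi)) < 1e-12
```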

#### 4.2. Invariance Under the Transformation of the Random Variable

Amari and Nagaoka [4] defined the invariance of a Riemannian metric and connections on a statistical manifold under a transformation of the random variable as follows.

#### Definition 4.3

Let S = {p(x; ξ) | ξ ∈
$\mathbb{E}$ ⊆ ℝ^{n}} be a statistical manifold defined on a sample space X. Let x, y be random variables defined on sample spaces X, Y respectively and φ be a transformation of x to y. Assume that this transformation induces a model S′ = {q(y; ξ) | ξ ∈
$\mathbb{E}$ ⊆ ℝ^{n}} on Y.

Let λ: S → S′ be a diffeomorphism defined as

$$\lambda (p(x;\xi )) = q(y;\xi )$$

Let g = <, > and g′ = <, >′ be Riemannian metrics on S and S′, respectively, and let ∇, ∇′ be affine connections on S and S′, respectively. Then the invariance properties are given by

$$<{\lambda}_{\ast}X,\,{\lambda}_{\ast}Y{>}^{\prime} = <X,\,Y> \qquad \text{and} \qquad {\nabla}_{{\lambda}_{\ast}X}^{\prime}\,{\lambda}_{\ast}Y = {\lambda}_{\ast}({\nabla}_{X}Y)$$

where λ_{∗} is the push forward map associated with the map λ, defined by

$$({\lambda}_{\ast}X)(f) = X(f\circ \lambda )\quad \text{for every smooth function } f \text{ on } {S}^{\prime}$$
Now we discuss the invariance properties of the F−geometry under suitable transformations of the random variable. We restrict ourselves to smooth one-to-one transformations of the random variable, which are the statistically interesting case. Amari and Nagaoka [4] mentioned a transformation, the sufficient statistic of the parameter of the statistical model, which is widely used in statistical estimation; one-to-one transformations of the random variable are trivial examples of sufficient statistics.

Consider a statistical manifold S = {p(x; ξ) | ξ ∈ $\mathbb{E}$ ⊆ ℝ^{n}} defined on a sample space X. Let ϕ be a smooth one-to-one transformation of the random variable x to y. Then the density function q(y; ξ) of the induced model S′ takes the form

$$q(y;\xi ) = p(w(y);\xi )\,{w}^{\prime}(y)$$

where w is the inverse function, x = w(y), and ${\varphi}^{\prime}(x)=\frac{1}{{w}^{\prime}(\varphi (x))}$.

Let us denote log q(y; ξ) by ℓ(q_{y}) and log p(x; ξ) by ℓ(p_{x}).

#### Lemma 4.4

The Fisher information metric and Amari’s α-connections are invariant under smooth one-to-one transformations of the random variable.

#### Proof

Let ϕ be a smooth one-to-one transformation of the random variable x to y. From Equation (69),

$$\ell ({q}_{y}) = \ell ({p}_{x}) + \mathrm{log}\,{w}^{\prime}(y)$$

and since w′(y) does not depend on ξ,

$${\partial}_{i}\ell ({q}_{y}) = {\partial}_{i}\ell ({p}_{x})$$
The Fisher information metric g′ on the induced manifold S′ is given by

$${g}_{ij}^{\prime}(\xi ) = \int {\partial}_{i}\ell ({q}_{y})\,{\partial}_{j}\ell ({q}_{y})\;q(y;\xi )\,dy = \int {\partial}_{i}\ell ({p}_{x})\,{\partial}_{j}\ell ({p}_{x})\;p(x;\xi )\,dx = {g}_{ij}(\xi )$$
which is the Fisher information metric on S.

The components of Amari's α−connections on the induced manifold S′ are given by

$${{\mathrm{\Gamma}}^{\prime}}_{ijk}^{\alpha}(\xi ) = {E}_{\xi}\left[\left({\partial}_{i}{\partial}_{j}\ell ({q}_{y}) + \frac{1-\alpha}{2}\,{\partial}_{i}\ell ({q}_{y})\,{\partial}_{j}\ell ({q}_{y})\right){\partial}_{k}\ell ({q}_{y})\right] = {\mathrm{\Gamma}}_{ijk}^{\alpha}(\xi )$$
which are the components of Amari’s α−connections on the manifold S. Thus we obtained that the Fisher information metric and Amari’s α-connections are invariant under smooth one-to-one transformations of the random variable.
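The metric invariance can be checked numerically (our sketch, with an arbitrarily chosen model and transformation): take the normal location family p(x; ξ) = N(ξ, 1), whose Fisher information is 1, and the one-to-one map y = sinh(x); the induced model must carry the same Fisher information.

```python
import numpy as np

# Fisher information of the model induced by y = sinh(x) from N(xi, 1):
# q(y; xi) = p(asinh(y); xi) / sqrt(1 + y^2), and d log q / d xi = asinh(y) - xi.
# The integral over y is evaluated with the trapezoidal rule on a
# non-uniform grid obtained by mapping a uniform x-grid.
xi = 0.7
x = np.linspace(xi - 10.0, xi + 10.0, 200_001)
y = np.sinh(x)                                    # grid in the y sample space
w = np.arcsinh(y)                                 # inverse transformation w(y)
q = np.exp(-(w - xi) ** 2 / 2) / np.sqrt(2 * np.pi) / np.sqrt(1 + y**2)
score = w - xi                                    # d log q / d xi
f = q * score**2
g_induced = np.sum((f[1:] + f[:-1]) / 2 * np.diff(y))   # trapezoidal rule
assert abs(g_induced - 1.0) < 1e-6                # Fisher information preserved
```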

Now we prove that α-connections are the only F−connections that are invariant under smooth one-to-one transformations of the random variable.

#### Theorem 4.5

Amari’s α-connections are the only F−connections that are invariant under smooth one-to-one transformations of the random variable.

#### Proof

Let ϕ be a smooth one-to-one transformation of the random variable x to y. The components of the F −connection of the induced manifold S′ are

and the components of the F−connection of the manifold S are

Then by equating the components ${{\mathrm{\Gamma}}^{\prime}}_{ijk}^{F}({q}_{\xi}),\phantom{\rule{0.2em}{0ex}}{\mathrm{\Gamma}}_{ijk}^{F}({p}_{\xi})$ of the F–connection, we get

Then it follows that the condition for the F–connection to be invariant under the transformation ϕ is

$$\frac{u\,F^{\prime\prime}(u)}{F^{\prime}(u)} = k, \quad \forall u \in (0,\infty )$$

where k is a real constant.

Hence it follows from Euler's homogeneous function theorem that the function F′ is a positive homogeneous function in p of degree k. So

$$F^{\prime}(\lambda p) = {\lambda}^{k}\,F^{\prime}(p), \quad \forall \lambda > 0$$

Since F′ is a positive homogeneous function in the single variable p, without loss of generality we can take

$$F^{\prime}(p) = {p}^{k}$$

Therefore

$$F(p) = \begin{cases} \dfrac{{p}^{k+1}}{k+1} & k \neq -1 \\ \mathrm{log}\,p & k = -1 \end{cases}$$

Let

$$k = -\frac{1+\alpha}{2}, \quad \text{that is,} \quad \alpha = -(2k+1)$$

we get

$$F(p) = \begin{cases} \dfrac{2}{1-\alpha}\,{p}^{\frac{1-\alpha}{2}} & \alpha \neq 1 \\ \mathrm{log}\,p & \alpha = 1 \end{cases}$$

which is nothing but Amari's α–embedding L_{α}(p). Hence Amari's α–connections are the only F–connections that are invariant under smooth one-to-one transformations of the random variable. □
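The correspondence between the homogeneity degree k and α can be checked directly (our sketch): the antiderivative of p^k (integration constant 0) reproduces L_α(p) with α = −(2k+1).

```python
# F'(p) = p^k integrates to p^(k+1)/(k+1), which equals
# L_alpha(p) = 2/(1-alpha) * p^((1-alpha)/2) when k = -(1+alpha)/2.
k = -0.75                              # corresponds to alpha = 0.5
alpha = -(2 * k + 1)
for p in (0.1, 0.42, 0.9):
    F = p ** (k + 1) / (k + 1)         # antiderivative of p^k
    L = 2 / (1 - alpha) * p ** ((1 - alpha) / 2)
    assert abs(F - L) < 1e-12
```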

#### Remark 4.6

In Section 2, we defined (F, G)-connections using a general embedding function F and a positive smooth function G. We can show that (F, G)-connection is invariant under smooth one-to-one transformation of the random variable when G(p) = c, where c is a real constant and F (p) = L_{α}(p) (proof is similar to that of Theorem 4.5). The notion of (F, G)−metric and (F, G)−connection provides a more general way of introducing geometric structures on a manifold. We were able to show that the Fisher information metric (up to a constant) and Amari’s α−connections are the only metric and connections belonging to this class that are invariant under both the transformation of the parameter and the one-to-one transformation of the random variable.

## 5. Conclusions

The Fisher information metric and Amari's α−connections are widely used in information geometry and play an important role in the theory of statistical estimation. Amari's α−connections are defined using a one parameter family of functions, the α−embeddings. We generalized this idea to introduce geometric structures on a statistical manifold S: we considered a general embedding function F of S into ℝ_{X} and obtained a geometric structure on S called the F−geometry. Amari's α−geometry is a special case of F−geometry. A more general way of defining Riemannian metrics and affine connections on a statistical manifold S is given using a positive smooth function G together with the embedding F.

Amari's α−geometry is the only F−geometry that is invariant under both the transformation of the parameter and the transformation of the random variable, or equivalently under the sufficient statistic. One can relax the condition of invariance under the sufficient statistic and consider other statistically significant transformations instead, which then yields F−geometries other than the α−geometry that are invariant under those transformations. We believe that the idea of F−geometry can be used in the further development of the geometric theory of q-exponential families. We look forward to studying these problems in detail.

## Acknowledgments

We are extremely thankful to Shun-ichi Amari for reading this article and encouraging our learning process. We would like to thank the reviewer who mentioned the references [13,16] that are of great importance in our future work.

## Author Contributions

The authors contributed equally to the presented mathematical framework and the writing of the paper.

## Conflicts of Interest

The authors declare no conflicts of interest.

## References

- Amari, S. Differential geometry of curved exponential families-curvature and information loss. Ann. Statist. **1982**, 10, 357–385.
- Amari, S. Differential-Geometrical Methods in Statistics; Lecture Notes in Statistics, Volume 28; Springer-Verlag: New York, NY, USA, 1985.
- Amari, S.; Kumon, M. Differential geometry of Edgeworth expansions in curved exponential family. Ann. Inst. Statist. Math. **1983**, 35, 1–24.
- Amari, S.; Nagaoka, H. Methods of Information Geometry; Translations of Mathematical Monographs; Oxford University Press: Oxford, UK, 2000.
- Barndorff-Nielsen, O.E.; Cox, D.R.; Reid, N. The role of differential geometry in statistical theory. Internat. Statist. Rev. **1986**, 54, 83–96.
- Dawid, A.P. A discussion to Efron's paper. Ann. Statist. **1975**, 3, 1231–1234.
- Efron, B. Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Statist. **1975**, 3, 1189–1242.
- Efron, B. The geometry of exponential families. Ann. Statist. **1978**, 6, 362–376.
- Murray, M.K.; Rice, R.W. Differential Geometry and Statistics; Chapman & Hall: London, UK, 1995.
- Rao, C.R. Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. **1945**, 37, 81–91.
- Chentsov, N.N. Statistical Decision Rules and Optimal Inference; Translated to English; Translations of the Mathematical Monographs; American Mathematical Society: Providence, RI, USA, 1982.
- Corcuera, J.M.; Giummole, F. A characterization of monotone and regular divergences. Ann. Inst. Statist. Math. **1998**, 50, 433–450.
- Zhang, J. Divergence function, duality and convex analysis. Neur. Comput. **2004**, 16, 159–195.
- Burbea, J. Informative geometry of probability spaces. Expo. Math. **1986**, 4, 347–378.
- Wagenaar, D.A. Information Geometry for Neural Networks. Available online: http://www.danielwagenaar.net/res/papers/98-Wage2.pdf (accessed on 13 December 2013).
- Amari, S.; Ohara, A.; Matsuzoe, H. Geometry of deformed exponential families: Invariant, dually flat and conformal geometries. Physica A **2012**, 391, 4308–4319.

© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).