Entropy 2019, 21(5), 496; https://doi.org/10.3390/e21050496

Article
A Deformed Exponential Statistical Manifold
1
Departamento de Matemática, Universidade Regional do Cariri, Juazeiro do Norte-CE 63041-145, Brazil
2
Departamento de Ciências Naturais, Matemática e Estatística, Universidade Federal Rural do Semi-Árido, Mossoró-RN 59625-900, Brazil
3
Curso de Engenharia de Computação, Campus Sobral, Universidade Federal do Ceará, Sobral-CE 62042-280, Brazil
4
Departamento de Engenharia de Teleinformática, Universidade Federal do Ceará, Fortaleza-CE 60020-181, Brazil
*
Author to whom correspondence should be addressed.
Received: 19 April 2019 / Accepted: 13 May 2019 / Published: 15 May 2019

Abstract
Consider $\mu$ a probability measure and $\mathcal{P}_\mu$ the set of $\mu$-equivalent strictly positive probability densities. To endow $\mathcal{P}_\mu$ with the structure of a $C^\infty$-Banach manifold we use the $\varphi$-connection by an open arc, where $\varphi$ is a deformed exponential function that is zero up to a certain point and strictly increasing from then on. Particular cases of this deformed exponential function are the $q$-deformed exponential and the $\kappa$-exponential functions. Moreover, we find the tangent space of $\mathcal{P}_\mu$ at a point $p$ and, as a consequence, the tangent bundle of $\mathcal{P}_\mu$. We define a divergence using the $q$-exponential function and prove that this divergence is related to the $q$-divergence already known from the literature. We also show that the $q$-exponential and $\kappa$-exponential functions can be used to generalize the Rényi divergence.
Keywords:
deformed exponential manifold; statistical manifold; φ-family; information geometry; exponential arcs

1. Introduction

Let $\mathcal{P}_\mu$ be the set of $\mu$-equivalent strictly positive probability densities, where $\mu$ is a given probability measure. To build a structure on $\mathcal{P}_\mu$, Amari considered the parametric case, where the construction depends on a parameter belonging to a Euclidean space [1,2]. The case of non-parametric statistical models was initially studied by Pistone and Sempi [3]. In this case, $\mathcal{P}_\mu$ was equipped with the structure of a $C^\infty$-Banach manifold using the Orlicz space associated with an Orlicz function. In a later work [4], Pistone and Cena proved that a probability distribution $z$ belongs to the maximal exponential model at the probability distribution $p$ if and only if $z$ is connected to $p$ by an open exponential arc. Moreover, the new manifold structure obtained from the connection by an open exponential arc is equivalent to the one defined in [3,5]. Conditions for connecting two probability densities by an open exponential arc were recently studied in [6].
The deformed exponential function was first introduced by Naudts in [7] and studied in more detail later in [8,9]. In [10], the authors propose a generalization of the exponential family $\mathcal{E}_p$, based on replacing the exponential function $\exp$ by a deformed exponential function $\varphi$. This leads to a $\varphi$-family of probability distributions, denoted by $\mathcal{F}_c^\varphi$, with $p = \varphi(c)$. The family was modeled on Musielak–Orlicz spaces, and a Banach manifold structure on $\mathcal{P}_\mu$ was obtained. As a consequence of this model, a more general form of the Kullback–Leibler divergence was obtained and called the $\varphi$-divergence. Furthermore, the arcs for the deformed exponential function were investigated, and necessary and sufficient conditions for connecting any two probability distributions by a $\varphi$-arc were provided [11]. This result was later generalized in [12,13]. A generalization of exponential arcs was defined in [14], where it was also proved that a probability distribution $z$ belongs to the $\varphi$-family $\mathcal{F}_c^\varphi$ if and only if $z$ is connected to $p$ by an open $\varphi$-arc.
An example of a deformed exponential function is the $q$-exponential, which was used by Loaiza and Quiceno [15] to define an atlas modeled on spaces of essentially bounded functions. The charts of this atlas are defined in terms of connections by a one-dimensional $q$-exponential model and of the $q$-deformations of cumulant maps [4]. Moreover, the tangent space and the tangent bundle were constructed using equivalence classes.
In this paper we endow $\mathcal{P}_\mu$ with the structure of a $C^\infty$-Banach manifold using a deformed exponential function. This deformed exponential function is zero up to a certain point and from then on behaves like the "classical" exponential function, in that it is strictly increasing. Particular cases of this function are the $q$-deformed exponential and the $\kappa$-exponential. To build this structure, as in [15], we divide $\mathcal{P}_\mu$ into equivalence classes using the connection provided by generalized exponential arcs as defined in [14]. We also define a set $A_c^\varphi$, which is a connected component of $\mathcal{P}_\mu$ and will be the generalized $\varphi$-family of probability distributions. Moreover, by means of the derivative of the transition map, we find the tangent space and, consequently, the tangent bundle. In addition, we define a divergence using the $q$-exponential function, which is related to the $q$-divergence defined in [15]. Finally, we show that the $\kappa$-exponential and $q$-exponential functions can be used to generalize Rényi's divergence.
The rest of the paper is organized as follows. In Section 2 we revisit some important results about the $q$-exponential statistical manifold and give a brief introduction to Musielak–Orlicz spaces. Section 3 contains our main results: we discuss generalized open exponential arcs and build generalized $\varphi$-families of probability distributions. Afterwards, in Section 4, we find the derivative of the transition map and, as a consequence, the tangent space and the tangent bundle. In Section 5 we define a divergence using the $q$-exponential function and use those results to prove that the $q$-exponential and $\kappa$-exponential functions can be used to generalize Rényi's divergence. Finally, Section 6 states our conclusions and future perspectives.

2. Background and Preliminary Results

The deformed exponential function that we use to equip $\mathcal{P}_\mu$ with the structure of a $C^\infty$-Banach manifold has the $q$-exponential function as a particular case, and the parametrization domain is obtained from a Musielak–Orlicz space. For this reason, this section briefly presents the results involving the $q$-exponential manifold and Musielak–Orlicz spaces.

2.1. A q-Exponential Statistical Banach Manifold

As in [15], we consider a probability space $(T, \Sigma, \mu)$ and $q \in (0,1)$. The $q$-deformed exponential function is given by [16]
$$e_q^{x} = (1 + (1-q)x)^{1/(1-q)}, \quad \text{where } -\frac{1}{1-q} \le x.$$
Definition 1.
We say that $p, z \in \mathcal{P}_\mu$ are connected by a one-dimensional $q$-exponential model if there exist $r \in \mathcal{P}_\mu$, $u \in L^\infty(r \cdot \mu)$, a real function of a real variable $\psi$ and $\delta > 0$, such that for all $t \in (-\delta, \delta)$ the function $f$ defined by
$$f(t) = e_q^{\,tu \ominus_q \psi(t)}\, r, \tag{2}$$
satisfies $f(t_0) = p$ and $f(t_1) = z$ for some $t_0, t_1 \in (-\delta, \delta)$, where
$$tu \ominus_q \psi(t) := \frac{tu - \psi(t)}{1 + (1-q)\psi(t)}, \quad \text{for } \psi(t) \ne (q-1)^{-1}.$$
Consider the following partition of $\mathcal{P}_\mu$ into equivalence classes: $p, z \in \mathcal{P}_\mu$ are related ($p \sim_q z$) if and only if there exists a one-dimensional $q$-exponential model connecting $p$ and $z$, according to Equation (2). As a consequence, the measures $p \cdot \mu$ and $z \cdot \mu$ are equivalent and the spaces of essentially bounded functions $L^\infty(p \cdot \mu)$ and $L^\infty(z \cdot \mu)$ are equal.
We need to define a family of $q$-deformations of the moment-generating functional, denoted by $M_p^q$, that is,
$$M_p^q : D_{M_p^q} \to [0, \infty],$$
$$M_p^q(u) = \int_T e_q^{(u)}\, d\mu,$$
where
$$D_{M_p^q} = \left\{ u \in L^\infty(p \cdot \mu) ;\ -\frac{1}{1-q} < u,\ \int_T e_q^{(u)}\, d\mu < \infty \right\}.$$
We also define a family of cumulant-generating functionals
$$K_p^q : B_{p,\infty}(0,1) \to [0, \infty],$$
where
$$K_p^q(u) = \ln_q \left[ M_p^q(u) \right].$$
Notice that $B_{p,\infty}(0,1) \subset D_{M_p^q}$, where $B_{p,\infty}(0,1)$ is the open unit ball in $L^\infty(p \cdot \mu)$. Some properties of the functional $K_p^q$ are described in the theorem below.
Theorem 1
([15], Theorem 9). The cumulant-generating functional $K_p^q$ satisfies:
(1)
The function $z = e_q^{\,u \ominus_q K_p^q(u)}\, p$ is a probability density in $\mathcal{P}_\mu$ for every $u \in B_{p,\infty}(0,1)$;
(2)
$K_p^q$ is infinitely Fréchet differentiable and its $n$-th derivative evaluated at the directions $(v_1, \ldots, v_n) \in B_{p,\infty}(0,1) \times \cdots \times B_{p,\infty}(0,1)$ is of the form
$$D^n K_p^q(u) \cdot (v_1 \cdots v_n) = [M_p^q(u)]^{1-q}\, Q_n(q) \int_T (v_1 \cdots v_n)\, d\mu;$$
(3)
The functional $K_p^q$ is analytic in $B_{p,\infty}(0,1)$.
The function $K_p^q$ is used to define the $q$-exponential models
$$e_{q,p} : V_p \to \mathcal{P}_\mu,$$
where
$$e_{q,p}(u) = e_q^{(u \ominus_q K_p^q(u))}\, p.$$
Moreover, the set
$$B_p = \left\{ u \in L^\infty(p \cdot \mu) ;\ \int_T u\, p\, d\mu = 0 \right\}$$
is a Banach space and
$$V_p = \{ u \in B_p ;\ \|u\|_{p,\infty} < 1 \}$$
is the open unit ball of $B_p$. Since $\|u\|_{p,\infty} < 1$, we obtain $-\frac{1}{1-q} < u$. Therefore
$$-\frac{1}{1-q} < \frac{u - K_p^q(u)}{1 + (1-q) K_p^q(u)} = u \ominus_q K_p^q(u),$$
and consequently $e_{q,p}(u) = e_q^{(u \ominus_q K_p^q(u))}\, p$ is well defined.
The inverse of $e_{q,p}$ is given by [15]
$$e_{q,p}^{-1}(z) = \frac{\ln_q\!\left(\frac{z}{p}\right) - \int_T \ln_q\!\left(\frac{z}{p}\right) p\, d\mu}{1 + (1-q) \int_T \ln_q\!\left(\frac{z}{p}\right) p\, d\mu}.$$
The transition map $e_{q,p_2}^{-1} \circ e_{q,p_1} : e_{q,p_1}^{-1}(U_{p_1} \cap U_{p_2}) \to e_{q,p_2}^{-1}(U_{p_1} \cap U_{p_2})$, where $U_p$ is the range of $e_{q,p}$, is expressed as [15]
$$e_{q,p_2}^{-1}(e_{q,p_1}(u)) = \frac{u + [1 + (1-q)u] \ln_q\!\left(\frac{p_1}{p_2}\right) - \int_T \left( u + [1 + (1-q)u] \ln_q\!\left(\frac{p_1}{p_2}\right) \right) p_2\, d\mu}{1 + (1-q) \int_T \left( u + [1 + (1-q)u] \ln_q\!\left(\frac{p_1}{p_2}\right) \right) p_2\, d\mu},$$
where $p_1, p_2 \in \mathcal{P}_\mu$ with $U_{p_1} \cap U_{p_2} \ne \emptyset$ and $u \in e_{q,p_1}^{-1}(U_{p_1} \cap U_{p_2})$.
The map $e_{q,p}$ is injective and the set $e_{q,p_1}^{-1}(U_{p_1} \cap U_{p_2})$ is open in the $B_{p_1}$-topology, where $p_1, p_2 \in \mathcal{P}_\mu$. Hence, the transition map $e_{q,p_2}^{-1} \circ e_{q,p_1}$ is a topological homeomorphism and consequently the collection of pairs $\{(U_p, e_{q,p}^{-1})\}_{p \in \mathcal{P}_\mu}$ is a $C^\infty$-atlas modeled on $B_p$. Then, $\mathcal{P}_\mu$ is a $C^\infty$-Banach manifold, since $e_{q,p}$ is a parametrization.
There is a relation between the constructed manifold and the Tsallis relative entropy. Indeed, consider, for $t \ne 0$ and $0 < q < 1$, the function
$$f(t) = -t \ln_q\!\left(\frac{1}{t}\right),$$
where $\ln_q(x) = \frac{x^{1-q} - 1}{1-q}$ for $x > 0$. Given $p$ and $z$ in $\mathcal{P}_\mu$, the Tsallis divergence, also called the $q$-divergence of $z$ with respect to $p$, is given by
$$I^{(q)}(z \| p) = \int_T p\, f\!\left(\frac{z}{p}\right) d\mu.$$
Proposition 1
([15], Proposition 16). For $p$, $z$ in $\mathcal{P}_\mu$, we have
(1)
$I^{(q)}(z \| p) \ge 0$, with equality iff $p = z$.
(2)
$I^{(q)}(z \| p) \le \int_T (z - p)\, f'\!\left(\frac{z}{p}\right) d\mu$.
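As a quick numerical sanity check of Proposition 1, the sketch below evaluates the $q$-divergence on a four-point toy space. The finite space, the uniform weights `mu`, the sample densities and all helper names are assumptions made for illustration only; they do not come from [15].

```python
import numpy as np

def ln_q(x, q):
    """q-logarithm ln_q(x) = (x**(1-q) - 1) / (1-q), for x > 0 and 0 < q < 1."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def tsallis_divergence(z, p, mu, q):
    """I^(q)(z||p) = sum_i mu_i * p_i * f(z_i/p_i), with f(t) = -t * ln_q(1/t)."""
    t = z / p
    f = -t * ln_q(1.0 / t, q)
    return float(np.sum(mu * p * f))

def upper_bound(z, p, mu, q):
    """Right-hand side of item (2): sum_i mu_i * (z_i - p_i) * f'(z_i/p_i)."""
    t = z / p
    # f(t) = (t - t**q) / (1-q), hence f'(t) = (1 - q * t**(q-1)) / (1-q)
    f_prime = (1.0 - q * t ** (q - 1.0)) / (1.0 - q)
    return float(np.sum(mu * (z - p) * f_prime))

# Hypothetical toy data: uniform weights, two densities that integrate to 1.
mu = np.full(4, 0.25)
p = np.array([0.4, 0.8, 1.2, 1.6])
z = np.array([1.6, 1.2, 0.8, 0.4])
q = 0.5

div_zp = tsallis_divergence(z, p, mu, q)
div_pp = tsallis_divergence(p, p, mu, q)
bound_zp = upper_bound(z, p, mu, q)
```

On this example `div_zp` is positive, `div_pp` vanishes, and `div_zp` stays below `bound_zp`, in line with items (1) and (2).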

2.2. Musielak–Orlicz Spaces and φ-Families of Probability Distributions

Consider a $\sigma$-finite, non-atomic measure space $(T, \Sigma, \mu)$. Let $\mathcal{P}_\mu = \{ p \in L^0 ;\ p > 0 \text{ and } \int_T p\, d\mu = 1 \}$, where $L^0$ is the linear space of all real-valued, measurable functions on $T$, with equality $\mu$-a.e. The map $\Phi : T \times [0, \infty) \to [0, \infty]$ is a Musielak–Orlicz function if, for $\mu$-a.e. (almost every) $t \in T$, the following conditions hold [17]:
(1)
$\Phi(t, \cdot)$ is convex and lower semi-continuous;
(2)
$\Phi(t, 0) = \lim_{u \downarrow 0} \Phi(t, u) = 0$ and $\Phi(t, \infty) = \infty$;
(3)
$\Phi(\cdot, u)$ is measurable for each $u \ge 0$.
Since items (1) and (2) hold, $\Phi(t, \cdot)$ is not identically equal to $0$ or $\infty$ on the interval $(0, \infty)$.
Consider the functional $I_\Phi(u) = \int_T \Phi(t, |u(t)|)\, d\mu$, for any $u \in L^0$. The Musielak–Orlicz space, the Musielak–Orlicz class and the Morse–Transue space associated with a Musielak–Orlicz function $\Phi$ are defined, respectively, by
$$L^\Phi = \{ u \in L^0 ;\ \text{there exists } \varepsilon > 0 \text{ such that } I_\Phi(\lambda u) < \infty \text{ for each } \lambda \in (-\varepsilon, \varepsilon) \},$$
$$\tilde{L}^\Phi = \{ u \in L^0 ;\ I_\Phi(u) < \infty \}$$
and
$$E^\Phi = \{ u \in L^0 ;\ I_\Phi(\lambda u) < \infty \text{ for all } \lambda > 0 \}.$$
Consider the Luxemburg norm
$$\|u\|_\Phi = \inf \left\{ \lambda > 0 ;\ I_\Phi\!\left(\frac{u}{\lambda}\right) \le 1 \right\},$$
and the Orlicz norm
$$\|u\|_{\Phi,0} = \sup \left\{ \left| \int_T u v\, d\mu \right| ;\ v \in \tilde{L}^{\Phi^*} \text{ and } I_{\Phi^*}(v) \le 1 \right\},$$
where $\Phi^*(t, v) = \sup_{u \ge 0} (uv - \Phi(t, u))$ is the Fenchel conjugate of $\Phi(t, \cdot)$. The Musielak–Orlicz space $L^\Phi$ equipped with either of these two norms is a Banach space. The norms are equivalent, and the inequalities $\|u\|_\Phi \le \|u\|_{\Phi,0} \le 2\|u\|_\Phi$ hold for all $u \in L^\Phi$. For more details see [18,19].
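Define the Musielak–Orlicz function
$$\Phi_c(t, u) = \varphi(t, c(t) + u) - \varphi(t, c(t)),$$
where $c : T \to \mathbb{R}$ is a measurable function such that $\varphi(t, c(t))$ is $\mu$-integrable. We write $L_c^\varphi$, $\tilde{L}_c^\varphi$ and $E_c^\varphi$ in place of $L^{\Phi_c}$, $\tilde{L}^{\Phi_c}$ and $E^{\Phi_c}$, respectively. In [10] the parametrization
$$\varphi_c : \mathcal{B}_c^\varphi \to \mathcal{F}_c^\varphi$$
was defined, where
$$\varphi_c(u) = \varphi(c + u - \psi(u) u_0),$$
for each $u \in \mathcal{B}_c^\varphi = B_c^\varphi \cap \mathcal{K}_c^\varphi$, and
$$B_c^\varphi = \left\{ u \in L_c^\varphi ;\ \int_T u\, \varphi'_+(c)\, d\mu = 0 \right\},$$
$$\mathcal{K}_c^\varphi = \left\{ u \in L_c^\varphi ;\ \text{there exists } \varepsilon > 0 \text{ such that } \int_T \varphi(c + \lambda u)\, d\mu < \infty \text{ for each } \lambda \in (-\varepsilon, 1 + \varepsilon) \right\}.$$
The map $\psi : \mathcal{B}_c^\varphi \to [0, \infty)$ is called the normalizing function and is defined in such a way that $\varphi_c(u) = \varphi(c + u - \psi(u) u_0)$ is in $\mathcal{P}_\mu$. We have that $\bigcup \{ \mathcal{F}_c^\varphi ;\ \varphi(c) \in \mathcal{P}_\mu \} = \mathcal{P}_\mu$, and that $\varphi_{c_1}^{-1}(\mathcal{F}_{c_1}^\varphi \cap \mathcal{F}_{c_2}^\varphi)$ and $\varphi_{c_2}^{-1}(\mathcal{F}_{c_1}^\varphi \cap \mathcal{F}_{c_2}^\varphi)$ are open for any measurable $c_1, c_2 : T \to \mathbb{R}$ such that $\varphi(c_1)$ and $\varphi(c_2)$ are in $\mathcal{P}_\mu$. The transition map is a $C^\infty$-isomorphism and consequently $\varphi_c$ is a parametrization.
In the next section, we use generalized open exponential arcs to build a parametrization of $\mathcal{P}_\mu$.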

3. Construction of Generalized φ-Families of Probability Distributions

Let $(T, \Sigma, \mu)$ be a $\sigma$-finite, non-atomic measure space and consider a deformed exponential function $\varphi : T \times \mathbb{R} \to [0, \infty)$. In other words, $\varphi(t, \cdot)$ is convex for $\mu$-a.e. $t \in T$ and the limits $\lim_{u \to -\infty} \varphi(t, u) = 0$ and $\lim_{u \to \infty} \varphi(t, u) = \infty$ hold for $\mu$-a.e. $t \in T$. In this work we impose two additional conditions on the deformed exponential $\varphi$:
(a1)
$\varphi(t, x) = 0$ for all $x < a_\varphi$, where $a_\varphi = \inf \{ x \in \mathbb{R} ;\ \varphi(x) > 0 \}$;
(a2)
given a measurable function $c : T \to \mathbb{R}$ such that $\int_T \varphi(t, c(t))\, d\mu = 1$, we have
$$\int_T \varphi(t, c(t) + \lambda)\, d\mu < \infty, \quad \text{for all } \lambda > 0.$$
For a measurable function $q : T \to (0, 1)$, we define the $q$-deformed exponential function $\exp_q : T \times \mathbb{R} \to [0, \infty)$ as $\exp_q(t, u) = \exp_{q(t)}(u)$, where
$$\exp_q(u) = [1 + (1-q)u]_+^{1/(1-q)},$$
and $[1 + (1-q)u]_+ = \max\{ 1 + (1-q)u, 0 \}$. In this case, the $q$-deformed exponential function satisfies condition (a1) with $a_\varphi = -\frac{1}{1-q}$. In the next example, we prove that the $q$-deformed exponential function satisfies condition (a2) for $0 < q < 1$.
Example 1.
Given $\alpha \ge 1$, we consider two cases:
If $u \le 0$, we have that $\alpha u \le u$. Then,
$$\exp_q(\alpha u) \le \exp_q(u) \le \alpha^{\frac{1}{1-q}} \exp_q(u).$$
If $u > 0$, we obtain
$$\exp_q(\alpha u) = (1 + (1-q)\alpha u)^{\frac{1}{1-q}} = (\alpha \alpha^{-1} + (1-q)\alpha u)^{\frac{1}{1-q}} = \alpha^{\frac{1}{1-q}} (\alpha^{-1} + (1-q)u)^{\frac{1}{1-q}} \le \alpha^{\frac{1}{1-q}} (1 + (1-q)u)^{\frac{1}{1-q}} = \alpha^{\frac{1}{1-q}} \exp_q(u).$$
By the convexity of $\exp_q(t, \cdot)$, we obtain for any $\lambda \in (0, 1)$ that
$$\exp_q(c + u) \le \lambda \exp_q(\lambda^{-1} c) + (1-\lambda) \exp_q((1-\lambda)^{-1} u) \le \lambda^{1 - 1/(1-q)} \exp_q(c) + (1-\lambda)^{1 - 1/(1-q)} \exp_q(u).$$
Then, any positive function $u_0 : T \to (0, \infty)$ such that $\int_T \exp_q(u_0)\, d\mu < \infty$ satisfies $\int_T \exp_q(c + \lambda u_0)\, d\mu < \infty$ for all $\lambda > 0$.
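The scaling bound $\exp_q(\alpha u) \le \alpha^{1/(1-q)} \exp_q(u)$ for $\alpha \ge 1$, which drives Example 1, can be checked numerically. The following is a minimal sketch; the grid, the chosen values of $q$ and $\alpha$, and the function names are illustrative assumptions.

```python
import numpy as np

def exp_q(u, q):
    """q-deformed exponential [1 + (1-q)u]_+^{1/(1-q)} for 0 < q < 1."""
    base = np.maximum(1.0 + (1.0 - q) * np.asarray(u, dtype=float), 0.0)
    return base ** (1.0 / (1.0 - q))

def scaling_bound_holds(q, alpha, u):
    """Check exp_q(alpha*u) <= alpha**(1/(1-q)) * exp_q(u) pointwise on u."""
    lhs = exp_q(alpha * u, q)
    rhs = alpha ** (1.0 / (1.0 - q)) * exp_q(u, q)
    # Small relative/absolute slack guards against floating-point rounding.
    return bool(np.all(lhs <= rhs * (1.0 + 1e-12) + 1e-12))

# Illustrative grid covering both cases u <= 0 and u > 0.
u_grid = np.linspace(-10.0, 10.0, 2001)
results = {
    (q, alpha): scaling_bound_holds(q, alpha, u_grid)
    for q in (0.1, 0.5, 0.9)
    for alpha in (1.0, 1.5, 4.0)
}
```

Every entry of `results` should be `True`; note also that `exp_q` vanishes at and below $a_\varphi = -1/(1-q)$, which is condition (a1).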
Now, we provide an example of a deformed exponential function that satisfies condition (a1) but does not satisfy condition (a2).
Example 2.
Consider the function
$$\varphi(u) = \begin{cases} e^{(u+1)^2/2}, & u \ge 0, \\ e^{1/2}(u + 1), & -1 \le u \le 0, \\ 0, & u \le -1, \end{cases}$$
where the measure $\mu$ is $\sigma$-finite and non-atomic. Note that $\varphi$ is convex and satisfies $\varphi(x) = 0$ for all $x < a_\varphi$, where $a_\varphi = \inf \{ x \in \mathbb{R} ;\ \varphi(x) > 0 \}$, and $\lim_{u \to \infty} \varphi(u) = \infty$. We will find a measurable function $c : T \to \mathbb{R}$ with $\int_T \varphi(c)\, d\mu < \infty$, but $\int_T \varphi(c + \lambda)\, d\mu = \infty$ for some $\lambda > 0$. For each $m \ge 1$, we consider
$$v_m(t) := \left( m \log(2) - \frac{3}{2} \right) 1_{E_m}(t),$$
where $E_m = \left\{ t \in T ;\ m \log(2) - \frac{3}{2} > 0 \right\}$ and
$$1_{E_m}(t) = \begin{cases} 1, & t \in E_m, \\ 0, & t \notin E_m. \end{cases}$$
Since $v_m \uparrow \infty$, we can find a subsequence $\{ v_{m_n} \}$ such that
$$\int_{E_{m_n}} e^{(v_{m_n} + 2)^2/2}\, d\mu \ge 2^n.$$
According to [17], there exist a subsequence $w_k = v_{m_{n_k}}$ and pairwise disjoint sets $A_k \subset E_{m_{n_k}}$ for which
$$\int_{A_k} e^{(w_k + 2)^2/2}\, d\mu = 1.$$
Let us define $c = \bar{c}\, 1_{T \setminus A} + \sum_{k=1}^{\infty} w_k 1_{A_k}$, where $A = \bigcup_{k=1}^{\infty} A_k$ and $\bar{c}$ is any measurable function such that $\varphi(\bar{c}(t)) > 0$ for $t \in T \setminus A$ and $\int_{T \setminus A} \varphi(\bar{c})\, d\mu < \infty$. Observing that
$$e^{(w_k(t) + 2)^2/2} = 2^{m_{n_k}}\, e^{(w_k(t) + 1)^2/2}, \quad \text{for } t \in A_k,$$
we obtain
$$\int_{A_k} e^{(w_k(t) + 1)^2/2}\, d\mu = \frac{1}{2^{m_{n_k}}}, \quad \text{for every } k \ge 1.$$
Hence, we can write
$$\int_T \varphi(c)\, d\mu = \int_{T \setminus A} \varphi(\bar{c})\, d\mu + \sum_{k=1}^{\infty} \int_{A_k} e^{(w_k(t) + 1)^2/2}\, d\mu = \int_{T \setminus A} \varphi(\bar{c})\, d\mu + \sum_{k=1}^{\infty} \frac{1}{2^{m_{n_k}}} < \infty.$$
On the other hand, we also have
$$\int_T \varphi(c + 1)\, d\mu = \int_{T \setminus A} \varphi(\bar{c} + 1)\, d\mu + \sum_{k=1}^{\infty} \int_{A_k} e^{(w_k(t) + 2)^2/2}\, d\mu = \int_{T \setminus A} \varphi(\bar{c} + 1)\, d\mu + \sum_{k=1}^{\infty} 1 = \infty,$$
which shows that (a2) is not satisfied.
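The key identity of Example 2 is that shifting the argument by $1$ multiplies $\varphi$ by $2^m$ at the level $w = m \log 2 - 3/2$. The sketch below checks this in logarithmic form to avoid overflow; the function names and the range of $m$ are assumptions chosen for illustration.

```python
import math

def phi(u):
    """Deformed exponential of Example 2 (piecewise definition)."""
    if u >= 0.0:
        return math.exp((u + 1.0) ** 2 / 2.0)
    if u >= -1.0:
        return math.exp(0.5) * (u + 1.0)
    return 0.0

def level(m):
    """Value m*log(2) - 3/2 taken by v_m on E_m."""
    return m * math.log(2.0) - 1.5

def log_ratio(m):
    """log[phi(w + 1) / phi(w)] at w = level(m) >= 0, computed without overflow."""
    w = level(m)
    return (w + 2.0) ** 2 / 2.0 - (w + 1.0) ** 2 / 2.0

# level(m) >= 0 requires m >= 3, so the exponential branch of phi applies.
identity_ok = all(
    abs(log_ratio(m) - m * math.log(2.0)) < 1e-9 for m in range(3, 40)
)
ratio_at_3 = phi(level(3) + 1.0) / phi(level(3))
```

Since $(w+2)^2/2 - (w+1)^2/2 = w + 3/2 = m \log 2$, the ratio is exactly $2^m$, which is why the series for $\varphi(c)$ is summable while the one for $\varphi(c+1)$ diverges.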
Definition 2.
We say that $p$ and $z$ in $\mathcal{P}_\mu$ are $\varphi$-connected by an open arc if there exist an open interval $I \supset [0, 1]$ and a constant $\kappa(\alpha)$ such that
$$\varphi\big( (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha) \big) \in \mathcal{P}_\mu,$$
for each $\alpha \in I$, where $\kappa(\alpha)$ depends on $\alpha$, $p$ and $z$.
According to the proof in [11], we have that $\kappa(\alpha) \le 0$ for each $\alpha \in [0, 1]$. Indeed,
  • for $\alpha = 0, 1$, we clearly have $\kappa(\alpha) = 0$;
  • for $\alpha \in (0, 1)$, the convexity of $\varphi$ ensures that $0 \le \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z)) < (1-\alpha)p + \alpha z$. Integrating the inequality we obtain
    $$0 \le \int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z))\, d\mu \le 1.$$
    Since $\kappa(\alpha)$ satisfies
    $$\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha))\, d\mu = 1,$$
    and $\varphi$ is non-decreasing, it follows that $\kappa(\alpha) \le 0$ for $\alpha \in [0, 1]$.
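To make the normalizing constant $\kappa(\alpha)$ concrete, the sketch below computes it by bisection for the $q$-exponential on a four-point toy space. The finite space with uniform weights is an assumption made only for illustration (the paper works with a non-atomic measure), and the names `kappa`, `mass`, `mu`, `p`, `z` are hypothetical.

```python
import numpy as np

Q = 0.5  # illustrative deformation parameter, 0 < q < 1

def exp_q(u, q=Q):
    """q-deformed exponential [1 + (1-q)u]_+^{1/(1-q)}."""
    base = np.maximum(1.0 + (1.0 - q) * np.asarray(u, dtype=float), 0.0)
    return base ** (1.0 / (1.0 - q))

def ln_q(x, q=Q):
    """q-logarithm, the inverse of exp_q on (0, inf)."""
    return (np.asarray(x, dtype=float) ** (1.0 - q) - 1.0) / (1.0 - q)

def kappa(alpha, p, z, mu, tol=1e-12):
    """Solve sum_i mu_i * exp_q((1-a) ln_q p_i + a ln_q z_i - k) = 1 for k."""
    x = (1.0 - alpha) * ln_q(p) + alpha * ln_q(z)

    def mass(k):
        return float(np.sum(mu * exp_q(x - k)))

    lo, hi = -1.0, 1.0
    while mass(lo) < 1.0:   # mass is non-increasing in k: widen to the left
        lo *= 2.0
    while mass(hi) > 1.0:   # widen to the right
        hi *= 2.0
    while hi - lo > tol:    # plain bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical toy data: uniform weights, two densities integrating to 1.
mu = np.full(4, 0.25)
p = np.array([0.4, 0.8, 1.2, 1.6])
z = np.array([1.6, 1.2, 0.8, 0.4])
alphas = np.linspace(0.0, 1.0, 11)
kappas = np.array([kappa(a, p, z, mu) for a in alphas])
```

In this toy setting `kappas` vanishes at the endpoints and is negative in between, matching $\kappa(\alpha) \le 0$ on $[0, 1]$.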
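Using generalized exponential arcs, we now define the sets needed to construct the generalized $\varphi$-family of probability distributions. Let us define
$$\tilde{\kappa}(\alpha) = \sup \{ \lambda > 0 ;\ \exists\, \varepsilon > 0 \text{ where } (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda > a_\varphi,\ \mu\text{-a.e. } t \in T, \text{ for each } \alpha \in (-\varepsilon, 1 + \varepsilon) \}.$$
As $p$ and $z$ are $\varphi$-connected by an open arc, we have that $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha) > a_\varphi$ for each $\alpha \in I$. Hence, $\kappa(\alpha) < \tilde{\kappa}(\alpha)$, i.e., $\kappa(\alpha) \in [-\infty, \tilde{\kappa}(\alpha))$. For $p \in \mathcal{P}_\mu$, where $p = \varphi(c)$, consider the set
$$R_c^\varphi = \{ z \in \mathcal{P}_\mu ;\ \exists\, \varepsilon > 0 \text{ where } (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha) \ge a_\varphi,\ \mu\text{-a.e. } t \in T, \text{ for each } \alpha \in (-\varepsilon, 1 + \varepsilon) \}.$$
We will show that the set
$$A_c^\varphi = \left\{ z \in R_c^\varphi ;\ \exists\, \varepsilon > 0 \text{ where } \int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha))\, d\mu < 1, \text{ for each } \alpha \in (-\varepsilon, 1 + \varepsilon) \right\}$$
is a generalized $\varphi$-family of probability distributions.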
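Consider the partition of $\mathcal{P}_\mu$ into equivalence classes using the following relation: given $p, z \in \mathcal{P}_\mu$, we say that $p \sim z$ if and only if $p$ and $z$ are $\varphi$-connected by an open arc. This equivalence relation is needed to define an atlas modeled on Banach spaces.
Let $L_c^\varphi$ be the Musielak–Orlicz space, given as
$$L_c^\varphi = \left\{ u \in L^0 ;\ \exists\, \varepsilon > 0 \text{ where } \int_T \varphi(c + \lambda u)\, d\mu < \infty, \text{ for each } \lambda \in (-\varepsilon, \varepsilon) \right\},$$
and consider the set
$$N_c^\varphi = \{ u \in L_c^\varphi ;\ \exists\, \varepsilon \in (0, 1) \text{ where } c + \lambda u \ge a_\varphi, \text{ for each } \lambda \in [-\varepsilon, \varepsilon] \}.$$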
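Lemma 1.
The set $N_c^\varphi$ is a closed subspace.
Proof. 
Clearly $0 \in N_c^\varphi$. Given $u, v \in N_c^\varphi$, there exist $\varepsilon_1, \varepsilon_2 \in (0, 1)$ such that
$$c + \lambda u \ge a_\varphi,\ \mu\text{-a.e.}, \text{ for each } \lambda \in [-\varepsilon_1, \varepsilon_1]$$
and
$$c + \lambda v \ge a_\varphi,\ \mu\text{-a.e.}, \text{ for each } \lambda \in [-\varepsilon_2, \varepsilon_2].$$
Taking $\varepsilon = \min\{ \varepsilon_1, \varepsilon_2 \}$, we have that $u + v \in N_c^\varphi$. Finally, given $\alpha \in \mathbb{R}$, $\alpha \ne 0$, we obtain $\alpha u \in N_c^\varphi$, since $c + \lambda(\alpha u) \ge a_\varphi$, $\mu$-a.e., for each $\lambda \in \left[ -\frac{\varepsilon_1}{|\alpha|}, \frac{\varepsilon_1}{|\alpha|} \right]$.
It remains to show that $N_c^\varphi$ is closed. To this end, let $(u_n) \subset N_c^\varphi$ be a sequence converging to $u \in L_c^\varphi$. Then there exists a subsequence, still denoted by $(u_n)$, such that $c + \lambda u_n \to c + \lambda u$, $\mu$-a.e. $t \in T$.
Then, for each $n \in \mathbb{N}$ we can find $\varepsilon_n \in (0, 1)$ with $c + \lambda u_n \ge a_\varphi$, $\mu$-a.e. $t \in T$, for each $\lambda \in [-\varepsilon_n, \varepsilon_n]$.
The compactness of $[-\varepsilon_n, \varepsilon_n]$ ensures that the cover $\{ (\bar{\varepsilon}_n - \delta, \bar{\varepsilon}_n + \delta) ;\ n \in \mathbb{N} \}$ admits a finite subcover. Let $\{ (\bar{\varepsilon}_1 - \delta, \bar{\varepsilon}_1 + \delta), \ldots, (\bar{\varepsilon}_{n_0} - \delta, \bar{\varepsilon}_{n_0} + \delta) \}$ be the elements of this finite subcover. Taking $\bar{\varepsilon} = \min\{ \bar{\varepsilon}_1 - \delta, \bar{\varepsilon}_1 + \delta, \ldots, \bar{\varepsilon}_{n_0} - \delta, \bar{\varepsilon}_{n_0} + \delta \}$, it follows that $c + \lambda u_n \ge a_\varphi$, $\mu$-a.e. $t \in T$, for each $\lambda \in [-\bar{\varepsilon}, \bar{\varepsilon}]$.
Passing to the limit, we obtain $c + \lambda u \ge a_\varphi$, $\mu$-a.e. $t \in T$, for each $\lambda \in [-\bar{\varepsilon}, \bar{\varepsilon}]$. Therefore, $u \in N_c^\varphi$ and consequently $N_c^\varphi$ is closed.  □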
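Define the set
$$\tilde{\mathcal{K}}_c^\varphi = \left\{ u \in N_c^\varphi ;\ \exists\, \varepsilon \in (0, 1) \text{ such that } \int_T \varphi(c + \lambda u)\, d\mu < \infty, \text{ for each } \lambda \in (-\varepsilon, 1 + \varepsilon) \right\}. \tag{14}$$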
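Lemma 2.
The set $\tilde{\mathcal{K}}_c^\varphi$ is open in $N_c^\varphi$.
Proof. 
Let $u \in \tilde{\mathcal{K}}_c^\varphi$. Then, there exists $\varepsilon \in (0, 1)$ such that $\int_T \varphi(c + \alpha u)\, d\mu < \infty$ for each $\alpha \in [-\varepsilon, 1 + \varepsilon]$, and $u \in N_c^\varphi$. Taking $\delta = \left[ \frac{2}{\varepsilon}(1 + \varepsilon)\left(1 + \frac{\varepsilon}{2}\right) \right]^{-1}$, we have that any $v \in B_\delta = \{ w \in N_c^\varphi ;\ \|w\|_{\Phi_c} < \delta \}$ satisfies $I_{\Phi_c}\!\left(\frac{v}{\delta}\right) \le 1$ and consequently $\int_T \varphi\!\left(c + \frac{1}{\delta}|v|\right) d\mu \le 2$. Given $\alpha \in \left(0, 1 + \frac{\varepsilon}{2}\right)$ we denote $\lambda = \frac{\alpha}{1 + \varepsilon}$. The inequality
$$\frac{\alpha}{1 - \lambda} = \frac{\alpha}{1 - \frac{\alpha}{1 + \varepsilon}} \le \frac{1 + \frac{\varepsilon}{2}}{1 - \frac{1 + \frac{\varepsilon}{2}}{1 + \varepsilon}} = \frac{2}{\varepsilon}(1 + \varepsilon)\left(1 + \frac{\varepsilon}{2}\right) = \frac{1}{\delta}$$
implies
$$\varphi(c + \alpha(u + v)) = \varphi\!\left( \lambda\left(c + \frac{\alpha}{\lambda} u\right) + (1 - \lambda)\left(c + \frac{\alpha}{1 - \lambda} v\right) \right) \le \lambda\, \varphi\!\left(c + \frac{\alpha}{\lambda} u\right) + (1 - \lambda)\, \varphi\!\left(c + \frac{\alpha}{1 - \lambda} v\right) \le \lambda\, \varphi(c + (1 + \varepsilon) u) + (1 - \lambda)\, \varphi\!\left(c + \frac{1}{\delta}|v|\right).$$
For $\alpha \in \left(-\frac{\varepsilon}{2}, 0\right)$, we can write
$$\varphi(c + \alpha(u + v)) \le \frac{1}{2} \varphi(c + 2\alpha u) + \frac{1}{2} \varphi(c + 2\alpha v) \le \frac{1}{2} \varphi(c + 2\alpha u) + \frac{1}{2} \varphi(c + |v|).$$
Then, we have
$$\int_T \varphi(c + \alpha(u + v))\, d\mu < \infty,$$
for any $\alpha \in \left(-\frac{\varepsilon}{2}, 1 + \frac{\varepsilon}{2}\right)$. Hence, $u + v \in \mathcal{K}_c^\varphi$ and, since $N_c^\varphi$ is a subspace, we obtain $u + v \in \tilde{\mathcal{K}}_c^\varphi$. As a consequence, $B_\delta(u)$ is contained in $\tilde{\mathcal{K}}_c^\varphi$ and therefore the set $\tilde{\mathcal{K}}_c^\varphi$ is open.  □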
The set $\tilde{\mathcal{K}}_c^\varphi$ defined in (14) is needed to guarantee that $\varphi(c + \alpha u)$ can be normalized to an element of $\mathcal{P}_\mu$. We now establish a relationship between the connection by an open arc and $\tilde{\mathcal{K}}_c^\varphi$, similar to the one proved in [14].
Proposition 2.
Fix $p \in \mathcal{P}_\mu$. Then $z \in \mathcal{P}_\mu$ is $\varphi$-connected to $p$ by an open arc if and only if there exist an open interval $I \supset [0, 1]$ and a random variable $u \in L_c^\varphi$ such that $p(\alpha) \propto \varphi(c + \alpha u) \in \mathcal{P}_\mu$ for each $\alpha \in I$, with $p(0) = p$ and $p(1) = z$.
Proof. 
Since $z$ is $\varphi$-connected to $p$ by an open arc, there exists an interval $I \supset [0, 1]$ such that $\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z))\, d\mu < \infty$ for each $\alpha \in I$. Taking $u = \varphi^{-1}(z) - \varphi^{-1}(p)$, we have
$$\int_T \varphi(c + \alpha u)\, d\mu = \int_T \varphi\big( \varphi^{-1}(p) + \alpha(\varphi^{-1}(z) - \varphi^{-1}(p)) \big)\, d\mu = \int_T \varphi\big( (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) \big)\, d\mu,$$
where $u = \varphi^{-1}(z) - \varphi^{-1}(p)$ and $\varphi(c) = p$. Therefore $u \in L_c^\varphi$. Another consequence of $z$ being $\varphi$-connected to $p$ by an open arc is that $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha) > a_\varphi$. Hence, $p(\alpha) \propto \varphi(c + \alpha u) \in \mathcal{P}_\mu$ for each $\alpha \in I$, with $p(0) = p$ and $p(1) = z$.
Conversely, taking $p(1) = z$, we get $\varphi(c + u) = z$, and consequently $u = \varphi^{-1}(z) - \varphi^{-1}(p)$ with $\varphi(c) = p = p(0)$.  □
Notice that, as a consequence of Proposition 2, given $p, z \in \mathcal{P}_\mu$ that are $\varphi$-connected by an open arc, the random variable $u$ belongs to $\tilde{\mathcal{K}}_c^\varphi = \mathcal{K}_c^\varphi \cap N_c^\varphi$. This follows for two reasons: since $p, z \in \mathcal{P}_\mu$, we have $\varphi^{-1}(p), \varphi^{-1}(z) > a_\varphi$; and since $z$ is $\varphi$-connected to $p$ by an open arc, we have $\int_T \varphi(c + \alpha u)\, d\mu < \infty$ for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$.
Remark 1.
Since the $\varphi$-arc is injective, only the case $z \ne p$ is considered in Proposition 2. Therefore, there exists $z \in A_c^\varphi$ such that $z \ne p$.
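Lemma 3.
Let $z \in A_c^\varphi$ be $\varphi$-connected to $p$ by an open arc. Then the map
$$V(\lambda) = \int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda)\, d\mu$$
is well defined. Moreover, $V(\lambda)$ is strictly decreasing in $\lambda$, since $\lambda$ enters with a negative sign and $\varphi$ is strictly increasing above $a_\varphi$.
Proof. 
Proposition 2 ensures that $p(\alpha) \propto \varphi(c + \alpha u) \in \mathcal{P}_\mu$, where $u = \varphi^{-1}(z) - \varphi^{-1}(p) \in \tilde{\mathcal{K}}_c^\varphi$ and $\varphi(c + u) = z$. Then, we can find $\varepsilon > 0$ such that $\varphi(c + (1 + \varepsilon)(\varphi^{-1}(z) - \varphi^{-1}(p)))$ is $\mu$-integrable. Given $\alpha \in (-\varepsilon, 1 + \varepsilon)$, taking $\bar{\lambda} = \frac{\alpha}{1 + \varepsilon}$, we obtain
$$\varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda) = \varphi\!\left( \bar{\lambda}\left(c + \frac{\alpha}{\bar{\lambda}}(\varphi^{-1}(z) - \varphi^{-1}(p))\right) + (1 - \bar{\lambda})\left(c - \frac{1}{1 - \bar{\lambda}}\lambda\right) \right) \le \bar{\lambda}\, \varphi\!\left(c + \frac{\alpha}{\bar{\lambda}}(\varphi^{-1}(z) - \varphi^{-1}(p))\right) + (1 - \bar{\lambda})\, \varphi\!\left(c - \frac{1}{1 - \bar{\lambda}}\lambda\right),$$
and consequently, by condition (a2), $\varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda)$ is $\mu$-integrable for every $\lambda \in \mathbb{R}$ and for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$. This proves that $V(\lambda)$ is well defined. By the dominated convergence theorem, the map $\lambda \mapsto V(\lambda) = \int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda)\, d\mu$ is continuous, $\lim_{\lambda \to \infty} V(\lambda) = 0$ and $\lim_{\lambda \to -\infty} V(\lambda) = \infty$. Hence, for $\lambda \in \{ \lambda \in \mathbb{R} ;\ (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda > a_\varphi,\ \mu\text{-a.e. } t \in T, \text{ for each } \alpha \in (-\varepsilon, 1 + \varepsilon) \}$, the map $V(\lambda)$ is strictly decreasing.  □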
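Proposition 3.
Fix $p = \varphi(c) \in \mathcal{P}_\mu$ and $z \in R_c^\varphi$. Then, $z \in A_c^\varphi$ if and only if $z$ is $\varphi$-connected to $p$ by an open arc.
Proof. 
Given $z \in A_c^\varphi$ there exists $\varepsilon > 0$ such that $\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha))\, d\mu < 1$ and $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha) \ge a_\varphi$, $\mu$-a.e. $t \in T$, for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$. Then, $\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z))\, d\mu < \infty$ for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$, which ensures that $z$ is $\varphi$-connected to $p$ by an open arc.
Conversely, take $z \in R_c^\varphi$ that is $\varphi$-connected to $p$ by an open arc. Then there exists $\varepsilon > 0$ such that $\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha))\, d\mu = 1$ and $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha) > a_\varphi$ for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$. Note that
$$\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha))\, d\mu \le \int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha))\, d\mu = 1, \tag{15}$$
because $\kappa(\alpha) < \tilde{\kappa}(\alpha)$ for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$ and $\varphi$ is non-decreasing. Suppose that $z \notin A_c^\varphi$. Then there exists $\alpha \in (-\varepsilon, 1 + \varepsilon)$ such that
$$\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha))\, d\mu \ge 1. \tag{16}$$
Equations (15) and (16) ensure that $\int_T \varphi((1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \tilde{\kappa}(\alpha))\, d\mu = 1$ for this $\alpha$. By Lemma 3 there exists a unique $\lambda_0$ satisfying $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \lambda_0 > a_\varphi$, $\mu$-a.e. $t \in T$, such that $V(\lambda_0) = 1$. Since $\kappa(\alpha)$ is such that $(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha) > a_\varphi$, $\mu$-a.e. $t \in T$, for each $\alpha \in (-\varepsilon, 1 + \varepsilon)$, it follows that $\lambda_0 = \kappa(\alpha)$ and consequently $\kappa(\alpha)$ is unique. Hence, $\kappa(\alpha) = \tilde{\kappa}(\alpha)$, which contradicts $\kappa(\alpha) < \tilde{\kappa}(\alpha)$.  □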
By Proposition 3 the sets $A_c^\varphi$ are the connected components of $\mathcal{P}_\mu$. We therefore need to find a domain for the parametrization such that its image is $A_c^\varphi$.
We make some considerations similar to the ones in [10].
Note that, for $u \in \tilde{\mathcal{K}}_c^\varphi$, $\varphi(c + u)$ is not necessarily in $\mathcal{P}_\mu$. Define $\psi : \tilde{\mathcal{K}}_c^\varphi \to \mathbb{R}$ such that the density
$$\varphi_c(u) = \varphi(c + u - \psi(u))$$
is in $\mathcal{P}_\mu$. The maximal open domain of $\psi$ is contained in $\tilde{\mathcal{K}}_c^\varphi$. Note that $\psi$ is well defined, since $c + u - \psi(u) > a_\varphi$, $\mu$-a.e. $t \in T$. It can then be proved that $\psi : \tilde{\mathcal{K}}_c^\varphi \to \mathbb{R}$ is convex and, as a consequence, continuous, since $\tilde{\mathcal{K}}_c^\varphi$ is open by Lemma 2.
Let $\varphi'_+$ be the operator acting on the set of real-valued functions $u : T \to \mathbb{R}$ given by $\varphi'_+(u)(t) = \varphi'_+(t, u(t))$, where $\varphi'_+(t, \cdot)$ is the right-derivative of $\varphi(t, \cdot)$. Notice also that the function $\psi : \tilde{\mathcal{K}}_c^\varphi \to \mathbb{R}$ can assume both positive and negative values. Consider the closed subspace
$$\tilde{B}_c^\varphi = \left\{ u \in N_c^\varphi ;\ \int_T u\, \varphi'_+(c)\, d\mu = 0 \right\}.$$
Observe that the image of $\psi$ is contained in $[0, \infty)$ when the domain of $\psi$ is restricted to $\tilde{B}_c^\varphi$. By the convexity of $\varphi(t, \cdot)$, we have
$$u\, \varphi'_+(t, c(t)) \le \varphi(t, c(t) + u) - \varphi(t, c(t)) \quad \text{for all } u \in \mathbb{R}.$$
Hence, we have that
$$1 = \int_T u\, \varphi'_+(c)\, d\mu + \int_T \varphi(c)\, d\mu \le \int_T \varphi(c + u)\, d\mu < \infty \quad \text{for any } u \in \tilde{\mathcal{K}}_c^\varphi \cap \tilde{B}_c^\varphi = \tilde{\mathcal{B}}_c^\varphi.$$
Thus, $\psi(u) \ge 0$ is required for $\varphi(c + u - \psi(u))$ to be in $\mathcal{P}_\mu$.
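The sign of the normalizing function can be observed numerically. The sketch below, again on an assumed four-point toy space with the $q$-exponential as $\varphi$, centers a perturbation $u$ with respect to the weights $\varphi'_+(c)$ and solves for $\psi(u)$ by bisection. All names and sample values are hypothetical.

```python
import numpy as np

Q = 0.5  # illustrative deformation parameter, 0 < q < 1

def exp_q(u, q=Q):
    """q-deformed exponential [1 + (1-q)u]_+^{1/(1-q)}."""
    base = np.maximum(1.0 + (1.0 - q) * np.asarray(u, dtype=float), 0.0)
    return base ** (1.0 / (1.0 - q))

def dexp_q(u, q=Q):
    """Right derivative of exp_q: [1 + (1-q)u]_+^{q/(1-q)}."""
    base = np.maximum(1.0 + (1.0 - q) * np.asarray(u, dtype=float), 0.0)
    return base ** (q / (1.0 - q))

def ln_q(x, q=Q):
    """q-logarithm, the inverse of exp_q on (0, inf)."""
    return (np.asarray(x, dtype=float) ** (1.0 - q) - 1.0) / (1.0 - q)

def psi(u, c, mu, tol=1e-12):
    """Solve sum_i mu_i * exp_q(c_i + u_i - s) = 1 for s by bisection."""
    def mass(s):
        return float(np.sum(mu * exp_q(c + u - s)))

    lo, hi = -1.0, 1.0
    while mass(lo) < 1.0:   # mass is non-increasing in s
        lo *= 2.0
    while mass(hi) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical toy data: p = exp_q(c) is a density for uniform weights mu.
mu = np.full(4, 0.25)
c = ln_q(np.array([0.4, 0.8, 1.2, 1.6]))
weight = mu * dexp_q(c)

# Center an arbitrary direction so that sum_i mu_i * u_i * phi'_+(c_i) = 0.
u0 = np.array([0.3, -0.1, 0.2, -0.4])
u = u0 - np.sum(u0 * weight) / np.sum(weight)

psi_u = psi(u, c, mu)
psi_zero = psi(np.zeros(4), c, mu)
total_mass = float(np.sum(mu * exp_q(c + u - psi_u)))
```

Here `psi_u` comes out positive and `psi_zero` is numerically zero, consistent with $\psi \ge 0$ on centered directions.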
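Let $c : T \to \mathbb{R}$ be a measurable function such that $p = \varphi(c)$ is a probability density in $\mathcal{P}_\mu$. Consider the set
$$M_c^\varphi = (M_c^\varphi)_1 \cap (M_c^\varphi)_2,$$
where
$$(M_c^\varphi)_1 = \{ u \in \tilde{\mathcal{B}}_c^\varphi ;\ c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha) > a_\varphi,\ \mu\text{-a.e.}, \text{ for each } \alpha \in I \supset [0, 1] \}$$
and
$$(M_c^\varphi)_2 = \left\{ u \in \tilde{\mathcal{B}}_c^\varphi ;\ \int_T \varphi(c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha))\, d\mu < 1, \text{ for each } \alpha \in I \supset [0, 1] \right\}.$$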
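Proposition 4.
Given $u \in M_c^\varphi$, we have that $\varphi(c + u - \psi(u)) \in A_c^\varphi$.
Proof. 
Given $u \in M_c^\varphi$, we have
$$c + \alpha u - (\alpha\psi(u) + \tilde{\kappa}(\alpha)) > a_\varphi,\ \mu\text{-a.e. } t \in T,$$
and
$$\int_T \varphi(c + \alpha u - (\alpha\psi(u) + \tilde{\kappa}(\alpha)))\, d\mu < 1, \quad \text{for each } \alpha \in I \supset [0, 1].$$
Hence,
$$(1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(\varphi(c + u - \psi(u))) - \tilde{\kappa}(\alpha) = (1-\alpha)c + \alpha(c + u - \psi(u)) - \tilde{\kappa}(\alpha) = c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha) > a_\varphi,$$
for each $\alpha \in I \supset [0, 1]$, which implies $\varphi(c + u - \psi(u)) \in R_c^\varphi$. In addition,
$$\int_T \varphi\big( (1-\alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(\varphi(c + u - \psi(u))) - \tilde{\kappa}(\alpha) \big)\, d\mu = \int_T \varphi(c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha))\, d\mu < 1,$$
for each $\alpha \in I \supset [0, 1]$ and therefore $\varphi(c + u - \psi(u)) \in A_c^\varphi$.  □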
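Proposition 5.
The set $M_c^\varphi$ is open in $B_c^\varphi$.
Proof. 
Consider the sets
$$(M_c^\varphi)_1 = \{ u \in \tilde{\mathcal{B}}_c^\varphi ;\ c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha) > a_\varphi,\ \mu\text{-a.e.}, \text{ for each } \alpha \in I \supset [0, 1] \}$$
and
$$(M_c^\varphi)_2 = \left\{ u \in \tilde{\mathcal{B}}_c^\varphi ;\ \int_T \varphi(c + \alpha(u - \psi(u)) - \tilde{\kappa}(\alpha))\, d\mu < 1, \text{ for each } \alpha \in I \supset [0, 1] \right\}.$$
Define the functions
$$f(\alpha, u) = c + \alpha u - \alpha\psi(u) - \tilde{\kappa}(\alpha) \quad \text{and} \quad g(\alpha, u) = \int_T \varphi(c + \alpha u - \alpha\psi(u) - \tilde{\kappa}(\alpha))\, d\mu.$$
  • The function $f$ is well defined and continuous, since $\psi : \tilde{\mathcal{K}}_c^\varphi \to \mathbb{R}$ is continuous;
  • The map $g$ is well defined on $(M_c^\varphi)_2$ and continuous, since $\varphi$ and $\psi$ are continuous.
Moreover, given $u \in M_c^\varphi$, in particular $u \in (M_c^\varphi)_1$ and $u \in (M_c^\varphi)_2$. By the continuity of $f$ and $g$, respectively, there exist $\varepsilon_1, \varepsilon_2 \in (0, 1)$ such that $f(v_1) > a_\varphi$ for each $v_1 \in B_{\varepsilon_1}(u) \subset B_c^\varphi$, and $g(v_2) < 1$ for each $v_2 \in B_{\varepsilon_2}(u) \subset B_c^\varphi$. Taking $\varepsilon = \min\{ \varepsilon_1, \varepsilon_2 \}$, we obtain that $B_\varepsilon(u) \subset M_c^\varphi$ and consequently $M_c^\varphi$ is open in $B_c^\varphi$.  □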
Clearly $\mathcal{P}_\mu = \bigcup \{ A_c^\varphi ;\ \varphi(c) \in \mathcal{P}_\mu \}$. Consider measurable functions $c_1, c_2 : T \to \mathbb{R}$, where $p_1 = \varphi(c_1)$ and $p_2 = \varphi(c_2)$ belong to $\mathcal{P}_\mu$. The parametrizations $\varphi_{c_1} : M_{c_1}^\varphi \to A_{c_1}^\varphi$ and $\varphi_{c_2} : M_{c_2}^\varphi \to A_{c_2}^\varphi$ have a transition map given by
$$\varphi_{c_2}^{-1} \circ \varphi_{c_1} : \varphi_{c_1}^{-1}(A_{c_1}^\varphi \cap A_{c_2}^\varphi) \to \varphi_{c_2}^{-1}(A_{c_1}^\varphi \cap A_{c_2}^\varphi).$$
Let $\psi_1 : M_{c_1}^\varphi \to [0, \infty)$ and $\psi_2 : M_{c_2}^\varphi \to [0, \infty)$ be the normalizing functions associated with $c_1$ and $c_2$, respectively, and let $u \in M_{c_1}^\varphi$ and $v \in M_{c_2}^\varphi$ be such that $\varphi_{c_1}(u) = \varphi_{c_2}(v) \in A_{c_1}^\varphi \cap A_{c_2}^\varphi$. Then we have
$$v = c_1 - c_2 + u - \psi_1(u) + \psi_2(v). \tag{18}$$
Multiplying Equation (18) by $\varphi'_+(c_2)$ and integrating with respect to the measure $\mu$, since the function $v$ is in $M_{c_2}^\varphi$, we obtain
$$0 = \int_T (c_1 - c_2 + u)\, \varphi'_+(c_2)\, d\mu - \psi_1(u) \int_T \varphi'_+(c_2)\, d\mu + \psi_2(v) \int_T \varphi'_+(c_2)\, d\mu,$$
and we can write
$$\psi_2(v) = -\frac{\int_T (c_1 - c_2 + u)\, \varphi'_+(c_2)\, d\mu}{\int_T \varphi'_+(c_2)\, d\mu} + \psi_1(u) \frac{\int_T \varphi'_+(c_2)\, d\mu}{\int_T \varphi'_+(c_2)\, d\mu}.$$
Therefore
$$v = c_1 - c_2 + u - \psi_1(u) - \frac{\int_T (c_1 - c_2 + u)\, \varphi'_+(c_2)\, d\mu}{\int_T \varphi'_+(c_2)\, d\mu} + \psi_1(u) \frac{\int_T \varphi'_+(c_2)\, d\mu}{\int_T \varphi'_+(c_2)\, d\mu}.$$
Hence, the transition map $\varphi_{c_2}^{-1} \circ \varphi_{c_1}$ can be expressed as
$$\varphi_{c_2}^{-1} \circ \varphi_{c_1}(w) = c_1 - c_2 + w - \frac{\int_T (c_1 - c_2 + w)\, \varphi'_+(c_2)\, d\mu}{\int_T \varphi'_+(c_2)\, d\mu},$$
for every $w \in \varphi_{c_1}^{-1}(A_{c_1}^\varphi \cap A_{c_2}^\varphi)$. Once we show that $w$ and $c_1 - c_2$ are in $L_{c_2}^\varphi$ and that the spaces $L_{c_1}^\varphi$ and $L_{c_2}^\varphi$ have equivalent norms, it follows that this transition map is of class $C^\infty$.
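The closed form of the transition map is an affine re-centering: it shifts by $c_1 - c_2$ and then subtracts a weighted mean with weights $\varphi'_+(c_2)$. The sketch below illustrates this on an assumed four-point toy space with the $q$-exponential as $\varphi$; the function `transition` and the sample data are hypothetical.

```python
import numpy as np

Q = 0.5  # illustrative deformation parameter, 0 < q < 1

def dexp_q(u, q=Q):
    """Right derivative of exp_q: [1 + (1-q)u]_+^{q/(1-q)}."""
    base = np.maximum(1.0 + (1.0 - q) * np.asarray(u, dtype=float), 0.0)
    return base ** (q / (1.0 - q))

def ln_q(x, q=Q):
    """q-logarithm, used here to build c = phi^{-1}(p)."""
    return (np.asarray(x, dtype=float) ** (1.0 - q) - 1.0) / (1.0 - q)

def transition(w, c_from, c_to, mu):
    """c_from - c_to + w, minus its mean under the weights mu * phi'_+(c_to)."""
    weight = mu * dexp_q(c_to)
    shifted = c_from - c_to + w
    return shifted - np.sum(shifted * weight) / np.sum(weight)

# Hypothetical toy data: uniform weights and two densities p1, p2.
mu = np.full(4, 0.25)
c1 = ln_q(np.array([0.4, 0.8, 1.2, 1.6]))
c2 = ln_q(np.array([1.6, 1.2, 0.8, 0.4]))

# A direction w centered with respect to the weights phi'_+(c1).
w0 = np.array([0.10, -0.20, 0.05, 0.30])
weight1 = mu * dexp_q(c1)
w = w0 - np.sum(w0 * weight1) / np.sum(weight1)

v = transition(w, c1, c2, mu)
w_back = transition(v, c2, c1, mu)
centering_error = abs(float(np.sum(mu * v * dexp_q(c2))))
```

The image `v` is centered with respect to $\varphi'_+(c_2)$, and applying the map in the opposite direction recovers `w`, as expected of a chart change.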
The next corollary shows that the corresponding Musielak–Orlicz spaces are equal. The proof follows the one provided in [14].
Corollary 1.
Let $p, z \in \mathcal{P}_\mu$ be $\varphi$-connected by an open arc, where $p = \varphi(c)$ and $z = \varphi(\tilde{c})$. Then, $L_c^\varphi = L_{\tilde{c}}^\varphi$.
Proof. 
Since $z$ is $\varphi$-connected to $p$ by an open arc, Proposition 3 gives $\tilde{c} = c + u - \psi(u)$. The result follows immediately from [10].  □
It follows from Corollary 1 that $\varphi_{c_2}^{-1} \circ \varphi_{c_1}$ is of class $C^\infty$ and, consequently, the set $\varphi_{c_1}^{-1}(A_{c_1}^\varphi \cap A_{c_2}^\varphi)$ is open in $B_c^\varphi$.
Proposition 6
([14], Proposition 8). The relation given in Definition 2 is an equivalence relation.
Proof. 
Since reflexivity and symmetry follow immediately from the definition, we only prove transitivity. Let $p, z, s \in \mathcal{P}_\mu$ be such that
$$p(t) \propto \varphi(c + tu), \quad s(t) \propto \varphi(c + tv), \quad t \in (-\varepsilon, 1 + \varepsilon),$$
with $p(0) = \varphi(c) = p$, $p(1) = \varphi(c + u) = z$, $s(0) = \varphi(c) = p$, $s(1) = \varphi(c + v) = s$ and $u, v \in N_c^\varphi$. Consider
$$z(t) \propto \varphi(c + (1-t)u + tv) = \varphi(c + u + t(v - u)).$$
Setting $c + u = \tilde{c}$, we have $z(t) \propto \varphi(\tilde{c} + t(v - u))$, where $z(0) = \varphi(\tilde{c}) = \varphi(c + u) = z$ and $z(1) = \varphi(\tilde{c} + (v - u)) = \varphi(c + v) = s$. Therefore $z$ and $s$ are $\varphi$-connected.  □
As a consequence of Proposition 3 and Proposition 6, the $\varphi$-families $A_c^\varphi$ are maximal, in the sense that either $A_c^\varphi \cap A_{\tilde{c}}^\varphi = \emptyset$ or, if $A_c^\varphi \cap A_{\tilde{c}}^\varphi \ne \emptyset$, then $A_c^\varphi = A_{\tilde{c}}^\varphi$.
Hence, we can state the following proposition.
Proposition 7.
The collection $\{ (M_c^\varphi, \varphi_c) \}_{\varphi(c) \in \mathcal{P}_\mu}$ equips $\mathcal{P}_\mu$ with a $C^\infty$-differentiable structure.

4. The Tangent Bundle

In the previous section, the expression of the transition map $\varphi_{c_2}^{-1} \circ \varphi_{c_1}$ was important to guarantee that $\mathcal{P}_\mu$ could be equipped with a $C^{\infty}$-Banach structure. Now, we use the transition map to find the tangent space of $\mathcal{P}_\mu$ at the point $p = \varphi(c)$ and the tangent bundle.
Given $p \in \mathcal{P}_\mu$, we consider the triple $(A_c^{\varphi}; \varphi_c^{-1}; v)$, where $A_c^{\varphi}$ is the $\varphi$-family, $\varphi_c$ is the parametrization and $v$ is a vector in $\varphi_c^{-1}(A_c^{\varphi})$, which is contained in the vector space $L^{\Phi_c}$.
Let us define the following equivalence relation:
$$(A_c^{\varphi}; \varphi_c^{-1}; v) \sim (A_{\tilde{c}}^{\varphi}; \varphi_{\tilde{c}}^{-1}; w) \iff (\varphi_{\tilde{c}}^{-1} \circ \varphi_c)'(\varphi_c^{-1}(p))(v) = w.$$
The class $[A_c^{\varphi}; \varphi_c^{-1}; v]$ is called a tangent vector of $\mathcal{P}_\mu$ at $p$. The set of all such classes is called the tangent space and is denoted by $T_p(\mathcal{P}_\mu)$. For more details we refer the reader to [20].
The vector $v \in \varphi_c^{-1}(A_c^{\varphi})$ is the velocity vector of a curve in the parametrization domain. In fact, let $(A_{c_1}^{\varphi}, \varphi_{c_1}^{-1})$ and $(A_{c_2}^{\varphi}, \varphi_{c_2}^{-1})$ be charts about $p \in \mathcal{P}_\mu$ and let $g : I \subset T \to \mathcal{P}_\mu$ be a curve such that $g(t_0) = p$ for some $t_0 \in T$. Taking $g(t) = \varphi_{c_1}(u_1) = \varphi(c_1 + u_1 - \psi(u_1))$, we have that $u_1(t) = \varphi_{c_1}^{-1}(g(t))$. Moreover, $g(t) = \varphi_{c_1}(u_1)$ and $u_2(t) = \varphi_{c_2}^{-1}(g(t))$. In terms of these random variables, we have that $u_2(t_0) = \varphi_{c_2}^{-1}(g(t_0)) = \varphi_{c_2}^{-1} \circ \varphi_{c_1}(u_1(t_0))$. Hence, by the chain rule, we can write
$$u_2'(t_0) = (\varphi_{c_2}^{-1} \circ \varphi_{c_1})'(u_1(t_0))\, u_1'(t_0) = (\varphi_{c_2}^{-1} \circ \varphi_{c_1})'(\varphi_{c_1}^{-1}(p))\, u_1'(t_0).$$
We denote by $\tau(\mathcal{P}_\mu)$ the tangent bundle, which is defined as the disjoint union of the tangent spaces $T_p(\mathcal{P}_\mu)$, that is,
$$\tau(\mathcal{P}_\mu) = \bigcup_{p \in \mathcal{P}_\mu} T_p(\mathcal{P}_\mu).$$
Proposition 8.
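The local representation of the tangent bundle $\tau(\mathcal{P}_\mu)$ is of the form
$$(u_1, v_1) \mapsto \left( \varphi_{c_2}^{-1} \circ \varphi_{c_1}(u_1),\; v_1 - \frac{\int_T v_1\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu} \right) \in \tilde{K}_{c_2}^{\varphi} \times L_{c_2}^{\varphi}.$$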
Proof. 
Given $w \in \varphi_{c_1}^{-1}(A_{c_1}^{\varphi} \cap A_{c_2}^{\varphi})$, the derivative of the map $\varphi_{c_2}^{-1} \circ \varphi_{c_1}$ evaluated at $w$ in the direction of $v \in L_c^{\varphi}$ is of the form
$$(\varphi_{c_2}^{-1} \circ \varphi_{c_1})'(w)\,v = v - \frac{\int_T v\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu}. \tag{21}$$
In fact, by the convexity of $\varphi$, we have that
$$\int_T (c_1 - c_2 + w)\,\varphi'_+(c_2)\,d\mu \leq \int_T \left[ \varphi(c_1 + w) + \varphi(c_2) \right] d\mu.$$
Since $w \in \varphi_{c_1}^{-1}(A_{c_1}^{\varphi} \cap A_{c_2}^{\varphi}) \subset \tilde{K}_{c_1}^{\varphi}$, we have that $\varphi(c_1 + w)$ is $\mu$-integrable and, consequently, $(c_1 - c_2 + w)\,\varphi'_+(c_2)$ is $\mu$-integrable. Then, it follows from the dominated convergence theorem that (21) holds.
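The tangent bundle is then given by
$$\tau(\mathcal{P}_\mu) = \left\{ (\varphi_c(u), v) \,:\, \varphi_c(u) \in A_c^{\varphi} \subset \mathcal{P}_\mu \text{ and } v \text{ is a tangent vector to } \varphi_c(u) \right\}.$$
Its charts are expressed as
$$(v, u) \in \tau(A_c^{\varphi}) \mapsto \left( \varphi_{c_2}^{-1}(v),\; v - \frac{\int_T v\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu} \right),$$
which are defined on the collection of open subsets $A_{c_1}^{\varphi} \times \tilde{K}_{c_1}^{\varphi}$ of $\mathcal{P}_\mu \times L_{c_1}^{\varphi}$. Then, since Equation (21) holds, the transition mappings are given, for $(u_1, v_1) \in \tilde{K}_{c_1}^{\varphi} \times L_{c_1}^{\varphi}$, by
$$(u_1, v_1) \mapsto \left( \varphi_{c_2}^{-1} \circ \varphi_{c_1}(u_1),\; v_1 - \frac{\int_T v_1\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu} \right) \in \tilde{K}_{c_2}^{\varphi} \times L_{c_2}^{\varphi}.$$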
  □
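The form of the derivative in (21) can also be read off from the fact that the transition map is affine in its argument. The short computation below is a formal sketch, not part of the original proof. It assumes that $w + tv$ stays in the domain of the transition map for small real $t$ and that $v\,\varphi'_+(c_2)$ is $\mu$-integrable. The symbol $F$ is shorthand introduced here for $\varphi_{c_2}^{-1} \circ \varphi_{c_1}$.

```latex
% F(w) = c_1 - c_2 + w - \frac{\int_T (c_1 - c_2 + w) \varphi'_+(c_2) d\mu}{\int_T \varphi'_+(c_2) d\mu}
\begin{aligned}
F(w + tv) - F(w)
  &= tv - \frac{\int_T (c_1 - c_2 + w + tv)\,\varphi'_+(c_2)\,d\mu
              - \int_T (c_1 - c_2 + w)\,\varphi'_+(c_2)\,d\mu}
             {\int_T \varphi'_+(c_2)\,d\mu} \\
  &= t \left( v - \frac{\int_T v\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu} \right),
\end{aligned}
\qquad\text{so}\qquad
\lim_{t \to 0} \frac{F(w + tv) - F(w)}{t}
  = v - \frac{\int_T v\,\varphi'_+(c_2)\,d\mu}{\int_T \varphi'_+(c_2)\,d\mu}.
```

The right-hand side does not depend on $w$, which is consistent with the second component of the transition mappings above being linear in $v_1$.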

5. Divergence in Statistical Manifolds

This section is divided into two parts. In the first part, we define the $\varphi$-divergence for the case where $\varphi$ is the deformed exponential defined in Section 3, and we define a divergence using the $q$-exponential. In the second part, we prove that the $q$-exponential and $\kappa$-exponential functions can be used to generalize the Rényi divergence [13,21].

5.1. The $\varphi$-Divergence and $q$-Divergence

To define the divergence associated with the normalizing function $\psi : \tilde{K}_c^{\varphi} \to \mathbb{R}$, the convexity of $\psi$ is required. This is guaranteed by the fact that $N_c^{\varphi}$ is a subspace and $\psi : K_c^{\varphi} \to \mathbb{R}$ is convex [10]. In this way, the Bregman divergence $B_\psi : \tilde{B}_c^{\varphi} \times \tilde{B}_c^{\varphi} \to [0, \infty)$ associated with $\psi : \tilde{B}_c^{\varphi} \to [0, \infty)$ is given by [22,23,24]
$$B_\psi(v, u) = \psi(v) - \psi(u) - \partial^{+}\psi(u)(v - u).$$
Then, we can define the divergence $D_\psi : \tilde{B}_c^{\varphi} \times \tilde{B}_c^{\varphi} \to [0, \infty)$ related to the generalized $\varphi$-family $A_c^{\varphi}$ as $D_\psi(u, v) = B_\psi(v, u)$.
Given $u, v \in \tilde{B}_c^{\varphi}$, we have that $\varphi(c + u - \psi(u)), \varphi(c + v - \psi(v)) \in \mathcal{P}_\mu$ and, as a consequence, $c + u - \psi(u),\, c + v - \psi(v) > a_\varphi$. Supposing that $\varphi$ is continuously differentiable, it follows that the divergence $D_\psi$ does not depend on the parametrization of $A_c^{\varphi}$. This allows us to define the divergence between the probability densities $p = \varphi_c(u)$ and $z = \varphi_c(v)$, for $u, v \in \tilde{B}_c^{\varphi}$, as
$$D(p \parallel z) = D_\psi(u, v) = \frac{\displaystyle\int_T \frac{\varphi_c^{-1}(p) - \varphi_c^{-1}(z)}{(\varphi_c^{-1})'(p)}\,d\mu}{\displaystyle\int_T \frac{1}{(\varphi_c^{-1})'(p)}\,d\mu}.$$
Note that the divergence is well defined inside the same $\varphi$-family. Setting $D(p \parallel z) = \infty$ if $p$ and $z$ are not in the same $\varphi$-family extends the divergence to $\mathcal{P}_\mu$. We denote this divergence by $D_\varphi$ and call it the $\varphi$-divergence [10].
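To make the link between the Bregman divergence of $\psi$ and a divergence between densities concrete, the following Python sketch works out the classical special case. It assumes, as an illustration only, that $T$ is finite with weights `w`, that $\varphi = \exp$, and that $\psi(u) = \log \int_T p\,e^{u}\,d\mu$. Under these assumptions $D_\psi(u, v) = B_\psi(v, u)$ should equal the Kullback–Leibler divergence between $\varphi_c(u)$ and $\varphi_c(v)$. The function names are hypothetical.

```python
import math

def integral(f, w):
    """Integrate the values f against a finite measure with weights w."""
    return sum(fi * wi for fi, wi in zip(f, w))

def psi(u, p, w):
    """Normalizing function for phi = exp: log of int p * exp(u) dmu."""
    return math.log(integral([pi * math.exp(ui) for ui, pi in zip(u, p)], w))

def density(u, p, w):
    """phi_c(u) = exp(c + u - psi(u)) with c = log p."""
    s = psi(u, p, w)
    return [pi * math.exp(ui - s) for ui, pi in zip(u, p)]

def dpsi(u, h, p, w):
    """Directional derivative of psi at u along h: int h * phi_c(u) dmu."""
    return integral([hi * gi for hi, gi in zip(h, density(u, p, w))], w)

def bregman(v, u, p, w):
    """B_psi(v, u) = psi(v) - psi(u) - dpsi(u)(v - u)."""
    h = [vi - ui for vi, ui in zip(v, u)]
    return psi(v, p, w) - psi(u, p, w) - dpsi(u, h, p, w)

def kl(p, z, w):
    """Kullback-Leibler divergence int p * log(p / z) dmu."""
    return integral([pi * math.log(pi / zi) for pi, zi in zip(p, z)], w)

# Hypothetical data: base density p (int p dmu = 1) and two directions u, v.
w = [0.25, 0.25, 0.25, 0.25]
p = [0.4, 0.8, 1.2, 1.6]
u = [0.2, -0.3, 0.1, 0.0]
v = [-0.5, 0.4, 0.3, -0.1]

d_psi_uv = bregman(v, u, p, w)                    # D_psi(u, v) = B_psi(v, u)
d_kl = kl(density(u, p, w), density(v, p, w), w)  # KL(phi_c(u) || phi_c(v))
print(d_psi_uv, d_kl)
```

The two printed numbers should agree up to rounding, which matches the formula for $D(p \parallel z)$ when $\varphi^{-1} = \ln$ and $(\varphi^{-1})'(p) = 1/p$.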
Given $u, v \in \tilde{B}_c^{\varphi}$, we have that $u, v > a_\varphi$. Then $\varphi(t, \cdot)$ is strictly convex in $\tilde{B}_c^{\varphi}$, and therefore $D_\varphi$ is always non-negative and $D_\varphi(p \parallel z)$ is equal to zero if and only if $p = z$. In the following example, we find the $\varphi$-divergence for the case in which the deformed exponential function $\varphi$ is the $q$-deformed exponential function.
Example 3.
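Consider the $q$-exponential $\exp_q(t, u) = \exp_{q(t)}(u)$ instead of $\varphi(t, u)$, whose inverse $\varphi^{-1}(t, u)$ is the $q$-logarithm $\ln_q(t, u) = \ln_{q(t)}(u)$. Then, we have
$$D(p \parallel z) = \frac{\displaystyle\int_T \frac{\ln_q(p) - \ln_q(z)}{\ln_q'(p)}\,d\mu}{\displaystyle\int_T \frac{1}{\ln_q'(p)}\,d\mu},$$
where $\ln_q(p)$ denotes $\ln_{q(t)}(p(t))$. Since the $q$-logarithm $\ln_q(u) = \frac{u^{1-q} - 1}{1 - q}$ has derivative $\ln_q'(u) = \frac{1}{u^{q}}$, we have that
$$\int_T \frac{\ln_q(p) - \ln_q(z)}{\ln_q'(p)}\,d\mu = \int_T \frac{\frac{p^{1-q} - 1}{1 - q} - \frac{z^{1-q} - 1}{1 - q}}{\frac{1}{p^{q}}}\,d\mu = \int_T \frac{p^{q}\left( p^{1-q} - z^{1-q} \right)}{1 - q}\,d\mu$$
and
$$\int_T \frac{1}{\ln_q'(p)}\,d\mu = \int_T \frac{1}{\frac{1}{p^{q}}}\,d\mu = \int_T p^{q}\,d\mu.$$
Therefore
$$D(p \parallel z) = \frac{\displaystyle\int_T \frac{p^{q}\left( p^{1-q} - z^{1-q} \right)}{1 - q}\,d\mu}{\displaystyle\int_T p^{q}\,d\mu}. \tag{25}$$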
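The divergence $D(p \parallel z)$ in (25) is related to the $q$-divergence defined in (6). In fact,
$$\begin{aligned} I^{(q)}(p \parallel z) &= \int_T z\, f\!\left( \frac{p}{z} \right) d\mu = -\int_T z\,\frac{p}{z}\,\ln_q\!\left( \frac{z}{p} \right) d\mu = -\int_T p\,\frac{\ln_q(z) - \ln_q(p)}{1 + (1 - q)\ln_q(p)}\,d\mu \\ &= -\int_T p\,\frac{\left[ z^{1-q} - p^{1-q} \right] / (1 - q)}{1 + (1 - q)\,\frac{p^{1-q} - 1}{1 - q}}\,d\mu = \int_T \frac{p^{q}\left( p^{1-q} - z^{1-q} \right)}{1 - q}\,d\mu. \end{aligned}$$
Then $D(p \parallel z) = \dfrac{I^{(q)}(p \parallel z)}{\int_T p^{q}\,d\mu}$ and we can define the metric $g : \Sigma(\mathcal{P}_\mu) \times \Sigma(\mathcal{P}_\mu) \to F(\mathcal{P}_\mu)$ as
$$g(u, v) = \frac{q \displaystyle\int_T \frac{u v}{z}\,d\mu}{\displaystyle\int_T z^{q}\,d\mu}, \tag{26}$$
where $\Sigma(\mathcal{P}_\mu)$ is the set of vector fields $u : A_c^{\varphi} \to T_p(A_c^{\varphi})$ and $F(\mathcal{P}_\mu)$ is the set of $C^{\infty}$ functions $f : A_c^{\varphi} \to \mathbb{R}$. This map is well defined, since $\partial_u^{p} D(p \parallel z)\big|_{p = z} = 0$ and $\partial_v^{p} D(p \parallel z)\big|_{p = z} = 0$.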
Notice that, taking $\int_T p^{q}\,d\mu = 1$, the divergence in (25) coincides with the $q$-divergence defined in [15], the metric in (26) coincides with the metric given in [25], and the family of covariant derivatives (connections) given by
$$\nabla_w^{q} u = \partial_w u - \frac{(1 - q)}{r}\,u\,w + \frac{u}{A^{2}}\,B - \frac{w\,C}{A^{2}},$$
where $A = \int_T z^{q}\,d\mu$, $B = \partial_w^{p} A\big|_{p = z}$ and $C = \partial_u^{p} A\big|_{p = z}$, coincides with the family of covariant derivatives (connections) given in [25]. The notation $\partial_w^{p} A\big|_{p = z}$ means the derivative of $A$ in the direction of $w$ at the point $z$ when $p = z$.
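The relation between the divergence in (25) and the $q$-divergence derived in Example 3 can be checked numerically. The sketch below assumes a finite set $T$ with weights `w`, a constant $q \in (0, 1)$, and the form $I^{(q)}(p \parallel z) = -\int_T p \ln_q(z/p)\,d\mu$ that appears in the chain of equalities above. The function names are illustrative.

```python
def integral(f, w):
    """Integrate the values f against a finite measure with weights w."""
    return sum(fi * wi for fi, wi in zip(f, w))

def ln_q(x, q):
    """q-logarithm (x^(1-q) - 1) / (1 - q), for q != 1 and x > 0."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def phi_divergence_q(p, z, w, q):
    """Divergence (25): int p^q (p^(1-q) - z^(1-q)) / (1-q) dmu over int p^q dmu."""
    num = integral(
        [pi ** q * (pi ** (1.0 - q) - zi ** (1.0 - q)) / (1.0 - q) for pi, zi in zip(p, z)],
        w,
    )
    den = integral([pi ** q for pi in p], w)
    return num / den

def q_divergence(p, z, w, q):
    """q-divergence written as -int p * ln_q(z / p) dmu."""
    return -integral([pi * ln_q(zi / pi, q) for pi, zi in zip(p, z)], w)

# Hypothetical data: two densities with int p dmu = int z dmu = 1.
w = [0.25, 0.25, 0.25, 0.25]
p = [0.4, 0.8, 1.2, 1.6]
z = [1.6, 1.2, 0.8, 0.4]
q = 0.5

d_25 = phi_divergence_q(p, z, w, q)
i_q = q_divergence(p, z, w, q)
escort_mass = integral([pi ** q for pi in p], w)  # int p^q dmu

print(d_25 * escort_mass, i_q)
```

The two printed values should coincide up to rounding, reflecting $D(p \parallel z) = I^{(q)}(p \parallel z) / \int_T p^{q}\,d\mu$.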

5.2. Generalization of the Rényi Divergence and $\exp_\kappa$

Now, we recall that the Rényi divergence is related to the $\varphi$-divergence, and we show that condition (a2) is necessary and sufficient for the existence of the generalization of the Rényi divergence. Consequently, we prove that the $q$-deformed exponential and $\kappa$-exponential functions can be used in the generalization of the Rényi divergence.
In [12], a generalization of the Rényi divergence of order $\alpha \in (0, 1)$ was defined as
$$D_{R,\varphi}^{(\alpha)}(p \parallel z) = -\frac{\kappa(\alpha)}{\alpha(1 - \alpha)}, \tag{27}$$
where $\kappa(\alpha)$ satisfies Equation (12). For $\alpha \in \{0, 1\}$, this generalization is defined as the limit
$$D_{R,\varphi}^{(0)}(p \parallel z) = \lim_{\alpha \to 0} D_{R,\varphi}^{(\alpha)}(p \parallel z) \tag{28}$$
and
$$D_{R,\varphi}^{(1)}(p \parallel z) = \lim_{\alpha \to 1} D_{R,\varphi}^{(\alpha)}(p \parallel z). \tag{29}$$
The limits in (28) and (29), under some conditions, are finite-valued and converge to the $\varphi$-divergence:
$$D_{R,\varphi}^{(0)}(z \parallel p) = D_{R,\varphi}^{(1)}(p \parallel z) = D_\varphi(p \parallel z) < \infty.$$
The next proposition states that condition (a2) is necessary and sufficient for connecting two probability densities of $\mathcal{P}_\mu$ by an open arc.
Proposition 9 ([12], Proposition 1).
Let $\mu$ be a non-atomic measure and let $\varphi : \mathbb{R} \to [0, \infty)$ be a positive deformed exponential function. Fix any $\alpha \in (0, 1)$. Condition (a2) is satisfied if, and only if, given $p$ and $z$ in $\mathcal{P}_\mu$, there exists a constant $\kappa(\alpha) := \kappa(\alpha; p, z)$ such that
$$\int_T \varphi\!\left( (1 - \alpha)\,\varphi^{-1}(p) + \alpha\,\varphi^{-1}(z) - \kappa(\alpha) \right) d\mu = 1.$$
In Example 1, where the measure $\mu$ was assumed to be non-atomic, the $q$-exponential function satisfies condition (a2). Then, by Proposition 9 and Equation (27), we conclude that this function can be used in the generalization of the Rényi divergence. By the same criterion, the function given in Example 2 cannot be used in the generalization of the Rényi divergence.
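The normalizing constant $\kappa(\alpha)$ of Proposition 9 rarely has a closed form, but it can be approximated numerically. The following Python sketch is a discretized stand-in and not the setting of the proposition: it replaces the non-atomic measure by a finite weighted set, reads the normalization as $\int_T \varphi((1 - \alpha)\varphi^{-1}(p) + \alpha\varphi^{-1}(z) - \kappa(\alpha))\,d\mu = 1$ (the sign in front of $\kappa(\alpha)$ is an assumption of this sketch), and solves for $\kappa(\alpha)$ by bisection. For $\varphi = \exp$ the result is compared with the closed form $\log \int_T p^{1-\alpha} z^{\alpha}\,d\mu$. All names are hypothetical.

```python
import math

def integral(f, w):
    """Integrate the values f against a finite measure with weights w."""
    return sum(fi * wi for fi, wi in zip(f, w))

def exp_q(t, q):
    """q-exponential [1 + (1 - q) t]_+^(1 / (1 - q)) for 0 < q < 1."""
    base = 1.0 + (1.0 - q) * t
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0

def ln_q(x, q):
    """q-logarithm, the inverse of exp_q on (0, infinity)."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def mass(k, alpha, p, z, w, phi, phi_inv):
    """int phi((1 - alpha) phi^{-1}(p) + alpha phi^{-1}(z) - k) dmu."""
    return integral(
        [phi((1.0 - alpha) * phi_inv(pi) + alpha * phi_inv(zi) - k) for pi, zi in zip(p, z)],
        w,
    )

def kappa(alpha, p, z, w, phi, phi_inv, tol=1e-13):
    """Bisection for the k with mass(k) = 1; by convexity of phi the root is <= 0."""
    lo, hi = -1.0, 0.0
    while mass(lo, alpha, p, z, w, phi, phi_inv) < 1.0:
        lo *= 2.0  # widen the bracket until the mass exceeds 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass(mid, alpha, p, z, w, phi, phi_inv) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: two densities on four points, order alpha and index q.
w = [0.25, 0.25, 0.25, 0.25]
p = [0.4, 0.8, 1.2, 1.6]
z = [1.6, 1.2, 0.8, 0.4]
alpha, q = 0.3, 0.5

k_exp = kappa(alpha, p, z, w, math.exp, math.log)
k_closed = math.log(integral([pi ** (1 - alpha) * zi ** alpha for pi, zi in zip(p, z)], w))

k_q = kappa(alpha, p, z, w, lambda t: exp_q(t, q), lambda x: ln_q(x, q))
mass_q = mass(k_q, alpha, p, z, w, lambda t: exp_q(t, q), lambda x: ln_q(x, q))

# Under the sign convention assumed here, -kappa / (alpha (1 - alpha)) is non-negative.
print(k_exp, k_closed, -k_q / (alpha * (1 - alpha)), mass_q)
```

Bisection is used because the integral is non-increasing in $\kappa$, so no derivative of $\varphi$ is needed and the same routine works for any deformed exponential supplied together with its inverse.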
Supposing that $\mu$ is non-atomic, the next proposition presents an equivalent criterion for a deformed exponential function $\varphi$ to satisfy condition (a2).
Proposition 10 ([12], Proposition 3).
Let $\varphi : \mathbb{R} \to [0, \infty)$ be a deformed exponential function. Then (a2) is satisfied if, and only if,
$$\limsup_{u \to \infty} \frac{\varphi(u)}{\varphi(u - \lambda_0)} < \infty, \quad \text{for some } \lambda_0 > 0.$$
In the next example, we exhibit a class of deformed exponential functions that can be used in the generalization of the Rényi divergence.
Example 4.
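We will show that the Kaniadakis $\kappa$-exponential $\exp_\kappa(\cdot)$ satisfies condition (a2). The $\kappa$-exponential $\exp_\kappa : \mathbb{R} \to (0, \infty)$, for $\kappa \in [-1, 1]$, is defined as [26,27]
$$\exp_\kappa(u) = \begin{cases} \left( \kappa u + \sqrt{1 + \kappa^{2} u^{2}} \right)^{1/\kappa}, & \text{if } \kappa \neq 0, \\ \exp(u), & \text{if } \kappa = 0. \end{cases}$$
Its inverse, the so-called $\kappa$-logarithm $\log_\kappa : (0, \infty) \to \mathbb{R}$, is given by
$$\log_\kappa(v) = \begin{cases} \dfrac{v^{\kappa} - v^{-\kappa}}{2\kappa}, & \text{if } \kappa \neq 0, \\ \ln(v), & \text{if } \kappa = 0. \end{cases}$$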
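We will verify that there exist $\alpha \in (0, 1)$ and $\lambda > 0$ for which
$$\lambda \leq \log_\kappa(v) - \log_\kappa(\alpha v), \quad \text{for all } v > 0. \tag{31}$$
Some manipulations imply that the derivative of $\log_\kappa(v) - \log_\kappa(\alpha v)$ is negative for $0 < v \leq v_0$ and positive for $v \geq v_0$, where
$$v_0 = \left( \frac{\alpha^{-\kappa} - 1}{1 - \alpha^{\kappa}} \right)^{\frac{1}{2\kappa}} > 0.$$
Consequently, the difference $\log_\kappa(v) - \log_\kappa(\alpha v)$ attains a minimum at $v_0$. Since $v_0 = \alpha^{-1/2}$, this minimum equals $2\log_\kappa(\alpha^{-1/2}) > 0$. Hence, given $\alpha \in (0, 1)$, inequality (31) is satisfied for some $\lambda > 0$. Inserting $v = \exp_\kappa(u)$ into (31), we can write
$$\alpha \exp_\kappa(u) \leq \exp_\kappa(u - \lambda), \quad \text{for all } u \in \mathbb{R}. \tag{32}$$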
If $n \in \mathbb{N}$ is such that $n\lambda \geq 1$, then a repeated application of (32) yields
$$\alpha^{n} \exp_\kappa(u) \leq \exp_\kappa(u - n\lambda) \leq \exp_\kappa(u - 1), \quad \text{for all } u \in \mathbb{R}.$$
Then,
$$\limsup_{u \to \infty} \frac{\varphi(u)}{\varphi(u - \lambda_0)} = \limsup_{u \to \infty} \frac{\exp_\kappa(u)}{\exp_\kappa(u - 1)} \leq \limsup_{u \to \infty} \frac{1}{\alpha^{n}} < \infty.$$
Therefore, by Proposition 10, the Kaniadakis $\kappa$-exponential $\exp_\kappa(\cdot)$ satisfies condition (a2).
As a consequence of Example 4 and Proposition 9, we have that $\exp_\kappa(u)$ can be used in the generalization of the Rényi divergence.
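The steps of Example 4 can be illustrated numerically. The Python sketch below implements $\exp_\kappa$ and $\log_\kappa$ as defined in the example, takes $\lambda$ to be the value of $\log_\kappa(v) - \log_\kappa(\alpha v)$ at $v_0$, and checks inequality (32) and the resulting bound on $\exp_\kappa(u)/\exp_\kappa(u - 1)$ on a finite grid. The choices $\kappa = 0.5$, $\alpha = 0.5$ and the grid are arbitrary. A grid check is of course not a proof.

```python
import math

def exp_kappa(u, kappa):
    """Kaniadakis kappa-exponential."""
    if kappa == 0.0:
        return math.exp(u)
    return (kappa * u + math.sqrt(1.0 + (kappa * u) ** 2)) ** (1.0 / kappa)

def log_kappa(v, kappa):
    """Kaniadakis kappa-logarithm, the inverse of exp_kappa on (0, infinity)."""
    if kappa == 0.0:
        return math.log(v)
    return (v ** kappa - v ** (-kappa)) / (2.0 * kappa)

def gap(v, alpha, kappa):
    """The difference log_kappa(v) - log_kappa(alpha * v) appearing in (31)."""
    return log_kappa(v, kappa) - log_kappa(alpha * v, kappa)

def lambda_at_minimum(alpha, kappa):
    """Value of the gap at v0, for kappa != 0 and 0 < alpha < 1."""
    v0 = ((alpha ** (-kappa) - 1.0) / (1.0 - alpha ** kappa)) ** (1.0 / (2.0 * kappa))
    return gap(v0, alpha, kappa)

kappa, alpha = 0.5, 0.5
lam = lambda_at_minimum(alpha, kappa)
n = math.ceil(1.0 / lam)               # smallest n with n * lam >= 1
grid = [-10.0 + 0.25 * i for i in range(81)]

# Inequality (32) on the grid, with a small relative slack for rounding.
ineq_32 = all(
    alpha * exp_kappa(u, kappa) <= exp_kappa(u - lam, kappa) * (1.0 + 1e-9) for u in grid
)
# Largest observed ratio exp_kappa(u) / exp_kappa(u - 1), to compare with alpha^(-n).
max_ratio = max(exp_kappa(u, kappa) / exp_kappa(u - 1.0, kappa) for u in grid)

print(lam, n, ineq_32, max_ratio, alpha ** (-n))
```

On the grid, the largest ratio should stay below $\alpha^{-n}$, in line with the $\limsup$ bound used with Proposition 10.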

6. Conclusions

In this paper we constructed a parametrization of the statistical Banach manifold using a deformed exponential function. We found the tangent space of $\mathcal{P}_\mu$ at $p$ and we also constructed the tangent bundle of $\mathcal{P}_\mu$. We defined the $\varphi$-divergence where $\varphi$ is the $q$-exponential function and we established a relation between this divergence and the $q$-divergence defined in [15]. Another important contribution is that the $q$-exponential and $\kappa$-exponential functions can be used to generalize the Rényi divergence. A perspective for future work is to define the parallel transport, now that the tangent space is available. We also intend to construct a parametrization for $\mathcal{P}_\mu$ using a deformed exponential function satisfying (a1) in the case where, for each measurable function $c : T \to \mathbb{R}$ with $\int_T \varphi(c)\,d\mu = 1$, there exists a measurable function $u_0^{c} : T \to \mathbb{R}$ such that $\int_T \varphi(c + \lambda u_0^{c})\,d\mu < \infty$ for each $\lambda > 0$.

Author Contributions

Conceptualization, F.L.J.V., R.F.V. and C.C.C.; writing—original draft, F.L.J.V.; writing—review and editing, F.L.J.V., L.H.F.d.A., R.F.V. and C.C.C.

Funding

The authors would like to thank Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES)-Finance Code 001, Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (Procs. 309472/2017-2 and 408609/2016-8) and FUNCAP (Proc. IR7-00126-00037.01.00/17).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Amari, S.I. Differential Geometry of Curved Exponential Families-Curvatures and Information Loss. Ann. Stat. 1982, 10, 357–385.
  2. Amari, S.-I. Differential-Geometrical Methods in Statistics; Springer Science & Business: Berlin/Heidelberg, Germany, 2012; Volume 28.
  3. Pistone, G.; Sempi, C. An infinite-dimensional geometric structure on the space of all the probability measures equivalent to a given one. Ann. Stat. 1995, 23, 1543–1561.
  4. Cena, A.; Pistone, G. Exponential statistical manifold. Ann. Inst. Stat. Math. 2007, 59, 27–56.
  5. Pistone, G.; Rogantin, M.P. The exponential statistical manifold: mean parameters, orthogonality and space transformations. Bernoulli 1999, 5, 721–760.
  6. Santacroce, M.; Siri, P.; Trivellato, B. New results on mixture and exponential models by Orlicz spaces. Bernoulli 2016, 22, 1431–1447.
  7. Naudts, J. Estimators, escort probabilities, and ϕ-exponential families in statistical physics. J. Ineq. Pure Appl. Math. 2004, 5, 102.
  8. Matsuzoe, H.; Wada, T. Deformed algebras and generalizations of independence on deformed exponential families. Entropy 2015, 17, 5729–5751.
  9. Naudts, J. Generalised Thermostatistics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011.
  10. Vigelis, R.F.; Cavalcante, C.C. On ϕ-families of probability distributions. J. Theor. Probab. 2013, 26, 870–884.
  11. Eguchi, S.; Komori, O. Path connectedness on a space of probability density functions. In Proceedings of the International Conference on Geometric Science of Information; Springer: Cham, Switzerland, 2015; pp. 615–624.
  12. Vigelis, R.F.; de Andrade, L.H.F.; Cavalcante, C.C. On the Existence of Paths Connecting Probability Distributions. In Proceedings of the International Conference on Geometric Science of Information; Springer: Cham, Switzerland, 2017; pp. 801–808.
  13. de Souza, D.C.; Vigelis, R.F.; Cavalcante, C.C. Geometry induced by a generalization of Rényi divergence. Entropy 2016, 18, 407.
  14. de Andrade, L.H.F.; Vieira, F.L.J.; Vigelis, R.F.; Cavalcante, C.C. Mixture and exponential arcs on generalized statistical manifold. Entropy 2018, 20, 147.
  15. Loaiza, G.; Quiceno, H. A q-exponential statistical Banach manifold. J. Math. Anal. Appl. 2013, 398, 466–476.
  16. Tsallis, C. What are the numbers that experiments provide. Quim. Nov. 1994, 17, 468–471.
  17. Musielak, J. Orlicz Spaces and Modular Spaces; Springer: Berlin/Heidelberg, Germany, 2006; Volume 1034.
  18. Rao, M.M.; Zhong, D.R. Theory of Orlicz Spaces; M. Dekker: New York, NY, USA, 1991.
  19. Krasnosel’skii, M.A.; Rutitskii, Y.B. Convex Function and Orlicz Spaces; Noordhoff: Groningen, The Netherlands, 1961; Translated from Russian.
  20. Lang, S. Introduction to Differentiable Manifolds; Springer Science and Business Media: Berlin/Heidelberg, Germany, 2002.
  21. Van Erven, T.; Harremoës, P. Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory 2014, 60, 3797–3820.
  22. Bregman, L.M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 1967, 7, 200–217.
  23. Zhang, J. Divergence function, duality, and convex analysis. Neural Comput. 2004, 16, 159–195.
  24. Korbel, J.; Hänel, R.; Thurner, S. Information geometric duality of ϕ-deformed exponential families. Entropy 2019, 21, 112.
  25. Loaiza, G.; Quiceno, H. A Riemannian geometry in the q-exponential Banach manifold induced by q-divergences. In Proceedings of the International Conference on Geometric Science of Information; Springer: Cham, Switzerland, 2013; pp. 737–742.
  26. Kaniadakis, G. Non-linear kinetics underlying generalized statistics. Physica A 2001, 296, 405–425.
  27. Pistone, G. Kappa-exponential models from the geometrical viewpoint. Eur. Phys. J. B 2009, 70, 29–37.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Entropy, EISSN 1099-4300. Published by MDPI AG, Basel, Switzerland.