A Deformed Exponential Statistical Manifold

Consider a probability measure μ and let Pμ be the set of μ-equivalent strictly positive probability densities. To endow Pμ with the structure of a C∞-Banach manifold we use the φ-connection by an open arc, where φ is a deformed exponential function that is zero up to a certain point and strictly increasing from then on. This deformed exponential function has the q-deformed exponential and κ-exponential functions as particular cases. Moreover, we find the tangent space of Pμ at a point p and, as a consequence, the tangent bundle of Pμ. We define a divergence using the q-exponential function and prove that this divergence is related to the q-divergence already known from the literature. We also show that the q-exponential and κ-exponential functions can be used to generalize the Rényi divergence.


Introduction
Let P_µ be the set of µ-equivalent strictly positive probability densities, where µ is a given probability measure. To build a structure on P_µ, Amari considered the parametric case, where the construction depends on a parameter belonging to a Euclidean space [1,2]. The case of non-parametric statistical models was initially studied by Pistone and Sempi [3]. In this case, P_µ was equipped with the structure of a C∞-Banach manifold using the Orlicz space associated with an Orlicz function. In a later work [4], Pistone and Cena proved that a probability distribution z belongs to the maximal exponential model at the probability distribution p if and only if z is connected to p by an open exponential arc. Moreover, the manifold structure obtained from the connection by an open exponential arc is equivalent to the one defined in [3,5]. Conditions for connecting two probability densities by an open exponential arc were recently studied in [6].
The deformed exponential function was first introduced by Naudts in [7] and studied in more detail later in [8,9]. In [10], the authors propose a generalization of the exponential family E_p, based on replacing the exponential function exp by a deformed exponential function ϕ. A ϕ-family of probability distributions is then proposed, denoted by F

Background and Preliminary Results
The deformed exponential function that we will use to equip P_µ with the structure of a C∞-Banach manifold has the q-exponential function as a particular case, and the parametrization domain is obtained from a Musielak-Orlicz space. For this reason, this section briefly presents the results involving the q-exponential manifold and the Musielak-Orlicz spaces.
Consider the following partition of P_µ into equivalence classes: p, z ∈ P_µ are related (p ∼_q z) if and only if there exists a one-dimensional q-exponential model connecting p and z, according to Equation (2). As a consequence, the measures p·µ and z·µ are equivalent and the spaces of essentially bounded functions L∞(p·µ) and L∞(z·µ) are equal.
We need a family of q-deformations of the moment-generating functional, denoted by M_p^q, and a family of cumulant generating functionals, denoted by K_p^q. They satisfy the following properties:
(1) The function z = e_q^{(u ⊖_q K_p^q(u))} p is a probability density in P_µ, since u ∈ B_{p,∞}(0, 1);
(2) K_p^q is infinitely Fréchet differentiable, and its n-th derivative evaluated in the directions (v_1, ..., v_n) ∈ B_{p,∞}(0, 1) × ... × B_{p,∞}(0, 1) has an explicit expression;
(3) The functional K_p^q is analytic in B_{p,∞}(0, 1).
The functional K_p^q is used to define the q-exponential models. Moreover, the set B_p is a Banach space and B_{p,∞}(0, 1) is the open unit ball of B_p. Since ‖u‖_{p,∞} < 1, we obtain −1 < u < 1 almost everywhere. The inverse of e_{q,p} is given in [15]. The transition map e_{q,p_2}^{-1} ∘ e_{q,p_1}, where U_p is the range of e_{q,p}, is also expressed explicitly in [15]. The map e_{q,p} is injective and the set e_{q,p}^{-1}(U_{p_1} ∩ U_{p_2}) is open in the B_{p_1}-topology, where p_1, p_2 ∈ P_µ. Hence, the transition map e_{q,p_2}^{-1} ∘ e_{q,p_1} is a topological homeomorphism and consequently the collection of pairs {(U_p, e_{q,p}^{-1})}_{p ∈ P_µ} is a C∞-atlas modeled on B_p. Then, P_µ is a C∞-Banach manifold, since e_{q,p} is a parametrization.
There exists a relation between the constructed manifold and the Tsallis relative entropy. In fact, let us consider, for t = 0 and 0 < q < 1, a function defined in terms of the q-logarithm ln_q(x) = (x^{1−q} − 1)/(1 − q), x > 0. Given p and z in P_µ, the Tsallis divergence, also called the q-divergence of z with respect to p, is denoted by I^{(q)}(z||p).
Proposition 1 ([15], Proposition 16). Taking p, z in P_µ, we obtain (1) I^{(q)}(z||p) ≥ 0, with equality iff p = z.
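The non-negativity stated in Proposition 1 can be checked numerically on a finite measure space. The sketch below is only an illustration under stated assumptions: the explicit formula for the divergence is not reproduced in the text above, so the code assumes the standard Tsallis form I(z||p) = −∫ z ln_q(p/z) dµ, which equals (1 − ∫ z^q p^{1−q} dµ)/(1 − q). The function names `ln_q` and `tsallis_divergence` and the two-point example are hypothetical.

```python
from typing import Sequence


def ln_q(x: float, q: float) -> float:
    """q-logarithm ln_q(x) = (x^(1-q) - 1) / (1 - q), defined for x > 0 and q != 1."""
    if x <= 0.0:
        raise ValueError("ln_q is only defined for x > 0")
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def tsallis_divergence(
    z: Sequence[float], p: Sequence[float], mu: Sequence[float], q: float
) -> float:
    """Assumed standard form: I(z||p) = -sum_i mu_i * z_i * ln_q(p_i / z_i).

    z and p are strictly positive densities with respect to the discrete
    measure with weights mu, i.e. sum_i mu_i * z_i = sum_i mu_i * p_i = 1.
    """
    return -sum(m * zi * ln_q(pi / zi, q) for zi, pi, m in zip(z, p, mu))


# Hypothetical two-point example: both densities integrate to 1 against mu.
mu = [0.5, 0.5]
p = [1.8, 0.2]
z = [1.0, 1.0]

d_zp = tsallis_divergence(z, p, mu, q=0.5)  # strictly positive, since z != p
d_pp = tsallis_divergence(p, p, mu, q=0.5)  # zero, since the arguments coincide
```

Under this assumed form, the divergence approaches the Kullback-Leibler divergence of z with respect to p as q → 1, which gives a convenient sanity check.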
Since items (1) and (2) hold, Φ(t, ·) is not equal to 0 or ∞ on the interval (0, ∞). Consider the functional I_Φ(u) = ∫_T Φ(t, |u(t)|) dµ, for any u ∈ L^0. The Musielak-Orlicz space L^Φ, the Musielak-Orlicz class L̃^Φ and the Morse-Transue space E^Φ associated with a Musielak-Orlicz function Φ are defined, respectively, by
L^Φ = {u ∈ L^0 : I_Φ(λu) < ∞ for some λ > 0},
L̃^Φ = {u ∈ L^0 : I_Φ(u) < ∞},
E^Φ = {u ∈ L^0 : I_Φ(λu) < ∞ for all λ > 0}.
Consider the Luxemburg norm ‖u‖_Φ = inf{λ > 0 : I_Φ(u/λ) ≤ 1} and the Orlicz norm ‖u‖_{Φ,0}, which is defined through Φ*(t, ·), the Fenchel conjugate of Φ(t, ·). The Musielak-Orlicz space L^Φ equipped with one of these two norms is a Banach space. The two norms are equivalent and the inequalities ‖u‖_Φ ≤ ‖u‖_{Φ,0} ≤ 2‖u‖_Φ hold for all u ∈ L^Φ. For more details see [18,19].
Define the Musielak-Orlicz function Φ_c from ϕ and a measurable function c : T → R such that ϕ(t, c(t)) is µ-integrable; we write L_c^ϕ, L̃_c^ϕ and E_c^ϕ in place of L^{Φ_c}, L̃^{Φ_c} and E^{Φ_c}, respectively. In [10] a parametrization ϕ_c was defined. The map ψ : B_c^ϕ → [0, ∞) is called the normalizing function and it is defined in such a way that ϕ_c(u) is a probability density in P_µ. The domains of the transition maps are open for any measurable c_1, c_2 : T → R such that ϕ(c_1) and ϕ(c_2) are in P_µ. The transition map is a C∞-isomorphism and consequently ϕ_c is a parametrization.
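To make the Luxemburg norm concrete, the following sketch evaluates it on a finite measure space by bisection. It assumes the standard definition ‖u‖_Φ = inf{λ > 0 : I_Φ(u/λ) ≤ 1}; the names `modular` and `luxemburg_norm`, the discrete weights, and the choice Φ(t, x) = x² in the example are illustrative assumptions and do not come from the paper.

```python
from typing import Callable, Sequence

# A Musielak-Orlicz function on a finite set: (index t, value x >= 0) -> Phi(t, x).
OrliczFn = Callable[[int, float], float]


def modular(u: Sequence[float], weights: Sequence[float], phi: OrliczFn, lam: float) -> float:
    """Discrete analogue of I_Phi(u / lam) = sum_t w_t * Phi(t, |u_t| / lam)."""
    return sum(w * phi(t, abs(x) / lam) for t, (x, w) in enumerate(zip(u, weights)))


def luxemburg_norm(
    u: Sequence[float], weights: Sequence[float], phi: OrliczFn, tol: float = 1e-12
) -> float:
    """Approximate inf{lam > 0 : I_Phi(u / lam) <= 1} by bisection."""
    if all(x == 0 for x in u):
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow hi until it is feasible; the modular is nonincreasing in lam.
    while modular(u, weights, phi, hi) > 1.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if modular(u, weights, phi, mid) <= 1.0:
            hi = mid  # mid is feasible, so the infimum is at most mid
        else:
            lo = mid  # mid is infeasible, so the infimum is above mid
        if hi - lo <= tol * hi:
            break
    return hi


# Example: with Phi(t, x) = x^2 the Luxemburg norm is the weighted L2 norm.
weights = [0.5, 0.5]
u = [3.0, 4.0]
norm_u = luxemburg_norm(u, weights, lambda t, x: x * x)
```

Choosing Φ(t, x) = x² makes the result easy to verify by hand, since the infimum is then attained at the weighted L2 norm of u.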
In the next section, we use generalized open exponential arcs to build a parametrization of P_µ.

Construction of Generalized ϕ-Families of Probability Distributions
Let (T, Σ, µ) be a σ-finite, non-atomic measure space and consider a deformed exponential function ϕ : T × R → [0, ∞). In other words, ϕ(t, ·) is convex for µ-a.e. t ∈ T and the limits lim_{u→−∞} ϕ(t, u) = 0 and lim_{u→∞} ϕ(t, u) = ∞ hold for µ-a.e. t ∈ T. In this work we consider two additional conditions on the deformed exponential ϕ, denoted by (a1) and (a2).
For a measurable function q : T → (0, 1), we define the q-deformed exponential function exp_q : T × R → [0, ∞) by exp_q(t, u) = [1 + (1 − q(t))u]^{1/(1−q(t))} if u ≥ −1/(1 − q(t)), and exp_q(t, u) = 0 otherwise. In this case, the q-deformed exponential function satisfies condition (a1) with a_ϕ = −1/(1 − q). In the next example, we prove that the q-deformed exponential function satisfies condition (a2) for 0 < q < 1.
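The behaviour of the q-deformed exponential around the threshold a_ϕ can be illustrated with a short sketch. It assumes the standard form exp_q(u) = [1 + (1 − q)u]_+^{1/(1−q)} with a constant q ∈ (0, 1), whereas the paper allows q to be a measurable function of t; the names `exp_q`, `ln_q` and `a_phi` are illustrative.

```python
def exp_q(u: float, q: float) -> float:
    """q-deformed exponential for 0 < q < 1 (assumed standard form).

    Equals [1 + (1 - q) u]^(1 / (1 - q)) when the base is positive,
    and 0 otherwise, i.e. for u <= -1 / (1 - q).
    """
    base = 1.0 + (1.0 - q) * u
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0


def ln_q(x: float, q: float) -> float:
    """q-logarithm (x^(1-q) - 1) / (1 - q) for x > 0; inverts exp_q where exp_q > 0."""
    if x <= 0.0:
        raise ValueError("ln_q is only defined for x > 0")
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def a_phi(q: float) -> float:
    """Threshold a_phi = -1 / (1 - q): exp_q vanishes at and below this point."""
    return -1.0 / (1.0 - q)


q = 0.5
below = exp_q(a_phi(q) - 1.0, q)     # zero: the argument lies below the threshold
at_zero = exp_q(0.0, q)              # one, as for the ordinary exponential
round_trip = ln_q(exp_q(1.0, q), q)  # recovers 1.0 up to rounding
```

For q = 0.5 the function reduces to (1 + u/2)² on u ≥ −2, which makes the convexity and the vanishing below a_ϕ = −2 easy to see.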

Example 1.
Given α ≥ 1, we consider two cases: If u ≤ 0, we have that αu ≤ u. Then, By the convexity of exp_q(t, ·), we obtain for any λ ∈ (0, 1) that Then, any positive function u_0 :
Now, we provide an example of a deformed exponential function that satisfies condition (a1) but does not satisfy condition (a2).

Example 2. Consider the function
where the measure µ is σ-finite and non-atomic. Note that ϕ is convex and satisfies ϕ(x) = 0 for all x < a_ϕ, where a_ϕ = inf{x ∈ R : ϕ(x) > 0}, and lim_{u→∞} ϕ(u) = ∞. We will find a measurable function c : According to [17], there exists a subsequence w_k = v_{m_{n_k}} and pairwise disjoint sets A_k ⊆ E_{m_{n_k}} for which Hence, we can write On the other hand, we also have which shows that (a2) is not satisfied.
According to the proof given in [11], we have κ(α) ≤ 0 for each α ∈ [0, 1]. Indeed, Integrating the inequality we obtain then κ(α) ≤ 0, for α ∈ [0, 1].
Now we define, using generalized exponential arcs, sets that are important for the construction of the generalized ϕ-family of probability distributions. Let us define as p and z are ϕ-connected by an open arc, we have that ( We will show that the set is a generalized ϕ-family of probability distributions. Consider the partition of P_µ into equivalence classes given by the following relation: for p, z ∈ P_µ, we say that p ∼ z if and only if p and z are ϕ-connected by an open arc. This equivalence relation is necessary to define an atlas modeled on Banach spaces.
Define the set For α ∈ (−ε/2, 0), we can write for any α ∈ (−ε/2, 1 + ε/2). Hence, u + v ∈ K_c^ϕ and, since N_c^ϕ is a subspace, we obtain u + v ∈ K_c^ϕ ∩ N_c^ϕ. Condition (14) is important to guarantee that ϕ(c + αu) may be in P_µ. Now, we establish a relationship between the connection by an open arc and K_c^ϕ similar to the one proved in [14].
Proposition 2. Fix p ∈ P_µ. Then z ∈ P_µ is ϕ-connected to p by an open arc if and only if there exist an open interval I ⊃ [0, 1] and a random variable u ∈ L_c^ϕ such that p(α) ∝ ϕ(c + αu) ∈ P_µ for each α ∈ I, with p(0) = p and p(1) = z.
Notice that, as a consequence of Proposition 2, given p, z ∈ P_µ that are ϕ-connected by an open arc, the random variable u belongs to K_c^ϕ ∩ N_c^ϕ. This follows for two reasons: since p, z ∈ P_µ, we have ϕ^{-1}(p), ϕ^{-1}(z) > a_ϕ, and since z is ϕ-connected to p by an open arc, we have ∫_T ϕ(c + αu) dµ < ∞ for each α ∈ (−ε, 1 + ε).

Remark 1.
Since the ϕ-arc is injective, only the case z ≠ p is considered in Proposition 2. Therefore, there exists z ∈ A_c^ϕ such that z ≠ p.
is then well defined. Moreover, V(λ) is strictly increasing.
By Corollary 3, the sets A_c^ϕ are the connected components of P_µ. We therefore need to find a domain for the parametrization whose image is A_c^ϕ. We make considerations similar to the ones in [10]. Note that, for u ∈ K_c^ϕ, ϕ(c + u) is not necessarily in P_µ. Define ψ : K_c^ϕ → R such that the density ϕ(c + u − ψ(u)) is contained in P_µ. The maximal open domain of ψ is contained in K_c^ϕ. Note that ψ is well defined, since c + u − ψ(u) > a_ϕ, µ-a.e. t ∈ T. It can then be proved that ψ : K_c^ϕ → R is convex and, as a consequence, continuous, since K_c^ϕ is open by Lemma 2.
Let ϕ'_+ be the operator acting on the set of real-valued functions u : T → R given by ϕ'_+(u)(t) = ϕ'_+(t, u(t)), where ϕ'_+(t, ·) is the right derivative of ϕ(t, ·). Also, notice that the function ψ : K_c^ϕ → R can assume both positive and negative values. Consider the closed subspace Observe that the image of ψ will be contained in [0, ∞) when the domain of ψ is restricted to B_c^ϕ. By the convexity of ϕ(t, ·), we have Hence, we have that Thus, it follows that ψ(u) ≥ 0 in order for ϕ(c + u − ψ(u)) to be in P_µ.
Let c : T → R be a measurable function such that p = ϕ(c) is a probability density in P_µ. Consider the set M
Proof. Given u ∈ M_c^ϕ, we have c + α(u − ψ(u)) + κ(α) > a_ϕ, µ-a.e. t ∈ T, and ∫_T ϕ(c + αu − (αψ(u) + κ(α))) dµ < 1, for each α ∈ I ⊃ [0, 1]. Hence, for each α ∈ I ⊃ [0, 1], which implies that ϕ(c + u − ψ(u)) ∈ R_c^ϕ. In addition, for each α ∈ I ⊃ [0, 1] and therefore, ϕ(c + u − ψ(u)) ∈ A_c^ϕ.
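The role of the normalizing function ψ can be illustrated on a finite measure space. The sketch below takes ϕ to be the q-exponential with constant q and finds the scalar ψ(u) for which ϕ(c + u − ψ(u)) integrates to one. It assumes that the closed subspace B_c^ϕ consists of the functions u with ∫ u ϕ'_+(c) dµ = 0, as in the ϕ-family literature; that definition is not reproduced in the text above, and the names `normalizing_constant`, `exp_q`, `ln_q` and the four-point example are hypothetical.

```python
from typing import Sequence


def exp_q(u: float, q: float) -> float:
    """q-deformed exponential for 0 < q < 1 (assumed standard form)."""
    base = 1.0 + (1.0 - q) * u
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0


def ln_q(x: float, q: float) -> float:
    """q-logarithm, the inverse of exp_q on (0, infinity)."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def normalizing_constant(
    c: Sequence[float], u: Sequence[float], mu: Sequence[float], q: float, tol: float = 1e-12
) -> float:
    """Scalar psi with sum_i mu_i * exp_q(c_i + u_i - psi) = 1, found by bisection."""

    def mass(s: float) -> float:
        return sum(m * exp_q(ci + ui - s, q) for ci, ui, m in zip(c, u, mu))

    lo, hi = -1.0, 1.0
    while mass(lo) < 1.0:  # the mass grows without bound as the shift decreases
        lo *= 2.0
    while mass(hi) > 1.0:  # the mass tends to zero as the shift increases
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical four-point example with p = exp_q(c) a probability density.
mu = [0.25, 0.25, 0.25, 0.25]
q = 0.5
p = [1.6, 1.2, 0.8, 0.4]
c = [ln_q(pi, q) for pi in p]

# Center u so that sum_i mu_i * u_i * phi'_+(c_i) = 0 (assumed definition of B_c);
# for the q-exponential the derivative at c_i is exp_q(c_i)^q = p_i^q.
w = [m * pi ** q for m, pi in zip(mu, p)]
u_raw = [1.0, -0.5, 0.3, -0.2]
shift = sum(wi * ui for wi, ui in zip(w, u_raw)) / sum(w)
u = [ui - shift for ui in u_raw]

psi = normalizing_constant(c, u, mu, q)
z = [exp_q(ci + ui - psi, q) for ci, ui in zip(c, u)]
```

In this example ψ(u) is non-negative, in agreement with the convexity argument above, and the resulting z is a strictly positive density. For other choices of u the shifted argument may fall below a_ϕ at some points, in which case z is no longer strictly positive and u lies outside the admissible domain.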
Proof. Consider the sets Define the functions
1. The function f is well defined and continuous, since ψ : K_c^ϕ → R is continuous;
2. The map g is well defined in (M
By the continuity of f and g, respectively, there exist ε_1, ε_2 ∈ (0, 1) such that for each Multiplying Equation (18) by ϕ'_+(c_2) and integrating with respect to the measure µ, since the function v is in M_{c_2}^ϕ, we obtain and we can write Hence, the transition map ϕ_{c_2}^{-1} ∘ ϕ_{c_1} can be expressed as for every w ∈ ϕ^{-1} Showing that w and c_1 − c_2 are in L_{c_2}^ϕ and that the spaces L_{c_1}^ϕ and L_{c_2}^ϕ have equivalent norms, we obtain that this transition map is of class C∞.
The next corollary states that the Musielak-Orlicz spaces are equal. The proof follows the one provided in [14].
Proof. We have that z is ϕ-connected to p by an open arc. Then, by Corollary 3, we have that c = c + u − ψ(u). The result follows immediately from [10].
It follows from Corollary 1 that ϕ_{c_2}^{-1} ∘ ϕ_{c_1} is of class C∞, and consequently, the set ϕ^{-1}
Proposition 6 ([14], Proposition 8). The relation given in Definition 2 is an equivalence relation.
Proof. Since reflexivity and symmetry follow immediately from the definition, we only prove transitivity. Let p, z, s ∈ P_µ be such that, Therefore z and s are ϕ-connected.
As a consequence of Corollary 3 and Proposition 6, the ϕ-families A_c^ϕ are maximal, in the sense that A Hence, we can write the following proposition.

The Tangent Bundle
In the previous section, the expression of the transition map ϕ_{c_2}^{-1} ∘ ϕ_{c_1} was important to guarantee that P_µ could be equipped with a C∞-Banach structure. Now, we use the transition map to find the tangent space of P_µ at the point p = ϕ(c) and the tangent bundle.
Given p ∈ P_µ, we consider the triple (A Let us define the following equivalence relation: is called the tangent vector of P_µ at p, and the set of all classes is called the tangent space and is denoted by T_p(P_µ). For more details we refer the reader to [20].
The vector v ∈ ϕ_c^{-1}(A_c^ϕ) is the velocity vector of a curve in the parametrization domain. In fact, consider charts about p ∈ P_µ and a curve g : I ⊂ T → P_µ such that g(t_0) = p, for some t_0 ∈ T. Taking g(t) = ϕ_{c_1}(u_1) = ϕ(c_1 + u_1 − ψ(u_1)), we have u_1(t) = ϕ_{c_1}^{-1}(g(t)). Moreover, g(t) = ϕ_{c_1}(u_1) and u_2(t) = ϕ_{c_2}^{-1}(g(t)). Using random variables we have that Hence, by the chain rule we can write
We denote by τ(P_µ) the tangent bundle, which is defined as the disjoint union of the tangent spaces T_p(P_µ), that is,
Proof. Given w ∈ ϕ^{-1} In fact, by the convexity of ϕ, we have that is µ-integrable, and consequently, Then, it follows from the dominated convergence theorem that (21) holds.
The tangent bundle is then denoted by c ⊂ P µ and v is a tangent vector to ϕ c (u)}.
Its charts are expressed as
In [12], a generalization of the Rényi divergence of order α ∈ (0, 1) was defined in terms of κ(α), where κ(α) satisfies Equation (12). In the case α ∈ {0, 1}, this generalization is defined as a limit. The limits in (28) and (29), under some conditions, are finite-valued and converge to the
The next proposition shows that condition (a2) is necessary and sufficient for connecting two probability densities of P_µ by an open arc.
Proposition 9 ([12], Proposition 1). Let µ be a non-atomic measure. Let ϕ : R → [0, ∞) be a positive, deformed exponential function. Fix any α ∈ (0, 1). The condition (a2) is satisfied if, and only if, given p and z in P_µ, there exists a constant κ(α) := κ(α; p, z) such that
In Example 1, where the measure µ was assumed to be non-atomic, the q-exponential function satisfies condition (a2). Then, by Proposition 9 and Equation (27), we conclude that this function can be used in the generalization of the Rényi divergence. Analogously, the function given in Example 2 cannot be used in the generalization of the Rényi divergence.
Assuming that µ is non-atomic, the next proposition presents an equivalent criterion for a deformed exponential function ϕ to satisfy condition (a2). < ∞, for some λ_0 > 0.

Conclusions
In this paper we constructed a parametrization of the statistical Banach manifold using a deformed exponential function. We found the tangent space of P_µ at p and constructed the tangent bundle of P_µ. We defined the ϕ-divergence where ϕ is the q-exponential function and established a relation between this divergence and the q-divergence defined in [15]. Another important contribution is that the q-exponential and κ-exponential functions can be used to generalize the Rényi divergence. A perspective for future work is to define parallel transport, now that the tangent space has been found. We also intend to construct a parametrization of P_µ using a deformed exponential function satisfying (a1) in the case where, for each measurable function c : T → R with ∫_T ϕ(c) dµ = 1, there exists a measurable function u_{0c} : T → R such that ∫_T ϕ(c + λu_{0c}) dµ < ∞ for each λ > 0.