Abstract
In this paper, we investigate the mixture arc on generalized statistical manifolds. We show that the generalization of the mixture arc is well defined, and we provide a generalization of the open exponential arc and its properties. We use the model of a φ-family of distributions to describe our general statistical model.
1. Introduction
In the geometry of statistical models, information geometry [1,2,3] is the part of probability theory dedicated to investigating probability density functions equipped with a differential-geometric structure. A differential-geometric structure for multi-parameter families of distributions was provided in [4]. In the mid-1980s, other topics related to the subject, such as fiber bundle theory and duality of connections of statistical models, were investigated by Amari [5] and Amari and Nagaoka [6], respectively. In the parametric case, the exponential, mixture and α-connections, as well as their dual structure, are among the most important geometric objects [6], since the dual structure of the α-connections is the key point distinguishing statistical manifolds from arbitrary differential manifolds. Divergence functions are an essential topic in information geometry, for both the parametric and non-parametric cases, since a metric and dual connections can be induced from a divergence [7,8,9,10]. Finding an information-geometrical foundation for multi-parameter families of probability distributions, with a more general description, is one of the topics of interest in information geometry [11,12,13,14].
Non-parametric statistical models [15] are important in a wide range of areas [16,17]. In the parametric case, the manifold of probability density functions obtains a Euclidean topology from the space of its natural parameters. In the non-parametric case, a major challenge is to define a convenient topology and a notion of convergence. Pistone and Sempi [18] were the first to formulate a rigorous infinite-dimensional extension. In that work, the set of all strictly positive probability densities was endowed with the structure of an exponential Banach manifold, using Orlicz spaces associated with a Young function. In a later work [19], more properties of the statistical manifold were studied, specifically regarding the orthogonality condition.
As in the parametric case, the mixture and exponential connections are among the most important geometric objects in non-parametric models. To find these connections, it is necessary to guarantee the existence of the open arcs, which are the geodesics of the manifold. Using the notion of exponential convergence, Gibilisco and Pistone [20] investigated those connections. In that work, the exponential and mixture connections were built in such a way that the relation between them is the same as in the parametric case. Another approach was used in [21], where the mixture arc was additionally studied. Moreover, Grasselli [21] proved that two probability densities in the same neighborhood are connected by an open mixture arc if and only if the difference between their random variables is bounded.
The exponential statistical manifold was later studied in [22], with another system of charts and a statistical model called the maximal exponential model. Cena and Pistone [22] proved that this model is the set of all positive densities connected to a given positive density p by an open exponential arc, and vice versa. In that work, the open mixture arc and the open exponential arc were used to discuss properties of this model, such as the e-connection and m-connection, in the same way as in [6]. This exponential model, together with the open exponential and mixture arcs, was also studied recently by Santacroce et al., 2016 [23] and Santacroce et al., 2017 [24], where a proof of duality properties of statistical models was provided. Examples of applications of non-parametric information geometry to statistical physics using the connection by open arcs were studied in [25].
The generalization of the exponential statistical manifold has been an active topic of research in recent years. Pistone [26] used Kaniadakis' κ-exponential [27] in the construction of a statistical manifold. Vigelis and Cavalcante [28] proposed a φ-family of probability distributions, which generalizes the exponential family. This generalization is based on the replacement of the exponential function by a deformed exponential φ which satisfies some properties and provides the underlying set of probability densities with a Banach manifold structure, the so-called generalized statistical manifold. In [29], a review of non-parametric information geometry with specific issues of the infinite-dimensional setting is provided. In that work, the deformed exponential manifold was studied with a deformed exponential function defined in [30], and a model space was built according to the proposal in [28].
Necessary and sufficient conditions for any two probability distributions to be connected by a φ-arc were given in [31]. In this work, we ensure the existence of a generalized mixture arc for probability distributions in the same φ-family, with a deformed exponential function φ which satisfies some properties. Moreover, we find a generalization of open exponential arcs and we prove, as in [22], that the φ-family is the component connected to a given positive density, and vice versa.
The rest of the paper is organized as follows. In Section 2, we revisit results about Musielak–Orlicz spaces and φ-families of probability distributions. We also briefly recall the subdifferential of a convex function. In Section 3, where we provide our main results, we ensure that the generalized mixture arc is well defined. In Section 4, we discuss the generalized exponential and mixture arcs. Finally, our conclusions and perspectives are stated in Section 5.
2. Preliminary Results
The statistical manifold can be equipped with a structure of $C^\infty$-Banach manifold, using the Musielak–Orlicz space associated with a Musielak–Orlicz function. Each connected component of the statistical manifold gives rise to a φ-family of probability distributions. In this section, we provide an introduction to Musielak–Orlicz spaces and the construction of the φ-family of probability distributions.
2.1. φ-Families of Probability Distributions
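Let $(T, \Sigma, \mu)$ be a σ-finite, non-atomic measure space. A function $\Phi: T \times [0,\infty] \to [0,\infty]$ is said to be a Musielak–Orlicz function if
- (i) $\Phi(t,\cdot)$ is convex and lower semi-continuous for μ-a.e. (almost everywhere) $t \in T$,
- (ii) $\Phi(t,0) = \lim_{u \downarrow 0} \Phi(t,u) = 0$ and $\Phi(t,\infty) = \infty$ for μ-a.e. $t \in T$,
- (iii) $\Phi(\cdot,u)$ is measurable for each $u \geq 0$.
We notice that $\Phi(t,\cdot)$, by (i)-(ii), is not equal to 0 or ∞ on the interval $(0,\infty)$.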
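Let $L^0$ be the linear space of all real-valued, measurable functions on T. Given a Musielak–Orlicz function Φ, we denote the functional $I_\Phi(u) = \int_T \Phi(t, |u(t)|)\,d\mu$, for any $u \in L^0$. The Musielak–Orlicz space, Musielak–Orlicz class, and Morse–Transue space generated by a Musielak–Orlicz function Φ are defined, respectively, by
$$L^\Phi = \{u \in L^0 : I_\Phi(\lambda u) < \infty \text{ for some } \lambda > 0\},$$
$$\tilde{L}^\Phi = \{u \in L^0 : I_\Phi(u) < \infty\},$$
and
$$E^\Phi = \{u \in L^0 : I_\Phi(\lambda u) < \infty \text{ for all } \lambda > 0\}.$$
The Musielak–Orlicz space $L^\Phi$ is a Banach space when it is equipped with the Luxemburg norm given by
$$\|u\|_\Phi = \inf\{\lambda > 0 : I_\Phi(u/\lambda) \leq 1\},$$
or the Orlicz norm, represented as
$$\|u\|_{\Phi,0} = \sup\Bigl\{\Bigl|\int_T uv\,d\mu\Bigr| : v \in \tilde{L}^{\Phi^*} \text{ and } I_{\Phi^*}(v) \leq 1\Bigr\},$$
where $\Phi^*(t,v) = \sup_{u \geq 0}(uv - \Phi(t,u))$ is the Fenchel conjugate of $\Phi(t,\cdot)$, which is also a Musielak–Orlicz function. These norms are equivalent and the inequalities $\|u\|_\Phi \leq \|u\|_{\Phi,0} \leq 2\|u\|_\Phi$ hold for all $u \in L^\Phi$ [32]. A Musielak–Orlicz function Φ is said to satisfy the $\Delta_2$-condition, or belong to the $\Delta_2$-class (denoted by $\Phi \in \Delta_2$), if we can find a constant $K > 0$ and a non-negative function $f \in \tilde{L}^\Phi$ such that
$$\Phi(t, 2u) \leq K\,\Phi(t,u), \quad \text{for all } u \geq f(t), \text{ and } \mu\text{-a.e. } t \in T.$$
If the Musielak–Orlicz function Φ satisfies the $\Delta_2$-condition, then $I_\Phi(u) < \infty$ for every $u \in L^\Phi$ [32]. In this case $L^\Phi$, $\tilde{L}^\Phi$ and $E^\Phi$ are equal as sets. Moreover, if the Musielak–Orlicz function Φ does not satisfy the $\Delta_2$-condition, $E^\Phi$ is a proper subspace of $L^\Phi$. Every Musielak–Orlicz function that satisfies the $\Delta_2$-condition is finite-valued. Indeed, we define
$$b_\Phi(t) = \sup\{u \geq 0 : \Phi(t,u) < \infty\},$$
and assuming that $b_\Phi(t) < \infty$, we get $\Phi(t,u) = \infty$ for all $u > b_\Phi(t)$, which implies that Φ cannot satisfy the $\Delta_2$-condition. For more information see, for instance, [32,33].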
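To make the Luxemburg norm concrete, the following minimal sketch computes it by bisection on a finite measure space with point masses. This is only an illustration under stated assumptions: the paper works with a non-atomic measure space, and the helper names `modular` and `luxemburg_norm`, as well as the choice $\Phi(u) = u^2$, are hypothetical and introduced here for the example. For $\Phi(u) = u^p$ the Luxemburg norm coincides with the usual $L^p$ norm, which gives an easy sanity check.

```python
def modular(phi, u, weights, lam):
    """I_Phi(u / lam) on a finite measure space with point masses `weights`."""
    return sum(w * phi(abs(x) / lam) for x, w in zip(u, weights))


def luxemburg_norm(phi, u, weights, tol=1e-12):
    """Smallest lam > 0 with I_Phi(u / lam) <= 1, located by bisection."""
    if all(x == 0 for x in u):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(phi, u, weights, hi) > 1.0:  # enlarge until feasible
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(phi, u, weights, mid) <= 1.0:
            hi = mid   # mid is feasible, shrink from above
        else:
            lo = mid   # mid is infeasible, raise the lower end
    return hi


square = lambda x: x * x  # hypothetical choice Phi(u) = u^2 (assumption)
norm = luxemburg_norm(square, [3.0, 4.0], [1.0, 1.0])
```

With unit point masses and $\Phi(u) = u^2$, the value `norm` should agree with the Euclidean length of the vector, up to the bisection tolerance.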
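We say that a Musielak–Orlicz function Φ satisfies the $\nabla_2$-condition, or belongs to the $\nabla_2$-class, if we can find a constant $\gamma > 1$ and a non-negative function $f \in \tilde{L}^\Phi$ such that
$$\Phi(t,u) \leq \frac{1}{2\gamma}\Phi(t,\gamma u), \quad \text{for all } u \geq f(t), \text{ and } \mu\text{-a.e. } t \in T.$$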
We notice that, if , then
Example 1.
The function defined by:
satisfies the -condition and does not satisfy the -condition.
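As a rough numerical illustration of how a doubling-type growth condition can fail, the sketch below compares the ratio $\Phi(2u)/\Phi(u)$ for two functions that do not depend on t. Both functions are hypothetical choices made for this example (they are not the function of Example 1): a power function, whose ratio stays constant, and an exponential-type function, whose ratio grows without bound so that no constant K can work for large u.

```python
import math


def doubling_ratio(phi, u):
    """Ratio Phi(2u) / Phi(u) that a doubling condition asks to stay bounded."""
    return phi(2.0 * u) / phi(u)


power = lambda u: u ** 3                   # assumed example with bounded ratio
expo = lambda u: math.exp(u) - u - 1.0     # assumed example with unbounded ratio

grid = [1.0, 5.0, 10.0, 20.0, 40.0]
power_ratios = [doubling_ratio(power, u) for u in grid]
expo_ratios = [doubling_ratio(expo, u) for u in grid]
```

A finite grid can only suggest the behaviour; it does not prove that a condition holds or fails.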
The (topological) dual space of , is denoted by and represented in the following way [32,34,35]
where is the set of the order continuous functionals and is formed by singular components. If the Musielak–Orlicz function then all functionals in are order continuous and represented by
Otherwise, if , then the functionals f in can be uniquely expressed as
where is the order continuous component and is the singular component.
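While exponential families are based on the exponential function, φ-families are based on deformed exponential functions. A deformed exponential is a function $\varphi: T \times \overline{\mathbb{R}} \to [0,\infty]$ that satisfies the following properties, for μ-a.e. $t \in T$ [28]:
- (i) $\varphi(t,\cdot)$ is convex and injective;
- (ii) $\varphi(t,-\infty) = 0$ and $\varphi(t,\infty) = \infty$;
- (iii) There exists a measurable function $u_0: T \to (0,\infty)$ such that
$$\int_T \varphi(c + \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda > 0,$$
for every measurable function $c: T \to \mathbb{R}$ for which $\int_T \varphi(c)\,d\mu = 1$.
In de Souza et al. [36], Lemma 1, it was shown that the constraint $\int_T \varphi(c)\,d\mu = 1$ can be replaced by $\int_T \varphi(c)\,d\mu < \infty$. Thus, the condition (iii) can be rewritten as:
- (iii’) There exists a measurable function $u_0: T \to (0,\infty)$ such that
$$\int_T \varphi(c + \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda > 0,$$
for every measurable function $c: T \to \mathbb{R}$ for which $\int_T \varphi(c)\,d\mu < \infty$.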
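There are many examples of deformed exponential functions. An example of relevance is the exponential function $\varphi(u) = \exp(u)$, which satisfies (i)-(iii) with $u_0 = 1_T$. Another example is Kaniadakis’ κ-exponential [26,27,28]:
Example 2.
The Kaniadakis’ κ-exponential $\exp_\kappa$ for $\kappa \in [-1,1]$ is defined as
$$\exp_\kappa(u) = \begin{cases} \bigl(\kappa u + \sqrt{1 + \kappa^2 u^2}\bigr)^{1/\kappa}, & \text{if } \kappa \neq 0, \\ \exp(u), & \text{if } \kappa = 0. \end{cases}$$
The inverse of $\exp_\kappa$ is the Kaniadakis’ κ-logarithm
$$\ln_\kappa(u) = \begin{cases} \dfrac{u^\kappa - u^{-\kappa}}{2\kappa}, & \text{if } \kappa \neq 0, \\ \ln(u), & \text{if } \kappa = 0. \end{cases}$$
One can easily notice that the κ-exponential satisfies (i)–(iii) [28,36].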
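The κ-exponential and κ-logarithm are easy to experiment with numerically. The following minimal sketch uses the standard closed forms of Kaniadakis' functions; the function names `exp_kappa` and `log_kappa` and the sample points are assumptions made for this illustration. It evaluates a round trip through both functions and a midpoint-convexity check on a few points.

```python
import math


def exp_kappa(u, kappa):
    """Kaniadakis kappa-exponential; the case kappa = 0 is the ordinary exp."""
    if kappa == 0.0:
        return math.exp(u)
    return (kappa * u + math.sqrt(1.0 + (kappa * u) ** 2)) ** (1.0 / kappa)


def log_kappa(v, kappa):
    """Kaniadakis kappa-logarithm, the inverse of exp_kappa on (0, inf)."""
    if kappa == 0.0:
        return math.log(v)
    return (v ** kappa - v ** (-kappa)) / (2.0 * kappa)


samples = [-3.0, -1.0, 0.0, 0.5, 2.0]
round_trip = [log_kappa(exp_kappa(u, 0.5), 0.5) for u in samples]

# Midpoint convexity on consecutive sample points (a spot check, not a proof).
midpoint_ok = all(
    exp_kappa(0.5 * (a + b), 0.5)
    <= 0.5 * (exp_kappa(a, 0.5) + exp_kappa(b, 0.5)) + 1e-12
    for a, b in zip(samples, samples[1:])
)
```

For large negative arguments the closed form suffers from cancellation, so a production implementation would likely prefer an equivalent expression based on the inverse hyperbolic sine.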
The Musielak–Orlicz function
for a measurable function such that is -integrable, was defined in [28]. Thus, the sets , and are denoted by , and , respectively, when is given by (5). Let
be the collection whose φ-family is a subset, where is the linear space of all real-valued, measurable functions on T. For each probability density , we have an associated φ-family of probability distributions, according to
where the set is the intersection of the convex set
with the closed subspace
that is . The normalizing function is introduced so that expression (6) is a probability distribution in . If the Musielak–Orlicz function does not satisfy the $\Delta_2$-condition, then the boundary of , the set , is not empty. A function belongs to if and only if for all , and for each . The behavior of the normalizing function near the boundary was studied in [33,37].
It was shown in [28] that the normalizing function is convex. Assuming that the deformed exponential is continuously differentiable, the normalizing function is Gâteaux-differentiable and its Gâteaux derivative is
with and .
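The role of the normalizing function can be illustrated on a finite sample space, which is an assumption of this sketch (the paper uses a non-atomic measure space). Here the normalizing value ψ is taken to be the number that makes $\varphi(c + u - \psi u_0)$ integrate to one, and its directional derivative is obtained by implicit differentiation of that constraint. All names (`normalizing_function`, `gateaux_derivative`) and the choices $\varphi = \exp$, $u_0 = 1$ are hypothetical; in that special case ψ reduces to the familiar cumulant generating function.

```python
import math


def normalizing_function(phi, c, u, u0, weights, iters=200):
    """Solve sum_i phi(c_i + u_i - psi * u0_i) * w_i = 1 for psi by bisection."""
    def mass(psi):
        return sum(w * phi(ci + ui - psi * zi)
                   for ci, ui, zi, w in zip(c, u, u0, weights))
    lo, hi = -1.0, 1.0
    while mass(lo) < 1.0:   # mass is decreasing in psi because u0 > 0
        lo *= 2.0
    while mass(hi) > 1.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gateaux_derivative(dphi, c, u, v, u0, weights, psi):
    """Derivative of psi at u along v, from differentiating the constraint."""
    slopes = [dphi(ci + ui - psi * zi) for ci, ui, zi in zip(c, u, u0)]
    num = sum(w * vi * s for vi, s, w in zip(v, slopes, weights))
    den = sum(w * zi * s for zi, s, w in zip(u0, slopes, weights))
    return num / den


p = [0.2, 0.3, 0.5]                       # reference density, p = exp(c)
weights = [1.0, 1.0, 1.0]                 # counting measure (assumption)
c = [math.log(pi) for pi in p]
u0 = [1.0, 1.0, 1.0]
raw = [1.0, -0.5, 0.2]
mean = sum(r * pi for r, pi in zip(raw, p))
u = [r - mean for r in raw]               # centred so that sum(u * p) = 0
v = [0.3, -0.1, 0.4]                      # direction of differentiation

psi = normalizing_function(math.exp, c, u, u0, weights)
deriv = gateaux_derivative(math.exp, c, u, v, u0, weights, psi)
```

A finite-difference quotient of `normalizing_function` along `v` can be compared with `deriv` to check the derivative formula numerically.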
In the next section, we recall some differentiability properties of convex functions on infinite dimensional spaces.
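2.2. The Subdifferential of a Convex Function
In this section, we discuss some properties of extended real-valued convex functions on Banach spaces, i.e., functions with values in $\overline{\mathbb{R}} = [-\infty, \infty]$. Mainly, we recall subdifferentials of lower semicontinuous convex functions and their properties.
Let E be a Banach space. A function $f: E \to \overline{\mathbb{R}}$ is a convex function on E if its epigraph [38]
$$\operatorname{epi} f = \{(x, \lambda) \in E \times \mathbb{R} : f(x) \leq \lambda\}$$
is a convex set. If $f(x) > -\infty$ for every x and $f(x) < \infty$ for at least one value of x, we call f a proper function. The set
$$\operatorname{dom} f = \{x \in E : f(x) < \infty\}$$
denotes the effective domain of f. A function f is said to be lower semicontinuous (l.s.c.) if for every $\lambda \in \mathbb{R}$ the set
$$\{x \in E : f(x) \leq \lambda\}$$
is closed.
Let $E^*$ be the dual space of E. A vector $x^* \in E^*$ is said to be a subgradient of f at x if
$$f(y) \geq f(x) + \langle x^*, y - x \rangle, \quad \text{for all } y \in E.$$
We denote by $\partial f(x)$ the set of subgradients of f at x, and the subdifferential of f is the multivalued mapping $\partial f$ from E to $E^*$. By definition, $\partial f(x)$ is always a closed convex subset of $E^*$ for each x. Suppose f is a convex function finite at x. One has $x^* \in \partial f(x)$ if and only if
$$f'(x; h) \geq \langle x^*, h \rangle, \quad \text{for all } h \in E,$$
where
$$f'(x; h) = \lim_{\lambda \downarrow 0} \frac{f(x + \lambda h) - f(x)}{\lambda}$$
is the directional derivative of f at x in direction h. The subdifferential may be empty at points of $\operatorname{dom} f$, so we denote by
$$\operatorname{dom} \partial f = \{x \in E : \partial f(x) \neq \emptyset\}$$
the domain of $\partial f$ and we have that $\operatorname{dom} \partial f \subseteq \operatorname{dom} f$. We say that f is subdifferentiable at x for all $x \in \operatorname{dom} \partial f$.
Let f be a lower semicontinuous proper convex function, then $\operatorname{int} \operatorname{dom} f \subseteq \operatorname{dom} \partial f$ [39] (Corollary 2.38). The conjugate of f is the function $f^*: E^* \to \overline{\mathbb{R}}$ defined by
$$f^*(x^*) = \sup_{x \in E} \{\langle x^*, x \rangle - f(x)\}. \quad (9)$$
Observe that, if f is proper, then “sup” in Equation (9) may be restricted to the points $x \in \operatorname{dom} f$. The conjugate $f^*$ is a convex and lower semicontinuous function on $E^*$ and, jointly with f, satisfies the well-known Young’s inequality
$$f(x) + f^*(x^*) \geq \langle x^*, x \rangle, \quad (10)$$
with equality holding if and only if $x^* \in \partial f(x)$. If f is a lower semicontinuous function, the subdifferential of the conjugate function, $\partial f^*$, coincides with $(\partial f)^{-1}$ ([39], Proposition 2.33).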
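The conjugate function and Young's inequality can be checked numerically in one dimension. The sketch below is a simplified illustration, assuming $E = \mathbb{R}$ and replacing the supremum by a maximum over a finite grid; the helper names `conjugate_on_grid` and `young_gap` and the test function $f(x) = x^2/2$ (whose conjugate is again $y^2/2$) are choices made for this example.

```python
def conjugate_on_grid(f, xs):
    """Return y -> max over the grid xs of (x * y - f(x))."""
    def f_star(y):
        return max(x * y - f(x) for x in xs)
    return f_star


def young_gap(f, f_star, x, y):
    """f(x) + f*(y) - x*y, which Young's inequality says is non-negative."""
    return f(x) + f_star(y) - x * y


f = lambda x: 0.5 * x * x
xs = [i / 100.0 for i in range(-500, 501)]   # grid on [-5, 5]
f_star = conjugate_on_grid(f, xs)

gaps = [young_gap(f, f_star, x, y) for x in xs[::50] for y in (-2.0, 0.0, 0.7, 3.0)]
equality_gap = young_gap(f, f_star, 2.0, 2.0)   # here y equals the gradient f'(x) = x
```

The gap vanishes exactly when the second argument is a subgradient of f at the first, which is the equality case stated above.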
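It is known that, if f is a lower semicontinuous proper convex function, then
$$\operatorname{int} \operatorname{dom} f \subseteq \operatorname{dom} \partial f \subseteq \operatorname{dom} f,$$
and it was shown in [40] that $\operatorname{dom} \partial f$ is, in fact, dense in $\operatorname{dom} f$.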
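Fact 1 (([41], Corollary 2.19), ([42], Corollary 7.2.3)). Suppose $x \in \operatorname{dom} \partial f$. Then $x \in \operatorname{int} \operatorname{dom} f$ if and only if $\partial f$ is locally bounded at x.
Fact 2 (([41], Lemma 2.20), ([42], Lemma 7.2.4)). If $x \in \operatorname{dom} \partial f$ and x lies on the boundary of $\operatorname{dom} f$, then $\partial f(x)$ is unbounded.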
The subdifferential of a convex function is closely related to the Gâteaux gradient. If the convex function f is Gâteaux-differentiable at a point x, then $\partial f(x)$ consists of a single element $\nabla f(x)$ ([39], Proposition 2.40), where $\nabla f(x)$ is the Gâteaux gradient of f at x.
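A one-dimensional example may help to see the difference between a point where the subdifferential is a whole set and a point where it reduces to the gradient. The sketch below assumes $E = \mathbb{R}$ and tests the subgradient inequality on a finite grid only; the helper `is_subgradient` and the choice $f(x) = |x|$ are made for this illustration.

```python
def is_subgradient(f, x, g, ys, tol=1e-12):
    """Check f(y) >= f(x) + g * (y - x) for every y on the grid ys."""
    return all(f(y) >= f(x) + g * (y - x) - tol for y in ys)


ys = [i / 10.0 for i in range(-50, 51)]      # grid on [-5, 5]

# At the kink x = 0 every slope in [-1, 1] passes; larger slopes fail.
at_kink = [g for g in (-1.0, -0.5, 0.0, 0.5, 1.0, 1.2)
           if is_subgradient(abs, 0.0, g, ys)]

# At x = 1 the function is differentiable and only the derivative passes.
at_smooth = [g for g in (0.9, 1.0, 1.1) if is_subgradient(abs, 1.0, g, ys)]
```

Passing the test on a grid is a necessary condition only, so this is a sanity check rather than a proof.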
In the next section, we investigate the subdifferential of the normalizing function ψ. This result will be used to prove that the generalized mixture arc is well defined, which is one of our main goals in this work.
3. Construction of Generalized Mixture Arcs
The normalizing function ψ is convex and Gâteaux-differentiable, and its derivative is given by Equation (8). With these facts in mind, we can express the generalized mixture arc as:
where
and belong to a -family . We can rewrite the functional as
with and Equation (13) is the Gâteaux-gradient of . Thus, for the generalized mixture arc to be well defined, it is necessary that the set of these functionals in Equation (13) be convex. As mentioned in Section 2.2, the subdifferential and Gâteaux-gradient are closely related. For this reason, we investigate the subdifferential of .
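The idea of taking convex combinations of gradient functionals can be sketched on a finite sample space. The code below rests on several assumptions that are not taken from the paper's equations: each functional is represented by the weight vector $\varphi'(\varphi^{-1}(p))$ normalized so that its pairing with $u_0$ equals one, the measure is a counting measure, and the helper names `gradient_weights`, `mix`, and `pairing` are hypothetical. For $\varphi = \exp$ and $u_0 = 1$ the weights reduce to the density itself, so the construction reduces to the classical mixture $(1-\lambda)p + \lambda q$.

```python
import math


def gradient_weights(dphi, phi_inv, p, u0, weights):
    """Weights of v -> sum(v * phi'(phi^{-1}(p))) / sum(u0 * phi'(phi^{-1}(p)))."""
    raw = [dphi(phi_inv(pi)) for pi in p]
    norm = sum(w * zi * r for zi, r, w in zip(u0, raw, weights))
    return [r / norm for r in raw]


def mix(fp, fq, lam):
    """Convex combination of two functionals given by their weight vectors."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(fp, fq)]


def pairing(f, v, weights):
    """Value of the functional with weight vector f at the function v."""
    return sum(w * fi * vi for fi, vi, w in zip(f, v, weights))


p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
u0 = [1.0, 1.0, 1.0]
weights = [1.0, 1.0, 1.0]

fp = gradient_weights(math.exp, math.log, p, u0, weights)
fq = gradient_weights(math.exp, math.log, q, u0, weights)
arc = [mix(fp, fq, lam) for lam in (0.0, 0.25, 0.5, 1.0)]
```

Every point of `arc` keeps the normalization against `u0`, which is the property a convex combination of such functionals is expected to preserve.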
3.1. Subdifferential of the Normalizing Function
Assuming that the Musielak–Orlicz function (5) does not satisfy the $\Delta_2$-condition, the boundary of the domain of the normalizing function ψ is non-empty [33]. The effective domain of the normalizing function ψ is
where is the set of points in the boundary of such that . The behavior of the normalizing function near the boundary of was discussed in [33]. We need to know the subdifferential of ψ, so we first prove some of its properties, starting with our first result.
Proposition 1.
The normalizing function ψ is lower semicontinuous.
Proof.
Given , let be the set . To prove the statement, it suffices to show that is closed. We define a set
and we are going to prove that B is a closed set and that . Let be a sequence which belongs to B, such that . This way, , -a.e. Since is a continuous function, we have that , -a.e. From Fatou’s Lemma, it follows that
thus, and B is a closed set. Now, we prove that . Let u be a function which belongs to , then . The function is a strictly increasing function, so that
thus, .
Suppose that there exists , then , which implies that and , which implies that . Then
thus . This contradicts the assumption that . Therefore, and is closed. ☐
The subdifferential of at a function is the set
where denotes the dual space of . We know that, for all the normalizing function is Gâteaux-differentiable and the Gâteaux-gradient is given by Equation (13). Hence, consists of a single element and is given by
In fact, we prove below that Equation (13) belongs to , for all .
Proposition 2.
Proof.
We have that the functional (16) belongs to . Let v be a function in such that . In other words, , so we have that and . Thus, by the convexity of , we have
Thus,
and
If , then , and
Consequently, Inequality (17) holds for all and the result follows. ☐
We need to find the subdifferential of for u in the set . We know that is a proper lower semicontinuous convex function, so
where and . As we have that , then for , is unbounded.
Since we are interested in proving that the set of functionals in Equation (13) is convex, and these functionals are order continuous, we need to analyze only the order continuous part of the subdifferential, i.e., the part of the subdifferential that belongs to . We need to investigate whether the functional in Equation (16) belongs to , for . For this, we will use the following results.
Lemma 1
([35], Lemma 3.11). Let be a Musielak–Orlicz function that does not satisfy the -condition. In addition, assume that for μ-a.e. . Then there exist a strictly increasing sequence , and sequences and of finite-valued, non-negative, measurable functions, and pairwise disjoint, measurable sets, respectively, such that
Proposition 3
([43], Proposition 2.3). Let Φ and Ψ be Musielak–Orlicz functions. Suppose that, for constants α, , there exists an integrable function such that
Then, for constants and , or and , a non-negative function can be found such that
Lemma 2.
Let and denote the complementary functions to the Musielak–Orlicz functions Φ and Ψ, respectively. Suppose that, for constants , there exists a non-negative function such that
Then, for constants and , or and , a non-negative function can be found such that
Proof.
Defining the function , we can write
Calculating the Fenchel conjugate of the functions in the inequality above, we obtain
From Proposition 3, we infer that Equation (19) is satisfied. ☐
Lemma 3.
The $\Delta_2$-condition is equivalent to the statement that, for every , there exist a constant , and a non-negative function such that
The $\nabla_2$-condition is equivalent to the statement that, for any , there exist a constant , and a non-negative function such that
Proof.
Suppose it satisfies the -condition. If the natural number is such that , then , for all . Conversely, if satisfies Equation (20) and the natural number is chosen so that , then , for all .
The next result follows from Lemmas 2 and 3.
Theorem 1.
A Musielak–Orlicz function Φ satisfies the $\Delta_2$-condition if, and only if, its complementary function $\Phi^*$ satisfies the $\nabla_2$-condition.
Proposition 4.
Let be a Musielak–Orlicz function that does not satisfy the -condition and that for μ-a.e. . Then we can find a non-negative function such that .
Proof.
Let , and be given as in Lemma 1. Select a subsequence for which the series converges, and for all . Because is continuous for , we can find such that . Define . Then, we can write
and
Hence, it follows that
which concludes the proof. ☐
The previous proposition makes it clear that we can find a , but . Let u be as in Proposition 4, clearly for , and for , ([35], Remark 3.12).
Proposition 5.
Let be a Musielak–Orlicz function such that, satisfies -condition, does not satisfy -condition and . Then we can find such that
where is the Musielak–Orlicz class of , the conjugate of .
Proof.
Take and denote , then we define . We can choose such that
satisfies . In other words, . It is easy to see that for and for , so . It remains to show Equation (22). From Proposition 4 we have that
since
Thus,
consequently . Since , we have that and therefore . We conclude that . Since is a linear set, we have that Equation (22) holds. ☐
As a consequence of Proposition 5, we have that it is possible to find such that
and therefore the functional in Equation (16) does not belong to .
We conclude in this section that, if the functional
belongs to , then the functional belongs to for .
In the next section, we finally prove that the set of functionals formed by the Gâteaux gradients of the normalizing function that belong to is convex, so we can guarantee that the generalized mixture arc is well defined.
3.2. Convexity of the Functionals Set
We already know that, for the generalized mixture arc in Equation (11) to be well defined, it is necessary that the set of functionals
be convex. From Proposition 2, the set in Equation (24) is contained in the range of , the set given by
Let $\psi^*$ be the conjugate function of ψ. Since ψ is a l.s.c. proper convex function, and are convex sets and the range of is the effective domain of , since . Thus
is the same as
To prove that the set in Equation (24) is convex, we analyze the set in Equation (25) in three cases. Let be elements in Equation (25) such that
- Case 1.
- , so by convexity of , for , we have .
- Case 2.
- If and , then , for ([41], Fact 2.1).
- Case 3.
- Let be elements in Equation (25) belonging to .
We want to prove that, for , belongs to Equation (25). To solve this problem, we are going to prove that . Supposing that the deformed exponential φ is strictly convex, the normalizing function ψ is also strictly convex. In the next proposition, we show that is a singleton.
Proposition 6.
Let ψ be a strictly convex function, then is a singleton, where , with .
Proof.
Assuming that is a strictly convex function we have that for and
Supposing that is not a singleton, i.e., , where , . Taking , . By Young’s Inequality (10)
where and as a consequence of we have
and
Thus, the set is a singleton, so is locally bounded at and, therefore, by Fact 1, we conclude that , which implies that . By Equation (26), we have that
Therefore, by Fact 2, there exists no functional in Equation (25) such that . Thus Equation (25) is a convex set and, as a consequence, the generalized mixture arc is well defined, since the set in Equation (24) is a convex set. Indeed, let u, v be functions in such that
and
belong to Equation (24). Clearly,
and
We note that, the functionals in Equation (24) are the only elements in Equation (25) that satisfy . For we have
then there exist functions such that
Thus, the set in Equation (24) is a convex set.
In this section, we proved that the generalized mixture arc is well defined for a strictly convex deformed exponential. In the next section, we discuss generalized open exponential arcs and generalized open mixture arcs.
4. Generalized Arcs
De Souza et al. [36] defined the concept of arc-connected probability distributions. Fixing any deformed exponential φ, we say that two probability distributions are φ-connected if, for each , there exists such that
In [31], necessary and sufficient conditions for any probability distributions to be φ-connected were provided. In this section, we discuss the concept of two probability distributions being φ-connected by open arcs. We generalize open exponential arcs and open mixture arcs, defined in [22] and studied later in [23].
4.1. Generalized Open Exponential Arcs
Let us define the generalized open arcs and prove some of their properties.
Definition 1.
For a fixed deformed exponential φ, we say that p and q in are φ-connected by an open arc if there exists an open interval and a constant such that
belongs to for every , where depends on .
In the following proposition, we give an equivalent definition of φ-connection by an open arc.
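A generalized exponential-type arc can be explored numerically on a finite sample space. The sketch below is an illustration under explicit assumptions, not the paper's construction: it interpolates linearly between $\varphi^{-1}(p)$ and $\varphi^{-1}(q)$, adds a constant multiple of $u_0$ chosen so that the result integrates to one, and uses the κ-exponential with κ = 0.5, $u_0 = 1$, and a counting measure. The names `arc_point`, `exp_k`, and `log_k` are hypothetical. The point is that the interpolation parameter t can be taken slightly outside [0, 1], which is what an open arc requires.

```python
import math

KAPPA = 0.5  # assumed deformation parameter for this illustration


def exp_k(u):
    return (KAPPA * u + math.sqrt(1.0 + (KAPPA * u) ** 2)) ** (1.0 / KAPPA)


def log_k(v):
    return (v ** KAPPA - v ** (-KAPPA)) / (2.0 * KAPPA)


def arc_point(p, q, t, u0, weights, iters=200):
    """Density exp_k((1-t) log_k p + t log_k q + k * u0) with k fixed by normalization."""
    base = [(1.0 - t) * log_k(pi) + t * log_k(qi) for pi, qi in zip(p, q)]

    def mass(k):
        return sum(w * exp_k(b + k * zi) for b, zi, w in zip(base, u0, weights))

    lo, hi = -1.0, 1.0
    while mass(lo) > 1.0:   # mass is increasing in k because u0 > 0
        lo *= 2.0
    while mass(hi) < 1.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return [exp_k(b + k * zi) for b, zi in zip(base, u0)], k


p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
u0 = [1.0, 1.0, 1.0]
weights = [1.0, 1.0, 1.0]

densities = {t: arc_point(p, q, t, u0, weights)[0] for t in (-0.5, 0.0, 0.5, 1.0, 1.5)}
```

On a finite space the arc exists for every real t; the interesting restrictions discussed in this section only appear in the infinite-dimensional setting.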
Proposition 7.
are φ-connected by an open arc if and only if there exist an open interval and a random variable , such that belongs to , for all and and .
Proof.
Let us assume that are -connected, i.e., , for all Since
where and , then . Moreover, belongs to , for every and and . The converse follows immediately. Suppose that , we have , then , with . ☐
Because of the need to define the open arcs arises. As a consequence of Proposition 7, we have that if are φ-connected by an open arc, then the random variable , since for all . With this, we can prove the following results.
Corollary 1.
Let , where . We have that if and only if, p and q are φ-connected by an open arc.
Proof.
Supposing , then where . Thus, we have for all , we deduce that is an open arc containing p and q. Conversely, supposing that p and q are -connected by an open arc, by Proposition 7, there exist an open interval and such that belongs to with . If , then and the proof is over. Otherwise, let w be such that
thus and . Hence, we have and . ☐
With this, we prove that, for , the φ-family of probability distributions is the set of all such that q is φ-connected by an open arc to p.
Corollary 2.
Let and be such that are φ -connected by an open arc. Then the spaces and are equal as sets.
Proof.
It follows from Corollary 1 that p and q are in the same φ-family, then , and the result follows from Lemma 5 of Vigelis and Cavalcante [28]. ☐
Now, we show that the connection by generalized open exponential arcs is an equivalence relation.
Proposition 8.
The relation in Definition 1 is an equivalence relation.
Proof.
Reflexivity and symmetry follow from the definition, so we only prove transitivity. Consider
with , , , with . We have that p is φ-connected to q and r, respectively. We need to prove that q and r are also φ-connected. Consider
is defined with , , , such that , . Therefore, q and r are φ-connected. ☐
We know from Corollary 1 that the φ-family coincides with the set of all which are φ-connected to p by an open arc. We now want to prove that the φ-family is convex for some deformed exponential φ.
Lemma 4.
Let φ be a fixed deformed exponential. Assuming that is continuous and
then , for some fixed and is a convex function.
Proof.
We know that, if and , then is a convex function. We have
by the fact that is an increasing function . Hence, we have if and only if
which follows from Equation (37). ☐
Proposition 9.
Let such that . Assuming that is continuous and
for some fixed and . Then, the φ-family of probability distributions is convex.
Proof.
Note that, for any , . Suppose , and consider for any . We show that by proving that for . In other words, we will show that and p are φ-connected for all .
For , due to the convexity of , we have
If , according to the convexity of and , we have
since , we have by Corollary 1 that q and p are φ-connected. Hence,
so and p are φ-connected by an open arc, for all .
Now, if , by Lemma 4, is a convex function, so
where and k is a constant. Taking , we have
since and, therefore, p and q are φ-connected by an open arc. ☐
4.2. Generalized Open Mixture Arcs
In Section 3.2, we proved that the generalized mixture arc given by
is well defined for . In this section, our goal is twofold: first, to ensure that the open arc is also well defined; and, second, to provide some properties of these arcs. To this end, we use Equation (34), which establishes that is an open set, so we can extend the convex combination in Equation (40) between and beyond these extreme points while maintaining positivity of . Indeed, since is an open set, there exists such that is the open ball of radius centered at with . Similarly, there exists such that . Taking , we guarantee that the combination in (40) can be extended to .
Definition 2.
For a fixed deformed exponential φ, we say that p and q in are φ-connected by an open mixture arc if there exists an open interval such that
belongs to for every , where .
In [22], it was shown that densities connected by open mixture arcs have ratios bounded away from zero. Santacroce et al. [23] showed the converse implication, providing a characterization of open mixture models. Here, one can see that the fundamental condition for being connected by open mixture arcs is that certain ratios be bounded. The functional in the definition of the generalized open mixture arc satisfies . Thus the combination in (41) has to satisfy the same property, that is, . Assume that p and q are φ-connected by an open mixture arc, so that the combination in (41) belongs to for all with . Since and , then
which implies that
and
which give us
Conversely, if we have Equation (44), then and Equation (41) belongs to . Thus, we have that p and q in are φ-connected by an open mixture arc if and only if the ratio is bounded. Since is an open set, there exists an interval such that belongs to and we have
for all . Then, there exist functions such that
with
that is, the convex combination in Equation (41) is also a functional of the type in Equation (12) for all . Then, the open mixture arc is well-defined. Another property of this connection by generalized open mixture arc is that it is an equivalence relation.
Proposition 10.
The relation in Definition 2 is an equivalence relation.
Proof.
Reflexivity and symmetry follow from the definition. As for transitivity, consider such that and with for some . We can take and , and define a probability distribution
If we have and , we may define a probability distribution as
The generalized open mixture arc, , , connects and . ☐
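The link between bounded ratios and open mixture arcs, recalled in this section from [22,23], can be seen in the classical case on a finite sample space. The sketch below is an illustration under that assumption only (ordinary densities and the plain mixture $(1-\alpha)p + \alpha q$, not the generalized functionals of this paper); the function name `open_mixture_interval` is hypothetical. It computes the largest open interval of α on which the mixture stays strictly positive; the endpoints are determined by the largest values of q/p and p/q.

```python
def open_mixture_interval(p, q):
    """Largest open interval of alpha with (1 - alpha) * p + alpha * q > 0 pointwise."""
    lower, upper = float("-inf"), float("inf")
    for pi, qi in zip(p, q):
        d = qi - pi                      # mixture value is pi + alpha * d
        if d > 0:
            lower = max(lower, -pi / d)  # positivity needs alpha > -pi / d
        elif d < 0:
            upper = min(upper, -pi / d)  # positivity needs alpha < pi / (pi - qi)
    return lower, upper


p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
lower, upper = open_mixture_interval(p, q)

max_q_over_p = max(qi / pi for pi, qi in zip(p, q))
max_p_over_q = max(pi / qi for pi, qi in zip(p, q))
```

When both ratios are bounded, the interval strictly contains [0, 1]; if one of the ratios were unbounded, the corresponding endpoint would collapse onto 0 or 1 and no open arc would exist.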
5. Conclusions
In this work, we have generalized the open exponential arc and the open mixture arc for probability distributions. Moreover, we have ensured that the generalization of the open mixture arc is well defined for a strictly convex deformed exponential. From two φ-connected probability distributions and , we can define the generalized parallel transport between the tangent spaces given by
where with . A next step is to find a generalized parallel transport that is dual to . Another goal is to investigate whether the generalized Rényi divergence defined in [36] for two φ-connected probability distributions can be related to the statistical divergence associated with .
Acknowledgments
The authors would like to thank CAPES and CNPq (Procs. 309055/2014-8 and 408609/2016-8) for partial funding of this research.
Author Contributions
All authors contributed equally to the design of the research. The research was carried out by all authors. Rui F. Vigelis and Charles C. Cavalcante gave the central idea of the paper and managed the organization of it. Luiza H.F. de Andrade wrote the paper. All the authors read and approved the final manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Amari, S.-I. Information geometry on hierarchy of probability distributions. IEEE Trans. Inf. Theory 2001, 47, 1701–1711.
- Calin, O.; Udrişte, C. Geometric Modeling in Probability and Statistics; Springer: Cham, Switzerland, 2014.
- Amari, S.-I. Information geometry and its applications. In Applied Mathematical Sciences; Springer: Tokyo, Japan, 2016; Volume 194.
- Amari, S.-I. Differential Geometry of Curved Exponential Families-Curvatures and Information Loss. Ann. Stat. 1982, 10, 357–385.
- Amari, S.-I. Differential-Geometrical Methods in Statistics; Lecture Notes in Statistics; Springer: New York, NY, USA, 1985; Volume 28.
- Amari, S.-I.; Nagaoka, H. Methods of information geometry. In Translations of Mathematical Monographs; American Mathematical Society, Providence, RI; Translated from the 1993 Japanese original by Daishi Harada; Oxford University Press: Oxford, UK, 2000; Volume 191.
- Amari, S.-I.; Cichocki, A. Information geometry of divergence functions. Bull. Pol. Acad. Sci. Tech. Sci. 2010, 58, 183–195.
- Amari, S.-I. α-Divergence Is Unique, Belonging to Both f-Divergence and Bregman Divergence Classes. IEEE Trans. Inf. Theory 2009, 55, 4925–4931.
- Zhang, J. Divergence Function, Duality, and Convex Analysis. Neural Comput. 2004, 16, 159–195.
- Nielsen, F.; Nock, R. On w-mixtures: Finite convex combinations of prescribed component distributions. arXiv, 2017; arXiv:1708.00568v1.
- Amari, S.-I.; Ohara, A.; Matsuzoe, H. Geometry of deformed exponential families: Invariant, dually-flat and conformal geometries. Phys. A Stat. Mech. Appl. 2012, 391, 4308–4319.
- Harsha, K.V.; Moosath, K.S.S. Dually flat geometries of the deformed exponential family. Phys. A Stat. Mech. Appl. 2015, 433, 136–147.
- Matsuzoe, H. Hessian structures on deformed exponential families and their conformal structures. Differ. Geom. Appl. 2014, 35, 323–333.
- Matsuzoe, H.; Wada, T. Deformed Algebras and Generalizations of Independence on Deformed Exponential Families. Entropy 2015, 17, 5729–5751.
- Giné, E.; Nickl, R. Mathematical Foundations of Infinite-Dimensional Statistical Models; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 2015.
- Townsend, J.; Solomon, B.; Smith, J. The perfect gestalt: Infinite dimensional Riemannian face spaces and other aspects of face perception. In Computational, Geometric and Process Perspectives on Facial Cognition: Contexts and Challenges; Wenger, M.J., Townsend, J.T., Eds.; Society for Mathematical Psychology: Washington, DC, USA, 2001; pp. 39–82.
- Trivellato, B. Deformed exponentials and applications to finance. Entropy 2013, 15, 3471–3489.
- Pistone, G.; Sempi, C. An infinite-dimensional geometric structure on the space of all the probability measures equivalent to a given one. Ann. Stat. 1995, 23, 1543–1561.
- Pistone, G.; Rogantin, M.P. The exponential statistical manifold: Mean parameters, orthogonality and space transformations. Bernoulli 1999, 5, 721–760.
- Gibilisco, P.; Pistone, G. Connections on Non-Parametric Statistical Manifolds by Orlicz Space Geometry. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1998, 1, 325–347.
- Grasselli, M.R. Dual connections in nonparametric classical information geometry. Ann. Inst. Statist. Math. 2010, 62, 873–896.
- Cena, A.; Pistone, G. Exponential statistical manifold. Ann. Inst. Statist. Math. 2007, 59, 27–56.
- Santacroce, M.; Siri, P.; Trivellato, B. New results on mixture and exponential models by Orlicz spaces. Bernoulli 2016, 22, 1431–1447.
- Santacroce, M.; Siri, P.; Trivellato, B. On Mixture and Exponential Connection by Open Arcs. In Geometric Science of Information; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 577–584.
- Pistone, G. Examples of the application of nonparametric information geometry to statistical physics. Entropy 2013, 15, 4042–4065.
- Pistone, G. kappa-exponential models from the geometrical viewpoint. Eur. Phys. J. B 2009, 70, 29–37.
- Kaniadakis, G. Non-linear kinetics underlying generalized statistics. Phys. A Stat. Mech. Appl. 2001, 296, 405–425.
- Vigelis, R.F.; Cavalcante, C.C. On ϕ-families of probability distributions. J. Theoret. Probab. 2013, 26, 870–884.
- Pistone, G. Nonparametric information geometry. In Geometric Science of Information; Lecture Notes in Comput. Sci.; Springer: Heidelberg, Germany, 2013; Volume 8085, pp. 5–36.
- Naudts, J. Generalised Thermostatistics; Springer-London, Ltd.: London, UK, 2011.
- Vigelis, R.F.; de Andrade, L.H.F.; Cavalcante, C.C. On the Existence of Paths Connecting Probability Distributions. In Proceedings of the Geometric Science of Information: Third International Conference, GSI 2017, Paris, France, 7–9 November 2017; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 801–808.
- Musielak, J. Orlicz Spaces and Modular Spaces; Lecture Notes in Mathematics; Springer: Berlin, Germany, 1983; Volume 1034.
- Vigelis, R.F.; Cavalcante, C.C. The Δ2-condition and ϕ-families of probability distributions. In Geometric Science of Information; Lecture Notes in Comput. Sci.; Springer: Heidelberg, Germany, 2013; Volume 8085, pp. 729–736.
- Hudzik, H.; Zbaszyniak, Z. Smoothness in Musielak-Orlicz spaces equipped with the Orlicz norm. Collect. Math. 1997, 48, 543–561.
- Vigelis, R.F.; Cavalcante, C.C. Smoothness of the Orlicz norm in Musielak-Orlicz function spaces. Math. Nachr. 2014, 287, 1025–1041.
- de Souza, D.C.; Vigelis, R.F.; Cavalcante, C.C. Geometry Induced by a Generalization of Rényi Divergence. Entropy 2016, 18, 407.
- De Andrade, L.H.F.; Vigelis, R.F.; Vieira, F.L.J.; Cavalcante, C.C. Normalization and ϕ-function: Definition and Consequences. In Proceedings of the Geometric Science of Information: Third International Conference, GSI 2017, Paris, France, 7–9 November 2017; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 231–238.
- Asplund, E.; Rockafellar, R.T. Gradients of convex functions. Trans. Am. Math. Soc. 1969, 139, 443–467.
- Barbu, V.; Precupanu, T. Convexity and Optimization in Banach Spaces, 4th ed.; Springer Monographs in Mathematics; Springer: Dordrecht, The Netherlands, 2012.
- Brøndsted, A.; Rockafellar, R.T. On the subdifferentiability of convex functions. Proc. Am. Math. Soc. 1965, 16, 605–611.
- Bauschke, H.H.; Borwein, J.M.; Combettes, P.L. Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces. Commun. Contemp. Math. 2001, 3, 615–647.
- Borwein, J.M.; Vanderwerff, J.D. Convex functions: Constructions, characterizations and counterexamples. In Encyclopedia of Mathematics and Its Applications; Cambridge University Press: Cambridge, UK, 2010; Volume 109.
- Vigelis, R.F. On Musielak-Orlicz Spaces and Applications to Information Geometry. Ph.D. Thesis, Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, Brazil, 2011.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).