
On Some Properties of Tsallis Hypoentropies and Hypodivergences

1. Department of Information Science, College of Humanities and Sciences, Nihon University, 3-25-40, Sakurajyousui, Setagaya-ku, Tokyo, 156-8550, Japan
2. Faculty of Engineering Sciences, LUMINA - University of South-East Europe, Şos. Colentina 64b, Bucharest, RO-021187, Romania
3. Mathematisch-Geographische Fakultät, Katholische Universität Eichstätt-Ingolstadt, 85071 Eichstätt, Germany
* Author to whom correspondence should be addressed.
Entropy 2014, 16(10), 5377-5399; https://doi.org/10.3390/e16105377
Received: 15 September 2014 / Accepted: 8 October 2014 / Published: 15 October 2014
(This article belongs to the Section Statistical Physics)

Abstract

Both the Kullback–Leibler and the Tsallis divergence have a strong limitation: if the value zero appears in the probability distributions (p1, · · ·, pn) and (q1, · · ·, qn), it must appear in the same positions for the sake of significance. In order to avoid this limitation in the framework of Shannon statistics, Ferreri introduced the hypoentropy in 1980, noting that "such conditions rarely occur in practice". The aim of the present paper is to extend Ferreri's hypoentropy to Tsallis statistics. We introduce the Tsallis hypoentropy and the Tsallis hypodivergence and describe their mathematical behavior. Fundamental properties, like nonnegativity, monotonicity, the chain rule and subadditivity, are established.
Keywords: mathematical inequality; Tsallis entropy; Tsallis hypoentropy; Tsallis hypodivergence; chain rule; subadditivity

1. Preliminaries

Throughout this paper, X, Y and Z denote discrete random variables taking on the values {x1, · · ·, x|X|}, {y1, · · ·, y|Y|} and {z1, · · ·, z|Z|}, respectively, where |A| denotes the number of the values of the discrete random variable A. We denote the discrete random variable following a uniform distribution by U. We set the probabilities as p(xi) ≡ Pr(X = xi), p(yj) ≡ Pr(Y = yj) and p(zk) ≡ Pr(Z = zk). If |U| = n, then p(uk) = 1/n for all k = 1, · · ·, n. In addition, we denote by p(xi, yj) = Pr(X = xi, Y = yj), p(xi, yj, zk) = Pr(X = xi, Y = yj, Z = zk) the joint probabilities, by p(xi|yj) = Pr(X = xi|Y = yj), p(xi|yj, zk) = Pr(X = xi|Y = yj, Z = zk) the conditional probabilities, and so on.
The notion of entropy was used in statistical thermodynamics by Boltzmann [1] in 1871 and Gibbs [2] in 1902, in order to quantify the diversity, uncertainty and randomness of isolated systems. Later, it was seen as a measure of “information, choice and uncertainty” in the theory of communication, when Shannon [3] defined it by:
$$
H(X) \equiv -\sum_{i=1}^{|X|} p(x_i)\log p(x_i).
$$
In what follows, we consider |X| = |Y| = |U| = n, unless otherwise specified.
Making use of the concavity of the logarithmic function, one can easily check that the equiprobable states maximize the entropy, that is:
$$
H(X) \le H(U) = \log n.
$$
The right-hand side term of this inequality has been known since 1928 as Hartley entropy [4].
For two random variables X and Y following distributions {p(xi)} and {p(yi)}, the Kullback–Leibler [5] discrimination function (divergence or relative entropy) is defined by:
$$
D(X\|Y) \equiv \sum_{i=1}^{n} p(x_i)\left(\log p(x_i) - \log p(y_i)\right) = -\sum_{i=1}^{n} p(x_i)\log\frac{p(y_i)}{p(x_i)}.
$$
(We note that the relative entropy is usually defined for two probability distributions P = {pi} and Q = {qi} as $D(P\|Q) \equiv -\sum_{i=1}^{n} p_i \log\frac{q_i}{p_i}$ in the standard notation of information theory. D(P||Q) is often rewritten as D(X||Y) for random variables X and Y following the distributions P and Q. Throughout this paper, we use the style of Equation (3) for relative entropies to unify the notation with simple descriptions.) Here, the conventions $a\cdot\log\frac{0}{a} = -\infty$ (a > 0) and $0\cdot\log\frac{b}{0} = 0$ (b ≥ 0) are used. (We also note that the convention is often given in the following way with the definition of D(X||Y). If there exists i such that p(xi) ≠ 0 = p(yi), then we define D(X||Y) ≡ +∞ (in this case, D(X||Y) is no longer significant as an information measure). Otherwise, D(X||Y) is defined by Equation (3) with the convention $0\cdot\log\frac{0}{0} = 0$. This fact has been mentioned in the abstract of the paper.) In what follows, we use such conventions in the definitions of the entropies and divergences. However, we do not state them repeatedly.
It holds that:
$$
H(U) - H(X) = D(X\|U).
$$
Moreover, the cross-entropy (or inaccuracy):
$$
H^{(cross)}(X, Y) \equiv -\sum_{i=1}^{n} p(x_i)\log p(y_i)
$$
satisfies the identity:
$$
D(X\|Y) = H^{(cross)}(X, Y) - H(X).
$$
Many extensions of Shannon entropy have been studied. The Rényi entropy [6] and the α-entropy [7] are famous. The mathematical results obtained up to the 1970s are well presented in the book [8]. In the present paper, we focus on the hypoentropy introduced by Carlo Ferreri and the Tsallis entropy introduced by Constantino Tsallis.
The hypoentropy at the level λ (λ-entropy) was introduced in 1980 by Ferreri [9] as an alternative measure of information in the following form:
$$
F_\lambda(X) \equiv \frac{1}{\lambda}(\lambda+1)\log(\lambda+1) - \frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(x_i))
$$
for λ > 0. According to Ferreri [9], the parameter λ can be interpreted as a measure of the information inaccuracy of an economic forecast. As we will show in Section 2, Fλ(X) ≤ H(X); the name hypoentropy comes from this property.
On the other hand, Tsallis introduced a one-parameter extension of the entropy in 1988 in [10], for handling systems that appear to deviate from standard statistical distributions. It plays an important role in the nonextensive statistical mechanics of complex systems, being defined as:
$$
T_q(X) \equiv -\sum_{i=1}^{n} p(x_i)^q \ln_q p(x_i) = \sum_{i=1}^{n} p(x_i)\ln_q\frac{1}{p(x_i)} \qquad (q \ge 0,\ q \ne 1).
$$
Here, the q-logarithmic function for x > 0 is defined by $\ln_q(x) \equiv \frac{x^{1-q}-1}{1-q}$, which converges to the usual logarithmic function log(x) in the limit q → 1. The Tsallis divergence (relative entropy) [11] is given by:
$$
S_q(X\|Y) \equiv \sum_{i=1}^{n} p(x_i)^q\left(\ln_q p(x_i) - \ln_q p(y_i)\right) = -\sum_{i=1}^{n} p(x_i)\ln_q\frac{p(y_i)}{p(x_i)}.
$$
Note that some important properties of the Tsallis relative entropy were given in the papers [12–14].
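Since all of the quantities above are finite sums, they are straightforward to check numerically. The following Python sketch is our own illustration (the function and variable names are not from the paper); it implements the q-logarithm, the Tsallis entropy and the Tsallis divergence, and verifies that the limit q → 1 recovers the Shannon entropy and the Kullback–Leibler divergence.

```python
import math

def ln_q(x, q):
    """q-logarithm: ln_q(x) = (x^(1-q) - 1)/(1 - q); tends to log(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def tsallis_entropy(p, q):
    """T_q(X) = sum_i p_i * ln_q(1/p_i)."""
    return sum(pi * ln_q(1.0 / pi, q) for pi in p if pi > 0)

def tsallis_divergence(p, r, q):
    """S_q(X||Y) = -sum_i p_i * ln_q(r_i/p_i)."""
    return -sum(pi * ln_q(ri / pi, q) for pi, ri in zip(p, r) if pi > 0)

p = [0.5, 0.3, 0.2]
r = [0.4, 0.4, 0.2]

shannon = -sum(pi * math.log(pi) for pi in p)
kl = sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))

assert abs(tsallis_entropy(p, 1.000001) - shannon) < 1e-4    # q -> 1 limit
assert abs(tsallis_divergence(p, r, 1.000001) - kl) < 1e-4   # q -> 1 limit
assert tsallis_divergence(p, r, 0.5) >= 0                     # nonnegativity
assert abs(tsallis_divergence(p, p, 0.5)) < 1e-12             # zero iff p = r
```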

2. Hypoentropy and Hypodivergence

For nonnegative real numbers ai and bi (i = 1, · · ·, n), we define the generalized relative entropy (for incomplete probability distributions):
$$
D^{(gen)}(a_1, \cdots, a_n \| b_1, \cdots, b_n) \equiv \sum_{i=1}^{n} a_i \log\frac{a_i}{b_i}.
$$
Then, we have the so-called “log-sum” inequality:
$$
\sum_{i=1}^{n} a_i \log\frac{a_i}{b_i} \ge \left(\sum_{i=1}^{n} a_i\right)\log\frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i},
$$
with equality if and only if $\frac{a_i}{b_i} = \text{const.}$ for all i = 1, · · ·, n.
If we impose the condition:
$$
\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i = 1,
$$
then D(gen)(a1, · · ·, an||b1, · · ·, bn) is just the relative entropy,
$$
D(a_1, \cdots, a_n \| b_1, \cdots, b_n) \equiv \sum_{i=1}^{n} a_i \log\frac{a_i}{b_i}.
$$
We put $a_i = \frac{1}{\lambda} + p(x_i)$ and $b_i = \frac{1}{\lambda} + p(y_i)$ with λ > 0 and $\sum_{i=1}^{n} p(x_i) = \sum_{i=1}^{n} p(y_i) = 1$, p(xi) ≥ 0, p(yi) ≥ 0. Then, we find that $D^{(gen)}(a_1, \cdots, a_n \| b_1, \cdots, b_n)$ is equal to the hypodivergence (λ-divergence) introduced by Ferreri in [9],
$$
K_\lambda(X\|Y) \equiv \frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\log\frac{1+\lambda p(x_i)}{1+\lambda p(y_i)}.
$$
Clearly, we have:
$$
\lim_{\lambda\to\infty} K_\lambda(X\|Y) = D(X\|Y).
$$
Using the “log-sum” inequality, we have the nonnegativity:
$$
K_\lambda(X\|Y) \ge 0,
$$
with equality if and only if p(xi) = p(yi) for all i = 1, · · ·, n.
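As a numerical illustration (our own sketch, not part of the original paper), the hypodivergence and its two key features, nonnegativity and the λ → ∞ limit, can be checked directly. The last assertion also shows that Kλ stays finite when a zero appears in the second distribution, which is exactly the limitation of D(X||Y) discussed in the abstract.

```python
import math

def hypodivergence(p, r, lam):
    """Ferreri's K_lambda(X||Y) = (1/lam) sum_i (1 + lam p_i) log((1 + lam p_i)/(1 + lam r_i))."""
    return sum((1 + lam * pi) * math.log((1 + lam * pi) / (1 + lam * ri))
               for pi, ri in zip(p, r)) / lam

def kl_divergence(p, r):
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r) if pi > 0)

p = [0.7, 0.2, 0.1]
r = [0.5, 0.25, 0.25]

assert hypodivergence(p, r, 2.0) > 0                   # nonnegativity for p != r
assert abs(hypodivergence(p, p, 2.0)) < 1e-15          # equality iff p = r
# K_lambda tends to the Kullback-Leibler divergence as lambda grows
assert abs(hypodivergence(p, r, 1e8) - kl_divergence(p, r)) < 1e-4
# K_lambda is finite even when some r_i = 0, unlike D(X||Y)
assert math.isfinite(hypodivergence([0.5, 0.5], [1.0, 0.0], 3.0))
```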
For the hypoentropy Fλ(X) defined in Equation (7), we first show the fundamental relations. To do so, we prepare the following lemma.

Lemma 1

For any a > 0 and 0 ≤ x ≤ 1, we have
$$
x(1+a)\log(1+a) \ge (1+ax)\log(1+ax).
$$

Proof

We set f(x) ≡ x(1 + a) log(1 + a) − (1 + ax) log(1 + ax). For any a > 0, we then have $\frac{d^2 f(x)}{dx^2} = -\frac{a^2}{1+ax} < 0$ and f(0) = f(1) = 0. Thus, f is concave and vanishes at the endpoints of [0, 1], so f(x) ≥ 0 there, and we have the inequality.

Proposition 1

For λ > 0, we have the following inequalities:
$$
0 \le F_\lambda(X) \le F_\lambda(U).
$$
The equality in the first inequality holds if and only if p(xj) = 1 for some j (then p(xi) = 0 for all i ≠ j). The equality in the second inequality holds if and only if p(xi) = 1/n for all i = 1, · · ·, n.

Proof

From the nonnegativity of the hypodivergence Equation (15), we get:
$$
0 \le K_\lambda(X\|U) = \frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(x_i)) - \frac{1}{\lambda}(n+\lambda)\log\left(1+\frac{\lambda}{n}\right).
$$
Thus, we have:
$$
-\frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(x_i)) \le -\frac{1}{\lambda}(n+\lambda)\log\left(1+\frac{\lambda}{n}\right).
$$
Adding $\frac{1}{\lambda}(\lambda+1)\log(\lambda+1)$ to both sides, we have:
F λ ( X ) F λ ( U ) ,
with equality if and only if p(xi) = 1/n for all i = 1, · · ·, n.
For the first inequality, it is sufficient to prove:
$$
(1+\lambda)\log(1+\lambda) - \sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(x_i)) \ge 0.
$$
Since $\sum_{i=1}^{n} p(x_i) = 1$, the above inequality can be written as:
$$
\sum_{i=1}^{n}\left\{p(x_i)(1+\lambda)\log(1+\lambda) - (1+\lambda p(x_i))\log(1+\lambda p(x_i))\right\} \ge 0,
$$
so that we have only to prove:
$$
p(x_i)(1+\lambda)\log(1+\lambda) - (1+\lambda p(x_i))\log(1+\lambda p(x_i)) \ge 0,
$$
for any λ > 0 and 0 ≤ p(xi) ≤ 1. Lemma 1 shows this inequality and the equality condition.
It is a known fact [9] that Fλ(X) is monotonically increasing as a function of λ and:
$$
\lim_{\lambda\to\infty} F_\lambda(X) = H(X),
$$
whence its name, as we noted in the Introduction. Thus, the hypoentropy appears as a generalization of Shannon's entropy. One can see that the hypoentropy, like the entropy, equals zero in the case of certainty (i.e., for a so-called pure state, when all probabilities vanish but one).
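The properties collected so far, 0 ≤ Fλ(X) ≤ Fλ(U), monotonicity in λ, vanishing on pure states and the limit Fλ(X) → H(X), can all be verified numerically. The sketch below is our own illustration (names are not from the paper).

```python
import math

def hypoentropy(p, lam):
    """Ferreri's F_lambda(X) = (1/lam)(1+lam)log(1+lam) - (1/lam) sum_i (1+lam p_i)log(1+lam p_i)."""
    return ((1 + lam) * math.log(1 + lam)
            - sum((1 + lam * pi) * math.log(1 + lam * pi) for pi in p)) / lam

p = [0.6, 0.3, 0.1]
u = [1/3, 1/3, 1/3]
shannon = -sum(pi * math.log(pi) for pi in p)

assert 0 <= hypoentropy(p, 5.0) <= hypoentropy(u, 5.0)       # Proposition 1
assert abs(hypoentropy([1.0, 0.0, 0.0], 5.0)) < 1e-15        # zero on pure states
assert hypoentropy(p, 1.0) < hypoentropy(p, 10.0) < shannon  # increasing in lambda, below H(X)
assert abs(hypoentropy(p, 1e9) - shannon) < 1e-5             # F_lambda -> H(X)
```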
It also holds that:
$$
F_\lambda(U) - F_\lambda(X) = K_\lambda(X\|U).
$$
It is of some interest for the reader to look at the hypoentropy that arises for equiprobable states,
$$
F_\lambda(U) = \left(1+\frac{1}{\lambda}\right)\log(1+\lambda) - \left(1+\frac{n}{\lambda}\right)\log\left(1+\frac{\lambda}{n}\right).
$$
Seen as a function of two variables, n and λ, it increases in each variable [9]. Since:
$$
\lim_{\lambda\to\infty} F_\lambda(U) = \log n,
$$
we shall call it the Hartley hypoentropy. (Throughout the paper, we add the name Hartley to the name of mathematical objects whenever they are considered for the uniform distribution. In the same way, we proceed with the name Tsallis, which we add to the name of some mathematical objects that we define, to emphasize that they are used in the framework of Tsallis statistics. This means that we will have Tsallis hypoentropies, Tsallis hypodivergences, and so on.) We have the cross-hypoentropy:
$$
F_\lambda^{(cross)}(X, Y) \equiv \left(1+\frac{1}{\lambda}\right)\log(1+\lambda) - \frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(y_i)).
$$
It holds:
$$
K_\lambda(X\|Y) = F_\lambda^{(cross)}(X, Y) - F_\lambda(X) \ge 0;
$$
therefore, we have $F_\lambda^{(cross)}(X, Y) \ge F_\lambda(X)$.
We can show an upper bound for Fλ(X) as a direct consequence.

Proposition 2

The following inequality holds.
$$
F_\lambda(X) \le (1 - p_{\max})\log(1+\lambda),
$$
for all λ > 0, where $p_{\max} \equiv \max\{p(x_1), \cdots, p(x_n)\}$.

Proof

In the inequality (30), if for a fixed k, one takes the probability of the k-th component of Y to be p(yk) = 1, then:
$$
-\sum_{i=1}^{n}(1+\lambda p(x_i))\log(1+\lambda p(x_i)) \le -(1+\lambda p(x_k))\log(1+\lambda).
$$
This implies that:
$$
F_\lambda(X) \le \left(1+\frac{1}{\lambda}\right)\log(1+\lambda) - \frac{1}{\lambda}(1+\lambda p(x_k))\log(1+\lambda) = (1-p(x_k))\log(1+\lambda).
$$
Since k is arbitrarily fixed, the conclusion follows.
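Proposition 2 is easy to test numerically. A small sketch of our own (the helper name is ours), checking the bound over several values of λ:

```python
import math

def hypoentropy(p, lam):
    # Ferreri's F_lambda(X), as defined in Section 1
    return ((1 + lam) * math.log(1 + lam)
            - sum((1 + lam * pi) * math.log(1 + lam * pi) for pi in p)) / lam

# Proposition 2: F_lambda(X) <= (1 - p_max) * log(1 + lam)
p = [0.6, 0.3, 0.1]
for lam in (0.1, 1.0, 5.0, 50.0, 1000.0):
    bound = (1 - max(p)) * math.log(1 + lam)
    assert hypoentropy(p, lam) <= bound
```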

Remark 1

It is of interest to notice now that, for the particular case X = U, we have:
$$
F_\lambda(U) \le \left(1-\frac{1}{n}\right)\log(1+\lambda).
$$
We add here one more detail: the inequality (34) can be verified using Bernoulli’s inequality.

3. Tsallis Hypoentropy and Hypodivergence

Now, we turn our attention to the Tsallis statistics. We extend the definition of hypodivergences as follows:

Definition 1

The Tsallis hypodivergence (q-hypodivergence, Tsallis relative hypoentropy) is defined by:
$$
D_{\lambda,q}(X\|Y) \equiv -\frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}
$$
for λ > 0 and q ≥ 0.
Then, we have the relation:
$$
\lim_{\lambda\to\infty} D_{\lambda,q}(X\|Y) = S_q(X\|Y),
$$
which is the Tsallis divergence, and:
$$
\lim_{q\to1} D_{\lambda,q}(X\|Y) = K_\lambda(X\|Y),
$$
which is the hypodivergence.

Remark 2

This definition can be also obtained from the generalized Tsallis relative entropy (for incomplete probability distributions {a1, · · ·, an} and {b1, · · ·, bn}):
$$
D_q^{(gen)}(a_1, \cdots, a_n \| b_1, \cdots, b_n) \equiv -\sum_{i=1}^{n} a_i \ln_q\frac{b_i}{a_i},
$$
by putting $a_i = \frac{1}{\lambda} + p(x_i)$ and $b_i = \frac{1}{\lambda} + p(y_i)$ for λ > 0.
The generalized relative entropy (10) and the generalized Tsallis relative entropy (38) can be written as the generalized f-divergence (for incomplete probability distributions):
$$
D_f^{(gen)}(a_1, \cdots, a_n \| b_1, \cdots, b_n) \equiv \sum_{i=1}^{n} a_i f\!\left(\frac{b_i}{a_i}\right)
$$
for a convex function f on (0, ∞) and $a_i \ge 0$, $b_i \ge 0$ (i = 1, · · ·, n). See [15] and [16] for details.
By the concavity of the q-logarithmic function, we have the following “lnq-sum” inequality:
$$
-\sum_{i=1}^{n} a_i \ln_q\frac{b_i}{a_i} \ge -\left(\sum_{i=1}^{n} a_i\right)\ln_q\left(\frac{\sum_{i=1}^{n} b_i}{\sum_{i=1}^{n} a_i}\right),
$$
with equality if and only if $\frac{a_i}{b_i} = \text{const.}$ for all i = 1, · · ·, n. Using the "lnq-sum" inequality, we have the nonnegativity of the Tsallis hypodivergence:
$$
D_{\lambda,q}(X\|Y) \ge 0,
$$
with equality if and only if p(xi) = p(yi) for all i = 1, · · ·, n (the equality condition comes from the equality condition of the "lnq-sum" inequality and the condition $\sum_{i=1}^{n} p(x_i) = \sum_{i=1}^{n} p(y_i) = 1$).

Definition 2

For λ > 0 and q ≥ 0, the Tsallis hypoentropy (q-hypoentropy) is defined by:
$$
H_{\lambda,q}(X) \equiv \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)}\right\},
$$
where the function h(λ, q) > 0 satisfies two conditions,
$$
\lim_{q\to1} h(\lambda, q) = 1
$$
and:
$$
\lim_{\lambda\to\infty} \frac{h(\lambda, q)}{\lambda^{1-q}} = 1.
$$
These conditions are equivalent to:
$$
\lim_{q\to1} H_{\lambda,q}(X) = F_\lambda(X) \quad \text{(the hypoentropy)}
$$
and, respectively,
$$
\lim_{\lambda\to\infty} H_{\lambda,q}(X) = T_q(X) \quad \text{(the Tsallis entropy)}.
$$
Some interesting examples are h(λ, q) = λ1−q and h(λ, q) = (1+λ)1−q.
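The two limiting conditions (43) and (44) can be checked numerically for both example choices of h(λ, q). The following sketch is our own illustration, not part of the paper:

```python
import math

def ln_q(x, q):
    return math.log(x) if abs(q - 1.0) < 1e-12 else (x ** (1 - q) - 1) / (1 - q)

def tsallis_hypoentropy(p, lam, q, h):
    """H_{lam,q}(X) with a caller-supplied factor h(lam, q)."""
    s = -(1 + lam) * ln_q(1 / (1 + lam), q)
    s += sum((1 + lam * pi) * ln_q(1 / (1 + lam * pi), q) for pi in p)
    return h(lam, q) * s / lam

def ferreri_hypoentropy(p, lam):
    return ((1 + lam) * math.log(1 + lam)
            - sum((1 + lam * pi) * math.log(1 + lam * pi) for pi in p)) / lam

def tsallis_entropy(p, q):
    return sum(pi * ln_q(1 / pi, q) for pi in p if pi > 0)

h1 = lambda lam, q: lam ** (1 - q)        # first example of h
h2 = lambda lam, q: (1 + lam) ** (1 - q)  # second example of h

p = [0.5, 0.3, 0.2]
# q -> 1 recovers the Ferreri hypoentropy (for either h)
assert abs(tsallis_hypoentropy(p, 2.0, 1.000001, h1) - ferreri_hypoentropy(p, 2.0)) < 1e-4
assert abs(tsallis_hypoentropy(p, 2.0, 1.000001, h2) - ferreri_hypoentropy(p, 2.0)) < 1e-4
# lambda -> infinity recovers the Tsallis entropy (for either h)
assert abs(tsallis_hypoentropy(p, 1e6, 1.5, h1) - tsallis_entropy(p, 1.5)) < 1e-4
assert abs(tsallis_hypoentropy(p, 1e6, 1.5, h2) - tsallis_entropy(p, 1.5)) < 1e-4
```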

Remark 3

It is worthwhile to discuss the Tsallis cross-hypoentropy. The first candidate for the definition of the Tsallis cross-hypoentropy is:
$$
H_{\lambda,q}^{(cross)}(X,Y) \equiv \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} - \sum_{i=1}^{n}(1+\lambda p(x_i))^q \ln_q(1+\lambda p(y_i))\right\},
$$
which recovers the cross-hypoentropy defined in Equation (29) in the limit q → 1. Then, we have:
$$
\begin{aligned}
H_{\lambda,q}^{(cross)}(X,Y) - H_{\lambda,q}(X) &= \frac{h(\lambda,q)}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))^q\left\{\ln_q(1+\lambda p(x_i)) - \ln_q(1+\lambda p(y_i))\right\} \\
&= -\frac{h(\lambda,q)}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)} \\
&= h(\lambda,q)\, D_{\lambda,q}(X\|Y) \ge 0.
\end{aligned}
$$
The last inequality is due to the nonnegativity given in Equation (41). Since limq→1 h(λ, q) = 1 by the definition of the Tsallis hypoentropy (see Equation (43)), the above relation recovers the inequality (30) in the limit q → 1.
The second candidate for the definition of the Tsallis cross-hypoentropy is:
$$
\tilde{H}_{\lambda,q}^{(cross)}(X,Y) \equiv \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(y_i)}\right\},
$$
which also recovers the cross-hypoentropy defined in Equation (29) in the limit q → 1. Then, we have:
$$
\tilde{H}_{\lambda,q}^{(cross)}(X,Y) - H_{\lambda,q}(X) = -\frac{h(\lambda,q)}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\left\{\ln_q\frac{1}{1+\lambda p(x_i)} - \ln_q\frac{1}{1+\lambda p(y_i)}\right\} = h(\lambda,q)\,\tilde{D}_{\lambda,q}(X\|Y),
$$
where the alternative Tsallis hypodivergence has to be defined by:
$$
\tilde{D}_{\lambda,q}(X\|Y) \equiv -\frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\left\{\ln_q\frac{1}{1+\lambda p(x_i)} - \ln_q\frac{1}{1+\lambda p(y_i)}\right\}.
$$
We have D̃λ,q(X||Y) ≠ Dλ,q(X||Y) and lim_{q→1} D̃λ,q(X||Y) = Kλ(X||Y). However, the nonnegativity of D̃λ,q(X||Y) (q ≥ 0) does not hold in general, as the following counterexamples show. Take λ = 1, n = 2, p(x1) = 0.9, p(y1) = 0.8, q = 0.5; then D̃λ,q(X||Y) ≃ −0.0137586. In addition, take λ = 1, n = 3, p(x1) = 0.3, p(x2) = 0.4, p(y1) = 0.2, p(y2) = 0.7 and q = 1.9; then D̃λ,q(X||Y) ≃ −0.0195899. Therefore, we may conclude that Equation (47) is to be given preference over Equation (48).
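The two counterexamples can be reproduced numerically. The following sketch is our own; it evaluates Equation (48) at the stated parameters (in the second example, the remaining masses are forced to be p(x3) = 0.3 and p(y3) = 0.1, since each distribution sums to one).

```python
import math

def ln_q(x, q):
    return math.log(x) if abs(q - 1.0) < 1e-12 else (x ** (1 - q) - 1) / (1 - q)

def d_tilde(p, r, lam, q):
    """Alternative Tsallis hypodivergence (the second candidate above)."""
    return -sum((1 + lam * pi) * (ln_q(1 / (1 + lam * pi), q) - ln_q(1 / (1 + lam * ri), q))
                for pi, ri in zip(p, r)) / lam

# first counterexample: lambda = 1, n = 2, q = 0.5
v1 = d_tilde([0.9, 0.1], [0.8, 0.2], 1.0, 0.5)
# second counterexample: lambda = 1, n = 3, q = 1.9
v2 = d_tilde([0.3, 0.4, 0.3], [0.2, 0.7, 0.1], 1.0, 1.9)

assert v1 < 0 and abs(v1 - (-0.0137586)) < 1e-5
assert v2 < 0 and abs(v2 - (-0.0195899)) < 1e-5
```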
We turn to show the nonnegativity and maximality for the Tsallis hypoentropy.

Lemma 2

For any a > 0, q ≥ 0 and 0 ≤ x ≤ 1, we have:
$$
x(1+a)\ln_q\frac{1}{1+a} \le (1+ax)\ln_q\frac{1}{1+ax}.
$$

Proof

We set $g(x) \equiv x(1+a)\ln_q\frac{1}{1+a} - (1+ax)\ln_q\frac{1}{1+ax}$. For any a > 0 and q ≥ 0, we then have $\frac{d^2 g(x)}{dx^2} = qa^2\left(\frac{1}{1+ax}\right)^{2-q} \ge 0$ and g(0) = g(1) = 0. Thus, g is convex and vanishes at the endpoints of [0, 1], so g(x) ≤ 0 there, and we have the inequality.

Proposition 3

For λ > 0, q ≥ 0 and h(λ, q) > 0 satisfying (43) and (44), we have the following inequalities:
$$
0 \le H_{\lambda,q}(X) \le H_{\lambda,q}(U).
$$
The equality in the first inequality holds if and only if p(xj) = 1 for some j (then p(xi) = 0 for all ij). The equality in the second inequality holds if and only if p(xi) = 1/n for all i = 1, · · ·, n.

Proof

In a similar way to the proof of Proposition 1, for the first inequality, it is sufficient to prove:
$$
-\sum_{i=1}^{n}\left\{p(x_i)(1+\lambda)\ln_q\frac{1}{1+\lambda} - (1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)}\right\} \ge 0,
$$
so that we have only to prove:
$$
p(x_i)(1+\lambda)\ln_q\frac{1}{1+\lambda} \le (1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)}
$$
for any λ > 0, q ≥ 0 and 0 ≤ p(xi) 1. Lemma 2 shows this inequality with the equality condition.
The second inequality is proven by the use of the nonnegativity of the Tsallis hypodivergence in the following way:
$$
0 \le D_{\lambda,q}(X\|U) = -\frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1+\frac{\lambda}{n}}{1+\lambda p(x_i)},
$$
which implies (by the use of the formula $\ln_q\frac{b}{a} = b^{1-q}\ln_q\frac{1}{a} + \ln_q b$):
$$
\frac{1}{\lambda}\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} \le \frac{n+\lambda}{\lambda}\ln_q\frac{n}{n+\lambda}.
$$
The equality condition of the second inequality follows from the equality condition of the nonnegativity of the Tsallis hypodivergence (41).
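Proposition 3 can be probed numerically over a grid of parameters; the sketch below (our own code, not from the paper) uses the choice h(λ, q) = λ^(1−q).

```python
import math

def ln_q(x, q):
    return math.log(x) if abs(q - 1.0) < 1e-12 else (x ** (1 - q) - 1) / (1 - q)

def tsallis_hypoentropy(p, lam, q, h):
    s = -(1 + lam) * ln_q(1 / (1 + lam), q)
    s += sum((1 + lam * pi) * ln_q(1 / (1 + lam * pi), q) for pi in p)
    return h(lam, q) * s / lam

h1 = lambda lam, q: lam ** (1 - q)

p = [0.6, 0.3, 0.1]
u = [1/3, 1/3, 1/3]
# Proposition 3: 0 <= H_{lam,q}(X) <= H_{lam,q}(U) for lam > 0 and q >= 0
for lam in (0.5, 2.0, 20.0):
    for q in (0.3, 1.5, 2.5):
        assert 0 <= tsallis_hypoentropy(p, lam, q, h1) <= tsallis_hypoentropy(u, lam, q, h1)
# equality in the first inequality holds on pure states
assert abs(tsallis_hypoentropy([1.0, 0.0, 0.0], 2.0, 1.5, h1)) < 1e-12
```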
We may call:
$$
H_{\lambda,q}(U) = \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + (n+\lambda)\ln_q\frac{1}{1+\frac{\lambda}{n}}\right\}
$$
the Hartley–Tsallis hypoentropy. We study the monotonicity for n or λ of the Hartley–Tsallis hypoentropy Hλ,q(U) and the Tsallis hypoentropy Hλ,q(X). (Throughout the present paper, the term “monotonicity” means the monotone increasing/decreasing as a function of the parameter λ. We emphasize that it does not mean the monotonicity for some stochastic maps.)

Lemma 3

The function:
$$
f(x) = (x+1)\ln_q\frac{x}{x+1} \qquad (x > 0)
$$
is monotonically increasing in x, for any q ≥ 0.

Proof

By direct calculations, we have:
$$
\frac{df(x)}{dx} = \frac{1}{1-q}\left\{\left(1+\frac{1}{x}\right)^{q-1}\left(1+\frac{1-q}{x}\right) - 1\right\}
$$
and:
$$
\frac{d^2 f(x)}{dx^2} = -\frac{q}{x^3}\left(1+\frac{1}{x}\right)^{q-2} \le 0.
$$
Since $\frac{d^2 f(x)}{dx^2} \le 0$ and $\lim_{x\to\infty}\frac{df(x)}{dx} = 0$, the derivative $\frac{df(x)}{dx}$ decreases to zero, and hence $\frac{df(x)}{dx} \ge 0$.

Proposition 4

The Hartley–Tsallis hypoentropy:
$$
H_{\lambda,q}(U) = \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + (n+\lambda)\ln_q\frac{1}{1+\frac{\lambda}{n}}\right\}
$$
is a monotonically increasing function of n, for any λ > 0 and q ≥ 0.

Proof

Note that:
$$
H_{\lambda,q}(U) = h(\lambda,q)\left\{-\left(1+\frac{1}{\lambda}\right)\ln_q\frac{1}{1+\lambda} + \left(1+\frac{n}{\lambda}\right)\ln_q\frac{1}{1+\frac{\lambda}{n}}\right\}.
$$
Putting $x = \frac{n}{\lambda} > 0$ for fixed λ > 0 in Lemma 3, we see that the function
$$
g(n) = \left(1+\frac{n}{\lambda}\right)\ln_q\frac{1}{1+\frac{\lambda}{n}}
$$
is a monotonically increasing function of n. Thus, we have the present proposition.

Remark 4

We have the relation:
$$
\lim_{n\to\infty} H_{\lambda,q}(U) = h(\lambda,q)\left\{-\left(1+\frac{1}{\lambda}\right)\ln_q\frac{1}{1+\lambda} - 1\right\}.
$$
We notice from the condition (44) that:
$$
\lim_{\lambda\to\infty}\left(\lim_{n\to\infty} H_{\lambda,q}(U)\right) = \lim_{\lambda\to\infty}\frac{h(\lambda,q)}{\lambda^{1-q}}\cdot\lambda^{1-q}\left\{-1-\left(1+\frac{1}{\lambda}\right)\ln_q\frac{1}{1+\lambda}\right\} = \frac{1}{1-q}\lim_{\lambda\to\infty}\frac{1+q\lambda-(1+\lambda)^q}{\lambda^q} = \begin{cases} 0 & (q=0) \\ \infty & (0<q<1) \\ \frac{1}{q-1} & (q>1), \end{cases}
$$
and conclude that the result is independent of the choice of h(λ, q).
For the limit λ → 0, we consider two cases.
(1)
In the case of h(λ, q) = λ1−q, we have:
$$
\lim_{\lambda\to0}\left(\lim_{n\to\infty} H_{\lambda,q}(U)\right) = \lim_{\lambda\to0}\lambda^{1-q}\left\{-1-\left(1+\frac{1}{\lambda}\right)\ln_q\frac{1}{1+\lambda}\right\} = \frac{1}{1-q}\lim_{\lambda\to0}\frac{1+q\lambda-(1+\lambda)^q}{\lambda^q} = \begin{cases} \infty & (q>2) \\ 1 & (q=2) \\ 0 & (0\le q<2), \end{cases}
$$
as one obtains using l’Hôpital’s rule.
(2)
In the case of h(λ, q) = (1 + λ)1−q, we have for all q ≥ 0:
$$
\begin{aligned}
\lim_{\lambda\to0}\left(\lim_{n\to\infty} H_{\lambda,q}(U)\right) &= \lim_{\lambda\to0}(1+\lambda)^{1-q}\left\{-1-\left(1+\frac{1}{\lambda}\right)\ln_q\frac{1}{1+\lambda}\right\} = \frac{1}{1-q}\lim_{\lambda\to0}\frac{1+q\lambda-(1+\lambda)^q}{\lambda(1+\lambda)^{q-1}} \\
&= \frac{q}{1-q}\lim_{\lambda\to0}\frac{1-(1+\lambda)^{q-1}}{(1+\lambda)^{q-1}+(q-1)\lambda(1+\lambda)^{q-2}} = 0.
\end{aligned}
$$
These results mean that our Hartley–Tsallis hypoentropy with h(λ, q) = λ1−q or (1 + λ)1−q has the same limits as the Hartley hypoentropy, Fλ(U) (see also [9]), in the case 0 < q < 1.
We study here the monotonicity of Hλ,q(X) for h(λ, q) = (1 + λ)1−q. The other case h(λ, q) = λ1−q is studied in the next section; see Lemma 5.

Proposition 5

We assume h(λ, q) = (1 + λ)1−q. Then, Hλ,q(X) is a monotone increasing function of λ > 0 when 0 ≤ q ≤ 2.

Proof

Note that:
$$
H_{\lambda,q}(X) = \sum_{i=1}^{n} Sn_{\lambda,q}(p(x_i)),
$$
where:
$$
Sn_{\lambda,q}(x) \equiv \frac{(1+\lambda)^{1-q}}{\lambda(1-q)}\left\{(1+\lambda x)^q - (1+\lambda)^q x + x - 1\right\}
$$
is defined on 0 ≤ x ≤ 1, 0 ≤ q ≤ 2 and λ > 0. Then, we have:
$$
\frac{dH_{\lambda,q}(X)}{d\lambda} = \sum_{i=1}^{n}\frac{dSn_{\lambda,q}(p(x_i))}{d\lambda} = \sum_{i=1}^{n} s_{\lambda,q}(p(x_i)),
$$
where:
$$
s_{\lambda,q}(x) \equiv \frac{q\lambda(1-x)\left\{1-(1+\lambda x)^{q-1}\right\} + 1 - x + (1+\lambda)^q x - (1+\lambda x)^q}{(1-q)\lambda^2(1+\lambda)^q}
$$
is defined on 0 ≤ x ≤ 1, 0 ≤ q ≤ 2 and λ > 0. By some computations, we have:
$$
\frac{d^2 s_{\lambda,q}(x)}{dx^2} = -\frac{q(1+\lambda x)^{q-3}\left[1+\lambda\left\{(x-1)(q-1)+1\right\}\right]}{(1+\lambda)^q} \le 0,
$$
since (x − 1)(q − 1) + 1 ≥ 0 for 0 ≤ x ≤ 1 and 0 ≤ q ≤ 2. We easily find sλ,q(0) = sλ,q(1) = 0. Thus, sλ,q is concave and vanishes at the endpoints, so sλ,q(x) ≥ 0 for 0 ≤ x ≤ 1, 0 ≤ q ≤ 2 and λ > 0. Therefore, we have $\frac{dH_{\lambda,q}(X)}{d\lambda} \ge 0$ for 0 ≤ q ≤ 2 and λ > 0.
This result agrees with the known fact that the usual (Ferreri) hypoentropy is increasing as a function of λ.
Closing this subsection, we give a q-extended version for Proposition 2.

Proposition 6

Let pmax ≡ max{p(x1), ···, p(xn)}. Then, we have the following inequality.
$$
H_{\lambda,q}(X) \le \frac{h(\lambda,q)}{\lambda}\left\{(1+\lambda)^q - (1+\lambda p_{\max})^q\right\}\ln_q(1+\lambda)
$$
for all λ > 0 and q ≥ 0.

Proof

From the “lnq-sum” inequality, we have Dλ,q(X||Y ) ≥ 0. Since λ > 0, we have:
$$
-\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)} \ge 0,
$$
which is equivalent to:
$$
\sum_{i=1}^{n}(1+\lambda p(x_i))^q\left\{\ln_q(1+\lambda p(x_i)) - \ln_q(1+\lambda p(y_i))\right\} \ge 0.
$$
Thus, we have:
$$
-\sum_{i=1}^{n}(1+\lambda p(x_i))^q\ln_q(1+\lambda p(x_i)) \le -\sum_{i=1}^{n}(1+\lambda p(x_i))^q\ln_q(1+\lambda p(y_i)),
$$
which extends the result given from the inequality (30). For arbitrarily fixed k, we set p(yk) = 1 (and p(yi) = 0 for ik) in the above inequality; then, we have:
$$
-\sum_{i=1}^{n}(1+\lambda p(x_i))^q\ln_q(1+\lambda p(x_i)) \le -(1+\lambda p(x_k))^q\ln_q(1+\lambda).
$$
Since $x^q \ln_q x = -x \ln_q\frac{1}{x}$, we have:
$$
\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} \le -(1+\lambda p(x_k))^q \ln_q(1+\lambda).
$$
Multiplying both sides by $\frac{h(\lambda,q)}{\lambda} > 0$ and then adding
$$
-\frac{h(\lambda,q)}{\lambda}(1+\lambda)\ln_q\frac{1}{1+\lambda} = \frac{h(\lambda,q)}{\lambda}(1+\lambda)^q\ln_q(1+\lambda)
$$
to both sides, we have:
$$
H_{\lambda,q}(X) \le \frac{h(\lambda,q)}{\lambda}\left\{(1+\lambda)^q - (1+\lambda p(x_k))^q\right\}\ln_q(1+\lambda).
$$
Since k is arbitrary, we have this proposition.
Letting q → 1 in the above proposition, we recover Proposition 2.

4. The Subadditivities of the Tsallis Hypoentropies

Throughout this section, we assume |X| = n, |Y | = m, |Z| = l. We define the joint Tsallis hypoentropy at the level λ by:
$$
H_{\lambda,q}(X,Y) \equiv \frac{h(\lambda,q)}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}\sum_{j=1}^{m}(1+\lambda p(x_i,y_j))\ln_q\frac{1}{1+\lambda p(x_i,y_j)}\right\}.
$$
Note that Hλ,q(X, Y ) = Hλ,q(Y, X).
For all i = 1, ···, n for which p(xi) ≠ 0, we define the Tsallis hypoentropy of Y given X = xi, at the level λp(xi), by:
$$
\begin{aligned}
H_{\lambda p(x_i),q}(Y|x_i) &\equiv \frac{h(\lambda p(x_i),q)}{\lambda p(x_i)}\left\{-(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} + \sum_{j=1}^{m}(1+\lambda p(x_i)p(y_j|x_i))\ln_q\frac{1}{1+\lambda p(x_i)p(y_j|x_i)}\right\} \\
&= \frac{h(\lambda p(x_i),q)}{\lambda p(x_i)}\left\{-(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} + \sum_{j=1}^{m}(1+\lambda p(x_i,y_j))\ln_q\frac{1}{1+\lambda p(x_i,y_j)}\right\}.
\end{aligned}
$$
For n = 1, this coincides with the hypoentropy Hλ,q(Y ). As for the particular case m = 1, we get Hλp(xi),q(Y |xi) = 0.

Definition 3

The Tsallis conditional hypoentropy at the level λ is defined by:
$$
H_{\lambda,q}(Y|X) \equiv \sum_{i=1}^{n} p(x_i)^q H_{\lambda p(x_i),q}(Y|x_i).
$$
(As a usual convention, the corresponding summand is defined as zero, if p(xi) = 0.)
Throughout this section, we consider the particular function h(λ, q) = λ1−q for λ > 0, q ≥ 0.

Lemma 4

We assume h(λ, q) = λ1−q. The chain rule for the Tsallis hypoentropy holds:
$$
H_{\lambda,q}(X,Y) = H_{\lambda,q}(X) + H_{\lambda,q}(Y|X).
$$

Proof

The proof is done by straightforward computation as follows.
$$
\begin{aligned}
H_{\lambda,q}(X) + H_{\lambda,q}(Y|X) &= \frac{\lambda^{1-q}}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)}\right\} \\
&\quad + \sum_{i=1}^{n}\frac{(\lambda p(x_i))^{1-q}}{\lambda p(x_i)}\,p(x_i)^q\left\{-(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} + \sum_{j=1}^{m}(1+\lambda p(x_i,y_j))\ln_q\frac{1}{1+\lambda p(x_i,y_j)}\right\} \\
&= \frac{\lambda^{1-q}}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)}\right\} \\
&\quad + \frac{\lambda^{1-q}}{\lambda}\left\{-\sum_{i=1}^{n}(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} + \sum_{i=1}^{n}\sum_{j=1}^{m}(1+\lambda p(x_i,y_j))\ln_q\frac{1}{1+\lambda p(x_i,y_j)}\right\} \\
&= \frac{\lambda^{1-q}}{\lambda}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}\sum_{j=1}^{m}(1+\lambda p(x_i,y_j))\ln_q\frac{1}{1+\lambda p(x_i,y_j)}\right\} = H_{\lambda,q}(X,Y).
\end{aligned}
$$
In the limit λ → ∞, the identity (66) becomes Tq(X, Y) = Tq(X) + Tq(Y|X), where $T_q(Y|X) \equiv \sum_{i=1}^{n} p(x_i)^q T_q(Y|x_i) = -\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i,y_j)^q \ln_q p(y_j|x_i)$ is the Tsallis conditional entropy and $T_q(X,Y) \equiv \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i,y_j)\ln_q\frac{1}{p(x_i,y_j)}$ is the Tsallis joint entropy (see also p. 3 in [17]).
In the limit q → 1 in Lemma 4, we also obtain the identity Fλ(X, Y ) = Fλ(X) + Fλ(Y |X), which naturally leads to the definition of Fλ(Y |X) as conditional hypoentropy.
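The chain rule of Lemma 4 holds as an exact identity, so it can be confirmed to machine precision on any joint distribution. The following sketch is our own illustration with h(λ, q) = λ^(1−q):

```python
import math

def ln_q(x, q):
    return math.log(x) if abs(q - 1.0) < 1e-12 else (x ** (1 - q) - 1) / (1 - q)

def hypo_H(p, lam, q):
    """H_{lam,q} with the choice h(lam, q) = lam**(1 - q)."""
    s = -(1 + lam) * ln_q(1 / (1 + lam), q)
    s += sum((1 + lam * pi) * ln_q(1 / (1 + lam * pi), q) for pi in p)
    return lam ** (1 - q) * s / lam

# a 2x3 joint distribution p(x_i, y_j) with nonzero entries
P = [[0.20, 0.15, 0.05],
     [0.10, 0.25, 0.25]]
lam, q = 3.0, 1.5

px = [sum(row) for row in P]                  # marginal of X
joint = [pij for row in P for pij in row]     # flattened p(x_i, y_j)

# H_{lam,q}(Y|X) = sum_i p(x_i)^q * H_{lam p(x_i),q}(Y|x_i)
H_cond = sum(px[i] ** q * hypo_H([pij / px[i] for pij in P[i]], lam * px[i], q)
             for i in range(len(P)))

# chain rule: H(X, Y) = H(X) + H(Y|X)
assert abs(hypo_H(joint, lam, q) - (hypo_H(px, lam, q) + H_cond)) < 1e-12
```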
In order to obtain the subadditivity for the Tsallis hypoentropy, we prove the monotonicity in λ of the Tsallis hypoentropy.

Lemma 5

We assume h(λ, q) = λ1−q. The Tsallis hypoentropy Hλ,q(X) is a monotonically increasing function of λ > 0 when 0 ≤ q ≤ 2 and a monotonically decreasing function of λ > 0 when q ≥ 2 (or q ≤ 0).

Proof

Note that:
$$
H_{\lambda,q}(X) = \sum_{i=1}^{n} Ln_{\lambda,q}(p(x_i)),
$$
where:
$$
Ln_{\lambda,q}(x) \equiv \frac{(1+\lambda x)^q - (1+\lambda)^q x + x - 1}{\lambda^q(1-q)}
$$
is defined on 0 ≤ x ≤ 1 and λ > 0. Then, we have:
$$
\frac{dH_{\lambda,q}(X)}{d\lambda} = \sum_{i=1}^{n}\frac{dLn_{\lambda,q}(p(x_i))}{d\lambda} = \sum_{i=1}^{n} l_{\lambda,q}(p(x_i)),
$$
where:
$$
l_{\lambda,q}(x) \equiv \frac{q}{\lambda^2(1-q)}\left\{\left(\frac{1}{\lambda}+1\right)^{q-1}x - \left(\frac{1}{\lambda}+x\right)^{q-1} - (x-1)\lambda^{1-q}\right\}
$$
is defined on 0 ≤ x ≤ 1 and λ > 0. By elementary computations, we obtain:
$$
\frac{d^2 l_{\lambda,q}(x)}{dx^2} = q(q-2)\lambda^{1-q}(1+\lambda x)^{q-3}.
$$
Since we have lλ,q(0) = lλ,q(1) = 0 and q(q − 2) ≤ 0 for 0 ≤ q ≤ 2, the function lλ,q is concave there and we find that lλ,q(x) ≥ 0 for 0 ≤ q ≤ 2 and any λ > 0. For q ≥ 2 (or q ≤ 0), lλ,q is convex, and we find that lλ,q(x) ≤ 0 for any λ > 0. Therefore, we have $\frac{dH_{\lambda,q}(X)}{d\lambda} \ge 0$ when 0 ≤ q ≤ 2, and $\frac{dH_{\lambda,q}(X)}{d\lambda} \le 0$ when q ≥ 2 (or q ≤ 0).
This result also agrees with the known fact that the usual (Ferreri) hypoentropy is increasing as a function of λ.

Theorem 1

We assume h(λ, q) = λ1−q. It holds Hλ,q(Y |X) ≤ Hλ,q(Y ) for 1 ≤ q ≤ 2.

Proof

We prove this theorem by the method used in [18] with Jensen's inequality. We note that Lnλ,q(x) is a nonnegative and concave function in x, when 0 ≤ x ≤ 1, λ > 0 and q ≥ 0. Here, we use the notation for the conditional probability $p(y_j|x_i) = \frac{p(x_i,y_j)}{p(x_i)}$ when p(xi) ≠ 0. By the concavity of Lnλ,q(x), we have:
$$
\sum_{i=1}^{n} p(x_i)\,Ln_{\lambda,q}(p(y_j|x_i)) \le Ln_{\lambda,q}\left(\sum_{i=1}^{n} p(x_i)p(y_j|x_i)\right) = Ln_{\lambda,q}\left(\sum_{i=1}^{n} p(x_i,y_j)\right) = Ln_{\lambda,q}(p(y_j)).
$$
Summing both sides of the above inequality over j, we have:
$$
\sum_{i=1}^{n} p(x_i)\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)) \le \sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j)).
$$
Since $p(x_i)^q \le p(x_i)$ for 1 ≤ q ≤ 2 and Lnλ,q(x) ≥ 0 for 0 ≤ x ≤ 1, λ > 0 and q ≥ 0, we have:
$$
p(x_i)^q\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)) \le p(x_i)\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)).
$$
Summing both sides of the above inequality over i, we have:
$$
\sum_{i=1}^{n} p(x_i)^q\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)) \le \sum_{i=1}^{n} p(x_i)\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)).
$$
By the two inequalities (74) and (76), we have:
$$
\sum_{i=1}^{n} p(x_i)^q\sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)) \le \sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j)).
$$
Here, we can see that j = 1 m L n λ , q ( p ( y j x i ) ) is the Tsallis hypoentropy for fixed xi and the Tsallis hypoentropy is a monotonically increasing function of λ in the case 1 ≤ q ≤ 2, due to Lemma 5. Thus, we have:
$$
\sum_{j=1}^{m} Ln_{\lambda p(x_i),q}(p(y_j|x_i)) \le \sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j|x_i)).
$$
By the two inequalities (77) and (78), we finally have:
$$
\sum_{i=1}^{n} p(x_i)^q\sum_{j=1}^{m} Ln_{\lambda p(x_i),q}(p(y_j|x_i)) \le \sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j)),
$$
which implies (since $p(y_j|x_i) = \frac{p(x_i,y_j)}{p(x_i)}$):
$$
\sum_{i=1}^{n} p(x_i)^q H_{\lambda p(x_i),q}(Y|x_i) \le \sum_{j=1}^{m} Ln_{\lambda,q}(p(y_j)),
$$
since we have for all fixed xi,
$$
\begin{aligned}
H_{\lambda p(x_i),q}(Y|x_i) &= \frac{1}{\lambda^q p(x_i)^q}\sum_{j=1}^{m}\left\{-p(y_j|x_i)(1+\lambda p(x_i))\ln_q\frac{1}{1+\lambda p(x_i)} + (1+\lambda p(x_i)p(y_j|x_i))\ln_q\frac{1}{1+\lambda p(x_i)p(y_j|x_i)}\right\} \\
&= \sum_{j=1}^{m} Ln_{\lambda p(x_i),q}(p(y_j|x_i)).
\end{aligned}
$$
Therefore, we have Hλ,q(Y |X) ≤ Hλ,q(Y ).

Corollary 1

We have the following subadditivity for the Tsallis hypoentropies:
$$
H_{\lambda,q}(X,Y) \le H_{\lambda,q}(X) + H_{\lambda,q}(Y)
$$
in the case h(λ, q) = λ1−q for 1 ≤ q ≤ 2.

Proof

The proof is easily done by Lemma 4 and Theorem 1.
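Corollary 1 can likewise be checked numerically on a correlated joint distribution. The sketch below is our own, using h(λ, q) = λ^(1−q):

```python
import math

def ln_q(x, q):
    return math.log(x) if abs(q - 1.0) < 1e-12 else (x ** (1 - q) - 1) / (1 - q)

def hypo_H(p, lam, q):
    # H_{lam,q} with h(lam, q) = lam**(1 - q)
    s = -(1 + lam) * ln_q(1 / (1 + lam), q)
    s += sum((1 + lam * pi) * ln_q(1 / (1 + lam * pi), q) for pi in p)
    return lam ** (1 - q) * s / lam

# a correlated joint distribution: X and Y far from independent
P = [[0.40, 0.10],
     [0.10, 0.40]]
px = [sum(row) for row in P]
py = [sum(P[i][j] for i in range(2)) for j in range(2)]
joint = [pij for row in P for pij in row]

# subadditivity H(X, Y) <= H(X) + H(Y) for 1 <= q <= 2
for q in (1.0, 1.3, 1.7, 2.0):
    for lam in (0.5, 2.0, 10.0):
        assert hypo_H(joint, lam, q) <= hypo_H(px, lam, q) + hypo_H(py, lam, q)
```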
We are now in a position to prove the strong subadditivity for the Tsallis hypoentropies. The strong subadditivity for entropy is one of the interesting subjects in entropy theory [19]. For this purpose, we first give a chain rule for three random variables X, Y and Z.

Lemma 6

We assume h(λ, q) = λ1−q. The following chain rule holds:
$$
H_{\lambda,q}(X,Y,Z) = H_{\lambda,q}(X|Y,Z) + H_{\lambda,q}(Y,Z).
$$

Proof

The proof can be done following the recipe used in Lemma 4.
$$
\begin{aligned}
H_{\lambda,q}(X|Y,Z) + H_{\lambda,q}(Y,Z) &= \sum_{j=1}^{m}\sum_{k=1}^{l} p(y_j,z_k)^q\,\frac{1}{(\lambda p(y_j,z_k))^q}\Bigg\{-(1+\lambda p(y_j,z_k))\ln_q\frac{1}{1+\lambda p(y_j,z_k)} \\
&\qquad + \sum_{i=1}^{n}\left(1+\lambda p(y_j,z_k)\frac{p(x_i,y_j,z_k)}{p(y_j,z_k)}\right)\ln_q\frac{1}{1+\lambda p(y_j,z_k)\frac{p(x_i,y_j,z_k)}{p(y_j,z_k)}}\Bigg\} \\
&\quad + \frac{1}{\lambda^q}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{j=1}^{m}\sum_{k=1}^{l}(1+\lambda p(y_j,z_k))\ln_q\frac{1}{1+\lambda p(y_j,z_k)}\right\} \\
&= \frac{1}{\lambda^q}\left\{-(1+\lambda)\ln_q\frac{1}{1+\lambda} + \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{l}(1+\lambda p(x_i,y_j,z_k))\ln_q\frac{1}{1+\lambda p(x_i,y_j,z_k)}\right\} \\
&= H_{\lambda,q}(X,Y,Z).
\end{aligned}
$$

Theorem 2

We assume h(λ, q) = λ1−q. The strong subadditivity for the Tsallis hypoentropies,
$$
H_{\lambda,q}(X,Y,Z) + H_{\lambda,q}(Z) \le H_{\lambda,q}(X,Z) + H_{\lambda,q}(Y,Z),
$$
holds for 1 ≤ q ≤ 2.

Proof

This theorem is proven in a similar way as Theorem 1. By the concavity of the function Lnλp(zk),q(x) in x and by using Jensen’s inequality, we have:
$$
\sum_{j=1}^{m} p(y_j|z_k)\,Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)) \le Ln_{\lambda p(z_k),q}\left(\sum_{j=1}^{m} p(y_j|z_k)p(x_i|y_j,z_k)\right).
$$
Multiplying both sides by p(zk)q and summing over i and k, we have:
$$
\sum_{j=1}^{m}\sum_{k=1}^{l} p(z_k)^q p(y_j|z_k)\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)) \le \sum_{k=1}^{l} p(z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|z_k)),
$$
since $\sum_{j=1}^{m} p(y_j|z_k)p(x_i|y_j,z_k) = p(x_i|z_k)$. By $p(y_j|z_k)^q \le p(y_j|z_k)$ for all j, k and 1 ≤ q ≤ 2, and by the nonnegativity of the function Lnλp(zk),q, we have:
$$
p(y_j|z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)) \le p(y_j|z_k)\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)).
$$
Multiplying both sides by p(zk)q and summing over j and k in the above inequality, we have:
$$
\sum_{j=1}^{m}\sum_{k=1}^{l} p(z_k)^q p(y_j|z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)) \le \sum_{j=1}^{m}\sum_{k=1}^{l} p(z_k)^q p(y_j|z_k)\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)).
$$
From the two inequalities (84) and (85), we have:
$$
\sum_{j=1}^{m}\sum_{k=1}^{l} p(z_k)^q p(y_j|z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|y_j,z_k)) \le \sum_{k=1}^{l} p(z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|z_k)),
$$
which implies:
$$
\sum_{j=1}^{m}\sum_{k=1}^{l} p(y_j,z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(y_j,z_k),q}(p(x_i|y_j,z_k)) \le \sum_{k=1}^{l} p(z_k)^q\sum_{i=1}^{n} Ln_{\lambda p(z_k),q}(p(x_i|z_k)),
$$
since p(yj, zk) ≤ p(zk) (because $\sum_{j=1}^{m} p(y_j,z_k) = p(z_k)$) for all j and k, and the function Lnλp(zk),q is monotonically increasing in λp(zk) > 0 when 1 ≤ q ≤ 2. Thus, we have Hλ,q(X|Y, Z) ≤ Hλ,q(X|Z), which is equivalent to the inequality:
H λ , q ( X , Y , Z ) - H λ , q ( Y , Z ) H λ , q ( X , Z ) - H λ , q ( Z )
by Lemmas 4 and 6.

Remark 5

Passing to the limit λ → ∞ in Corollary 1 and Theorem 2, we recover the subadditivity and the strong subadditivity [20] for the Tsallis entropy:
$$T_q(X,Y) \le T_q(X) + T_q(Y) \qquad (q \ge 1)$$
and:
$$T_q(X,Y,Z) + T_q(Z) \le T_q(X,Z) + T_q(Y,Z) \qquad (q \ge 1).$$
Thanks to the subadditivities, we may define the Tsallis mutual hypoentropies for 1 ≤ q ≤ 2 and λ > 0.

Definition 4

Let 1 ≤ q ≤ 2 and λ > 0. The Tsallis mutual hypoentropy is defined by:
$$I_{\lambda,q}(X;Y) \equiv H_{\lambda,q}(X) - H_{\lambda,q}(X|Y)$$
and the Tsallis conditional mutual hypoentropy is defined by:
$$I_{\lambda,q}(X;Y|Z) \equiv H_{\lambda,q}(X|Z) - H_{\lambda,q}(X|Y,Z).$$
From the chain rule given in Lemma 4, we find that the Tsallis mutual hypoentropy is symmetric, that is,
$$I_{\lambda,q}(X;Y) \equiv H_{\lambda,q}(X) - H_{\lambda,q}(X|Y) = H_{\lambda,q}(X) + H_{\lambda,q}(Y) - H_{\lambda,q}(X,Y) = H_{\lambda,q}(Y) - H_{\lambda,q}(Y|X) = I_{\lambda,q}(Y;X).$$
In addition, we have:
$$0 \le I_{\lambda,q}(X;Y) \le \min\{H_{\lambda,q}(X),\, H_{\lambda,q}(Y)\}$$
from the subadditivity given in Theorem 1 and the nonnegativity of the Tsallis conditional hypoentropy. We also find $I_{\lambda,q}(X;Y|Z) \ge 0$ from the strong subadditivity given in Theorem 2.
Moreover, we have the following chain rule for the Tsallis mutual hypoentropies:
$$I_{\lambda,q}(X;Y|Z) = H_{\lambda,q}(X|Z) - H_{\lambda,q}(X|Y,Z) = H_{\lambda,q}(X|Z) - H_{\lambda,q}(X) + H_{\lambda,q}(X) - H_{\lambda,q}(X|Y,Z) = -I_{\lambda,q}(X;Z) + I_{\lambda,q}(X;Y,Z).$$
From the strong subadditivity, we have $H_{\lambda,q}(X|Y,Z) \le H_{\lambda,q}(X|Z)$; thus, we have:
$$I_{\lambda,q}(X;Z) \le I_{\lambda,q}(X;Y,Z)$$
for 1 ≤ q ≤ 2 and λ > 0.
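These properties of the Tsallis mutual hypoentropy can likewise be sanity-checked numerically. The sketch below reuses the closed form of $H_{\lambda,q}$ with $h(\lambda,q)=\lambda^{1-q}$ from the chain-rule identity (function names are ours), and expresses the mutual quantities via the chain rule of Lemma 4:

```python
import numpy as np

rng = np.random.default_rng(1)

def ln_q(x, q):
    # Tsallis q-logarithm; natural logarithm at q = 1
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def hypo_H(p, lam, q):
    # Tsallis hypoentropy for h(lam,q) = lam**(1-q)
    p = np.ravel(np.asarray(p, dtype=float))
    s = -(1.0 + lam) * ln_q(1.0 / (1.0 + lam), q)
    s += np.sum((1.0 + lam * p) * ln_q(1.0 / (1.0 + lam * p), q))
    return s / lam ** q

q, lam = 1.5, 2.0
pXY = rng.random((3, 4)); pXY /= pXY.sum()
pX, pY = pXY.sum(axis=1), pXY.sum(axis=0)

# I(X;Y) = H(X) + H(Y) - H(X,Y), by the chain rule of Lemma 4
I = hypo_H(pX, lam, q) + hypo_H(pY, lam, q) - hypo_H(pXY, lam, q)
assert -1e-12 <= I <= min(hypo_H(pX, lam, q), hypo_H(pY, lam, q)) + 1e-12

# chain rule: I(X;Y|Z) = I(X;Y,Z) - I(X;Z), hence I(X;Z) <= I(X;Y,Z)
pXYZ = rng.random((3, 4, 2)); pXYZ /= pXYZ.sum()
H = lambda a: hypo_H(a, lam, q)
I_X_YZ = H(pXYZ.sum(axis=(1, 2))) + H(pXYZ.sum(axis=0)) - H(pXYZ)
I_X_Z = H(pXYZ.sum(axis=(1, 2))) + H(pXYZ.sum(axis=(0, 1))) - H(pXYZ.sum(axis=1))
I_cond = I_X_YZ - I_X_Z  # equals I(X;Y|Z)
assert I_cond >= -1e-12 and I_X_Z <= I_X_YZ + 1e-12
```

The nonnegativity of `I_cond` is exactly the numerical face of the strong subadditivity of Theorem 2.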

5. Jeffreys and Jensen–Shannon Hypodivergences

In what follows, we indicate extensions of two known information measures.

Definition 5 ([21,22])

The Jeffreys divergence is defined by:
$$J(X\|Y) \equiv D(X\|Y) + D(Y\|X)$$
and the Jensen–Shannon divergence is defined as:
$$JS(X\|Y) \equiv \frac{1}{2}\Big\{ D\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + D\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big) \Big\} = H\Big(\frac{X+Y}{2}\Big) - \frac{1}{2}\big(H(X) + H(Y)\big).$$
The Jensen–Shannon divergence was introduced in 1991 in [23], but its roots may be older: analogous formulae appear in thermodynamics under the name entropy of mixing (p. 598 in [24]), in the study of gaseous, liquid or crystalline mixtures.
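In the classical case, the two expressions for the Jensen–Shannon divergence in Definition 5 coincide; a minimal numerical check of this identity (helper names are ours):

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence D(p || q) for strictly positive distributions
    return float(np.sum(p * np.log(p / q)))

def shannon(p):
    # Shannon entropy in nats
    return float(-np.sum(p * np.log(p)))

p = np.array([0.2, 0.5, 0.3])
y = np.array([0.4, 0.4, 0.2])
m = (p + y) / 2.0  # the mixture (X + Y)/2

js = 0.5 * (kl(p, m) + kl(y, m))
# the divergence form agrees with the entropy-of-mixing form
assert abs(js - (shannon(m) - 0.5 * (shannon(p) + shannon(y)))) < 1e-12
```

The identity holds because the cross terms $\sum p_i \log m_i$ and $\sum y_i \log m_i$ recombine into $2\sum m_i \log m_i$.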
Jeffreys and Jensen–Shannon divergences have been extended to the context of Tsallis theory in [25]:

Definition 6

The Jeffreys–Tsallis divergence is:
$$J_q(X\|Y) \equiv S_q(X\|Y) + S_q(Y\|X)$$
and the Jensen–Shannon–Tsallis divergence is:
$$JS_q(X\|Y) \equiv \frac{1}{2}\Big\{ S_q\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + S_q\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big)\Big\}.$$
Note that:
$$JS_q(X\|Y) \ne T_q\Big(\frac{X+Y}{2}\Big) - \frac{1}{2}\big(T_q(X) + T_q(Y)\big).$$
The expression on the right-hand side was used in [26] as the Jensen–Tsallis divergence.
In accordance with the above definition, we define the directed Jeffreys and Jensen–Shannon q-hypodivergence measures between two distributions and emphasize the mathematical significance of our definitions.

Definition 7

The Jeffreys–Tsallis hypodivergence is:
$$J_{\lambda,q}(X\|Y) \equiv D_{\lambda,q}(X\|Y) + D_{\lambda,q}(Y\|X)$$
and the Jensen–Shannon–Tsallis hypodivergence is:
$$JS_{\lambda,q}(X\|Y) \equiv \frac{1}{2}\Big\{ D_{\lambda,q}\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + D_{\lambda,q}\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big)\Big\}.$$
Here, we point out that, again, one has:
$$JS_\lambda(X\|Y) = \frac{1}{2}K_\lambda\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + \frac{1}{2}K_\lambda\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big) = F_\lambda\Big(\frac{X+Y}{2}\Big) - \frac{1}{2}\big(F_\lambda(X) + F_\lambda(Y)\big),$$
where:
$$JS_\lambda(X\|Y) \equiv \lim_{q\to 1} JS_{\lambda,q}(X\|Y).$$

Lemma 7

The following inequality holds:
$$D_{\lambda,q}\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) \le \frac{1}{2}\, D_{\lambda,\frac{1+q}{2}}(X\|Y)$$
for q ≥ 0 and λ > 0.

Proof

Using the inequality between the arithmetic and geometric means and the monotonicity of $\ln_q$, one has:
$$
\begin{aligned}
D_{\lambda,q}\Big(X \,\Big\|\, \frac{X+Y}{2}\Big)
&= -\frac{1}{\lambda}\sum_{i=1}^n (1+\lambda p(x_i)) \ln_q \frac{\frac{(1+\lambda p(x_i)) + (1+\lambda p(y_i))}{2}}{1+\lambda p(x_i)} \\
&\le -\frac{1}{\lambda}\sum_{i=1}^n (1+\lambda p(x_i)) \ln_q \sqrt{\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}} \\
&= -\frac{1}{2\lambda}\sum_{i=1}^n (1+\lambda p(x_i)) \frac{\Big(\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}\Big)^{1-\frac{1+q}{2}} - 1}{1-\frac{1+q}{2}} \\
&= \frac{1}{2}\, D_{\lambda,\frac{1+q}{2}}(X\|Y).
\end{aligned}
$$
Thus, the proof is completed.
In the limit λ → ∞, Lemma 7 recovers Lemma 3.4 in [25].
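Lemma 7 can be tested numerically over random distribution pairs. The sketch below assumes the form of the Tsallis hypodivergence used in the proof, $D_{\lambda,q}(X\|Y) = -\frac{1}{\lambda}\sum_i (1+\lambda p(x_i)) \ln_q \frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}$; function names are ours:

```python
import numpy as np

rng = np.random.default_rng(5)

def ln_q(x, q):
    # Tsallis q-logarithm; natural logarithm at q = 1
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def D(p, r, lam, q):
    # Tsallis hypodivergence D_{lam,q}(p || r), as used in the proof of Lemma 7
    p, r = np.asarray(p, float), np.asarray(r, float)
    return float(-np.sum((1 + lam * p) * ln_q((1 + lam * r) / (1 + lam * p), q)) / lam)

for _ in range(100):
    p = rng.random(4); p /= p.sum()
    y = rng.random(4); y /= y.sum()
    for q in (0.5, 1.0, 1.7):
        for lam in (0.3, 1.0, 5.0):
            lhs = D(p, (p + y) / 2.0, lam, q)
            rhs = 0.5 * D(p, y, lam, (1.0 + q) / 2.0)
            assert lhs <= rhs + 1e-12
```

The check exercises exactly the AM-GM step of the proof: the mixture ratio dominates the geometric mean, and $\ln_q$ of a square root rescales the index $q$ to $\frac{1+q}{2}$.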

Lemma 8 ([25])

The function:
$$f(x) = -\ln_r \frac{1+\exp_q x}{2}$$
is concave for $0 \le r \le q$.
The next two results of the present paper are stated in order to establish the counterpart of Theorem 3.5 in [25] for hypodivergences.

Proposition 7

It holds:
$$JS_{\lambda,q}(X\|Y) \le \frac{1}{4}\, J_{\lambda,\frac{1+q}{2}}(X\|Y)$$
for q ≥ 0 and λ > 0.

Proof

By the use of Lemma 7, one has:
$$
\begin{aligned}
2\, JS_{\lambda,q}(X\|Y) &= D_{\lambda,q}\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + D_{\lambda,q}\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big) \\
&\le \frac{1}{2}\, D_{\lambda,\frac{1+q}{2}}(X\|Y) + \frac{1}{2}\, D_{\lambda,\frac{1+q}{2}}(Y\|X) \\
&= \frac{1}{2}\, J_{\lambda,\frac{1+q}{2}}(X\|Y).
\end{aligned}
$$
This completes the proof.
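Proposition 7 is also easy to probe numerically. The sketch below builds $JS_{\lambda,q}$ and $J_{\lambda,q}$ from the hypodivergence form used in the proof of Lemma 7 (function names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def ln_q(x, q):
    # Tsallis q-logarithm; natural logarithm at q = 1
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def D(p, r, lam, q):
    # Tsallis hypodivergence D_{lam,q}(p || r)
    p, r = np.asarray(p, float), np.asarray(r, float)
    return float(-np.sum((1 + lam * p) * ln_q((1 + lam * r) / (1 + lam * p), q)) / lam)

def JS(p, y, lam, q):
    # Jensen-Shannon-Tsallis hypodivergence
    m = (p + y) / 2.0
    return 0.5 * (D(p, m, lam, q) + D(y, m, lam, q))

def J(p, y, lam, q):
    # Jeffreys-Tsallis hypodivergence
    return D(p, y, lam, q) + D(y, p, lam, q)

for _ in range(50):
    p = rng.random(5); p /= p.sum()
    y = rng.random(5); y /= y.sum()
    for q in (0.5, 1.0, 1.8):
        for lam in (0.4, 2.0):
            assert JS(p, y, lam, q) <= 0.25 * J(p, y, lam, (1 + q) / 2) + 1e-12
```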

Proposition 8

It holds that:
$$JS_{\lambda,r}(X\|Y) \le -\frac{n+\lambda}{\lambda}\, \ln_r \frac{1+\exp_q\Big(-\frac{1}{2}\cdot\frac{\lambda}{n+\lambda}\cdot J_{\lambda,q}(X\|Y)\Big)}{2}$$
for $0 \le r \le q$ and $\lambda > 0$.

Proof

According to Lemma 8,
$$
\begin{aligned}
JS_{\lambda,r}(X\|Y) &= -\frac{n+\lambda}{2\lambda}\Bigg\{ \sum_{i=1}^n \frac{1+\lambda p(x_i)}{n+\lambda}\, \ln_r \frac{1+\exp_q \ln_q\Big(\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}\Big)}{2} + \sum_{i=1}^n \frac{1+\lambda p(y_i)}{n+\lambda}\, \ln_r \frac{1+\exp_q \ln_q\Big(\frac{1+\lambda p(x_i)}{1+\lambda p(y_i)}\Big)}{2}\Bigg\} \\
&\le -\frac{n+\lambda}{2\lambda}\Bigg\{ \ln_r \frac{1+\exp_q\Big(\sum_{i=1}^n \frac{1+\lambda p(x_i)}{n+\lambda}\,\ln_q\Big(\frac{1+\lambda p(y_i)}{1+\lambda p(x_i)}\Big)\Big)}{2} + \ln_r \frac{1+\exp_q\Big(\sum_{i=1}^n \frac{1+\lambda p(y_i)}{n+\lambda}\,\ln_q\Big(\frac{1+\lambda p(x_i)}{1+\lambda p(y_i)}\Big)\Big)}{2}\Bigg\} \\
&= -\frac{n+\lambda}{2\lambda}\Bigg\{ \ln_r \frac{1+\exp_q\Big(-\frac{\lambda}{n+\lambda}\, D_{\lambda,q}(X\|Y)\Big)}{2} + \ln_r \frac{1+\exp_q\Big(-\frac{\lambda}{n+\lambda}\, D_{\lambda,q}(Y\|X)\Big)}{2}\Bigg\}.
\end{aligned}
$$
Then:
$$JS_{\lambda,r}(X\|Y) \le -\frac{n+\lambda}{\lambda}\, \ln_r \frac{1+\exp_q\Big(-\frac{\lambda}{n+\lambda}\cdot\frac{D_{\lambda,q}(X\|Y) + D_{\lambda,q}(Y\|X)}{2}\Big)}{2} = -\frac{n+\lambda}{\lambda}\, \ln_r \frac{1+\exp_q\Big(-\frac{1}{2}\cdot\frac{\lambda}{n+\lambda}\cdot J_{\lambda,q}(X\|Y)\Big)}{2}.$$
Thus, the proof is completed.
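The bound of Proposition 8 can be sanity-checked numerically as well. The sketch below additionally needs the $q$-exponential $\exp_q(x) = (1+(1-q)x)^{1/(1-q)}$, the inverse of $\ln_q$ on its domain; function names and the test parameters are ours, chosen with $0 \le r \le q$:

```python
import numpy as np

rng = np.random.default_rng(3)

def ln_q(x, q):
    # Tsallis q-logarithm; natural logarithm at q = 1
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    # Tsallis q-exponential, inverse of ln_q on its domain
    if abs(q - 1.0) < 1e-12:
        return np.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

def D(p, r, lam, q):
    # Tsallis hypodivergence D_{lam,q}(p || r)
    p, r = np.asarray(p, float), np.asarray(r, float)
    return float(-np.sum((1 + lam * p) * ln_q((1 + lam * r) / (1 + lam * p), q)) / lam)

def JS(p, y, lam, q):
    m = (p + y) / 2.0
    return 0.5 * (D(p, m, lam, q) + D(y, m, lam, q))

p = rng.random(4); p /= p.sum()
y = rng.random(4); y /= y.sum()
n = len(p)

for r, q in ((1.0, 1.0), (1.0, 1.5), (1.5, 1.5)):  # 0 <= r <= q
    for lam in (0.5, 3.0):
        J = D(p, y, lam, q) + D(y, p, lam, q)  # Jeffreys-Tsallis hypodivergence
        bound = -(n + lam) / lam * ln_q(
            (1.0 + exp_q(-0.5 * lam / (n + lam) * J, q)) / 2.0, r)
        assert JS(p, y, lam, r) <= bound + 1e-10
```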
We further define the dual symmetric hypodivergences.

Definition 8

The dual symmetric Jeffreys–Tsallis hypodivergence is defined by:
$$J_{\lambda,q}^{(ds)}(X\|Y) \equiv D_{\lambda,q}(X\|Y) + D_{\lambda,2-q}(Y\|X)$$
and the dual symmetric Jensen–Shannon–Tsallis hypodivergence is defined by:
$$JS_{\lambda,q}^{(ds)}(X\|Y) \equiv \frac{1}{2}\Big\{ D_{\lambda,q}\Big(X \,\Big\|\, \frac{X+Y}{2}\Big) + D_{\lambda,2-q}\Big(Y \,\Big\|\, \frac{X+Y}{2}\Big)\Big\}.$$
Using Lemma 7, we have the following inequality.

Proposition 9

It holds:
$$JS_{\lambda,q}^{(ds)}(X\|Y) \le \frac{1}{4}\, J_{\lambda,\frac{1+q}{2}}^{(ds)}(X\|Y)$$
for 0 ≤ q ≤ 2 and λ > 0.
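Proposition 9 follows from applying Lemma 7 twice, at indices $q$ and $2-q$, and it can be probed numerically in the same style as before (function names are ours):

```python
import numpy as np

rng = np.random.default_rng(4)

def ln_q(x, q):
    # Tsallis q-logarithm; natural logarithm at q = 1
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def D(p, r, lam, q):
    # Tsallis hypodivergence D_{lam,q}(p || r)
    p, r = np.asarray(p, float), np.asarray(r, float)
    return float(-np.sum((1 + lam * p) * ln_q((1 + lam * r) / (1 + lam * p), q)) / lam)

def J_ds(p, y, lam, q):
    # dual symmetric Jeffreys-Tsallis hypodivergence: indices q and 2 - q
    return D(p, y, lam, q) + D(y, p, lam, 2.0 - q)

def JS_ds(p, y, lam, q):
    # dual symmetric Jensen-Shannon-Tsallis hypodivergence
    m = (p + y) / 2.0
    return 0.5 * (D(p, m, lam, q) + D(y, m, lam, 2.0 - q))

for _ in range(50):
    p = rng.random(4); p /= p.sum()
    y = rng.random(4); y /= y.sum()
    for q in (0.5, 1.0, 1.5):
        for lam in (0.7, 2.5):
            assert JS_ds(p, y, lam, q) <= 0.25 * J_ds(p, y, lam, (1 + q) / 2) + 1e-12
```

Note that $2 - \frac{1+q}{2} = \frac{3-q}{2}$, so the right-hand side pairs the indices $\frac{1+q}{2}$ and $\frac{3-q}{2}$, exactly as Lemma 7 delivers them.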
In addition, we have the following inequality.

Proposition 10

It holds:
$$JS_{\lambda,q}^{(ds)}(X\|Y) \le -\frac{n+\lambda}{\lambda}\, \ln_r \frac{1+\exp_q\Big(-\frac{\lambda}{2(n+\lambda)}\, J_{\lambda,q}(X\|Y)\Big)}{2}$$
for $1 < r \le 2$, $r \le q$ and $\lambda > 0$.

Proof

The proof proceeds by calculations similar to those in Proposition 8, applying the facts (see Lemmas 3.9 and 3.10 in [25]) that $\exp_q(x)$ is monotonically increasing in $q$ for $x \ge 0$ and that the inequality $-\ln_{2-r} x \le -\ln_r x$ holds for $1 < r \le 2$ and $x > 0$.

6. Concluding Remarks

In this paper, we introduced the Tsallis hypoentropy $H_{\lambda,q}(X)$ and studied some of its properties. We named $H_{\lambda,q}(X)$ the Tsallis hypoentropy because of the relation $H_{\lambda,q}(X) \le T_q(X)$, which follows from the monotonicity in $\lambda$ given in Proposition 5 and Lemma 5 for the cases $h(\lambda,q) = (1+\lambda)^{1-q}$ and $h(\lambda,q) = \lambda^{1-q}$, respectively (this relation can also be proven directly). In this naming, we follow Ferreri, who termed $F_\lambda(X)$ hypoentropy due to the relation $F_\lambda(X) \le H(X)$.
The monotonicity of the hypoentropy and of the Tsallis hypoentropy in $\lambda > 0$ is indeed an interesting feature. It is also worthwhile to examine the monotonicity of the Tsallis entropy in the parameter $q \ge 0$. We find that the Tsallis entropy $T_q(X)$ is monotonically decreasing with respect to $q \ge 0$. Indeed,
$$\frac{dT_q(X)}{dq} = \sum_{j=1}^n \frac{p_j^q\, v_q(p_j)}{(1-q)^2},$$
where $v_q(x) \equiv 1 - x^{1-q} + (1-q)\log x$ for $0 \le x \le 1$. Since $x^q v_q(x) = 0$ for $x = 0$ and $q > 0$, it suffices to prove $v_q(x) \le 0$ for $0 < x \le 1$. We find
$$\frac{dv_q(x)}{dx} = \frac{(1-q)(1-x^{1-q})}{x} \ge 0$$
when $0 < x \le 1$; thus, we have $v_q(x) \le v_q(1) = 0$, which implies $\frac{dT_q(X)}{dq} \le 0$. This monotonicity implies the relations $H(X) \le T_q(X)$ for $0 \le q < 1$ and $T_q(X) \le H(X)$ for $q > 1$ (these relations also follow from the inequalities $\log \frac{1}{x} \le \ln_q \frac{1}{x}$ for $0 \le q < 1$, $x > 0$, and $\log \frac{1}{x} \ge \ln_q \frac{1}{x}$ for $q > 1$, $x > 0$).
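The monotone decrease of $T_q(X)$ in $q$ is easy to observe numerically; a minimal sketch (function name ours):

```python
import numpy as np

def tsallis(p, q):
    # Tsallis entropy T_q; reduces to Shannon entropy (in nats) as q -> 1
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if abs(q - 1.0) < 1e-12:
        return float(-np.sum(p * np.log(p)))
    return float((np.sum(p ** q) - 1.0) / (1.0 - q))

p = np.array([0.1, 0.2, 0.3, 0.4])
qs = np.linspace(0.1, 3.0, 30)
vals = [tsallis(p, q) for q in qs]

# T_q(X) is monotonically decreasing in q
assert all(a >= b - 1e-12 for a, b in zip(vals, vals[1:]))
```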
As other important results, we also gave the chain rules, the subadditivity and the strong subadditivity of the Tsallis hypoentropies in the case $h(\lambda,q) = \lambda^{1-q}$. For the case $h(\lambda,q) = (1+\lambda)^{1-q}$, we can prove $H_{\lambda,q}(Y|X) \le H_{\lambda,q}(Y)$ and $H_{\lambda,q}(X|Y,Z) \le H_{\lambda,q}(X|Z)$ for $1 \le q \le 2$ in a way similar to the proofs of Theorems 1 and 2, since the function $\mathrm{Sn}_{\lambda,q}(x)$ defined in the proof of Proposition 5 is also nonnegative, monotonically increasing and concave in $x \in [0,1]$, and we have $H_{\lambda p(x_i),q}(Y|x_i) = \sum_{j=1}^m \mathrm{Sn}_{\lambda p(x_i),q}(p(y_j|x_i))$ for all fixed $x_i$. However, we cannot obtain the inequalities:
$$H_{\lambda,q}(X,Y) \le H_{\lambda,q}(X) + H_{\lambda,q}(Y) \qquad (1 \le q \le 2),$$
$$H_{\lambda,q}(X,Y,Z) + H_{\lambda,q}(Z) \le H_{\lambda,q}(X,Z) + H_{\lambda,q}(Y,Z) \qquad (1 \le q \le 2)$$
for $h(\lambda,q) = (1+\lambda)^{1-q}$, because the corresponding proof of the chain rules does not carry over to this case.

Acknowledgments

The first author was partially supported by JSPS KAKENHI Grant Number 24540146.

Author Contributions

The work presented here was carried out in collaboration between all authors. The study was initiated by the second author. The first author played also the role of the corresponding author. All authors contributed equally and significantly in writing this article. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Boltzmann, L.E. Einige allgemeine Sätze über Wärmegleichgewicht. Wiener Berichte 1871, 63, 679–711. [Google Scholar]
  2. Gibbs, J.W. Elementary Principles in Statistical Mechanics—Developed with Especial Reference to the Rational Foundation of Thermodynamics; Charles Scribner’s Sons: New York, NY, USA, 1902. [Google Scholar]
  3. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656. [Google Scholar]
  4. Hartley, R.V.L. Transmission of Information. Bell Syst. Tech. J 1928, 7, 535–563. [Google Scholar]
  5. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar]
  6. Rényi, A. On measures of information and entropy. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1960; University of California Press: Berkeley, CA, USA, 1961; pp. 547–561. [Google Scholar]
  7. Havrda, J.; Charvat, F. Quantification methods of classification processes: Concept of structural alpha-entropy. Kybernetika 1967, 3, 30–35. [Google Scholar]
  8. Aczél, J.; Daróczy, Z. On Measures of Information and Their Characterizations; Academic Press: New York NY, USA, 1975. [Google Scholar]
  9. Ferreri, C. Hypoentropy and related heterogeneity, divergence and information measures. Statistica 1980, 2, 155–167. [Google Scholar]
  10. Tsallis, C. Possible generalization of Boltzmann–Gibbs statistics. J. Stat. Phys 1988, 52, 479–487. [Google Scholar]
  11. Tsallis, C. Generalized entropy-based criterion for consistent testing. Phys. Rev. E 1998, 58, 1442–1445. [Google Scholar]
  12. Borland, L.; Plastino, A.R.; Tsallis, C. Information gain within nonextensive thermostatistics. J. Math. Phys 1998, 39, 6490–6501. [Google Scholar]
  13. Gilardoni, G. On Pinsker’s and Vajda’s type inequalities for Csiszár’s f-divergence. IEEE Trans. Inf. Theory 2010, 56, 5377–5386. [Google Scholar]
  14. Rastegin, A.E. Bounds of the Pinsker and Fannes types on the Tsallis relative entropy. Math. Phys. Anal. Geom 2013, 16, 213–228. [Google Scholar]
  15. Csiszár, I. Information-type measures of difference of probability distributions and indirect observations. Stud. Sci. Math. Hung 1967, 2, 299–318. [Google Scholar]
  16. Csiszár, I. Axiomatic characterizations of information measures. Entropy 2008, 10, 261–273. [Google Scholar]
  17. Furuichi, S. On uniqueness theorems for Tsallis entropy and Tsallis relative entropy. IEEE Trans. Inf. Theory 2005, 51, 3638–3645. [Google Scholar]
  18. Daróczy, Z. Generalized information functions. Inf. Control 1970, 16, 36–51. [Google Scholar]
  19. Petz, D.; Virosztek, D. Some inequalities for quantum Tsallis entropy related to the strong subadditivity. Math. Inequal. Appl 2014, in press. [Google Scholar]
  20. Furuichi, S. Information theoretical properties of Tsallis entropies. J. Math. Phys 2006, 47. [Google Scholar] [CrossRef]
  21. Dragomir, S.S.; Šunde, J.; Buşe, C. New inequalities for Jeffreys divergence measure. Tamsui Oxf. J. Math. Sci 2000, 16, 295–309. [Google Scholar]
  22. Jeffreys, H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond. A 1946, 186, 453–461. [Google Scholar]
  23. Lin, J. Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37, 145–151. [Google Scholar]
  24. Tolman, R.C. The Principles of Statistical Mechanics; Clarendon Press: London, UK, 1938. [Google Scholar]
  25. Furuichi, S.; Mitroi, F.-C. Mathematical inequalities for some divergences. Physica A 2012, 391, 388–400. [Google Scholar]
  26. Hamza, A.B. A nonextensive information-theoretic measure for image edge detection. J. Electron. Imaging 2006, 15, 13011.1–13011.8. [Google Scholar]