Inequalities for Information Potentials and Entropies

Abstract: We consider a probability distribution $(p_0(x), p_1(x), \ldots)$ depending on a real parameter $x$. The associated information potential is $S(x) := \sum_k p_k^2(x)$. The Rényi entropy and the Tsallis entropy of order 2 can be expressed as $R(x) = -\log S(x)$ and $T(x) = 1 - S(x)$. We establish recurrence relations, inequalities and bounds for $S(x)$, which lead immediately to similar relations, inequalities and bounds for the two entropies. We show that some sequences $(R_n(x))_{n \geq 0}$ and $(T_n(x))_{n \geq 0}$, associated with sequences of classical positive linear operators, are concave and increasing. Two conjectures are formulated, involving the information potentials associated with the Durrmeyer density of probability and with the Bleimann–Butzer–Hahn probability distribution, respectively.


Introduction
Entropies associated with discrete or continuous probability distributions are usually described by complicated explicit expressions depending on one or several parameters. Therefore, it is useful to establish lower and upper bounds for them. Convexity-type properties are also useful: they embody valuable information on the behavior of the functions representing the entropies. This is why bounds and convexity-type properties of entropies, expressed by inequalities, are under active study: see [1][2][3][4][5][6][7][8][9][10][11][12][13] and the references therein. Our paper is concerned with this kind of inequalities: we give new results, as well as new proofs or improvements of existing results, in the framework presented below. Let $(p_0(x), p_1(x), \ldots)$ be a probability distribution depending on a parameter $x \in I$, where $I$ is a real interval. The associated information potential (also called index of coincidence, for obvious probabilistic reasons) is defined as (see [14])
$$S(x) := \sum_{k=0}^{\infty} p_k^2(x), \quad x \in I. \qquad (1)$$
If $p(t, x)$, $t \in \mathbb{R}$, $x \in I$, is a probability density function depending on the parameter $x$, the associated information potential is defined as (see [14])
$$S(x) := \int_{\mathbb{R}} p^2(t, x)\, dt, \quad x \in I. \qquad (2)$$
The information potential is the core concept of the book [14], where the reader can find properties, extensions and generalizations of $S(x)$, as well as applications to information theoretic learning. Other properties and applications can be found in the recent papers [15,16].
It is important to remark that the Rényi entropy and the Tsallis entropy can be expressed in terms of $S(x)$ as
$$R(x) = -\log S(x), \quad T(x) = 1 - S(x), \quad x \in I. \qquad (3)$$
So the properties of S(x) lead immediately to properties of R(x), respectively T(x).
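To make these quantities concrete, here is a minimal Python sketch (the function names are ours, not from the paper) evaluating $S$, $R$, and $T$ for a finite discrete distribution, taking the binomial distribution as an example.

```python
import math

def information_potential(p):
    """Information potential: S = sum of squared probabilities."""
    return sum(q * q for q in p)

def renyi_entropy_2(p):
    """Renyi entropy of order 2: R = -log S."""
    return -math.log(information_potential(p))

def tsallis_entropy_2(p):
    """Tsallis entropy of order 2: T = 1 - S."""
    return 1.0 - information_potential(p)

# Example: binomial distribution with n = 4, x = 0.3
n, x = 4, 0.3
p = [math.comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n + 1)]
print(information_potential(p), renyi_entropy_2(p), tsallis_entropy_2(p))
```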
On the other hand, we can consider the discrete positive linear operators
$$(L f)(x) = \sum_{k=0}^{\infty} p_k(x) f(x_k), \qquad (4)$$
where the $x_k$ are given points in $\mathbb{R}$, and the integral operators
$$(L f)(x) = \int_{\mathbb{R}} p(t, x) f(t)\, dt. \qquad (5)$$
In both cases, $f$ belongs to a suitable set of functions defined on $\mathbb{R}$. In this paper, we consider classical operators of this kind, which are used in approximation theory.
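As an illustration of the discrete form (4), the classical Bernstein operator uses the binomial weights and the nodes $x_k = k/n$; a small sketch (our naming):

```python
import math

def bernstein(f, n, x):
    """Bernstein operator, a positive linear operator of the form (4):
    (L f)(x) = sum_k p_{n,k}(x) f(k/n), with p_{n,k}(x) = C(n,k) x^k (1-x)^(n-k)."""
    return sum(math.comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

# For large n, (L f)(x) approaches f(x); here f(t) = t^2 at x = 0.4:
print(bernstein(lambda t: t * t, 50, 0.4))  # ~0.1648, close to 0.16
```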
Let us mention that the "degree of nonmultiplicativity" of the operator L can be estimated in terms of the information potential S(x): see [17] and the references therein.
In this paper, we will be concerned with a special family of discrete probability distributions, described as follows. Let $c \in \mathbb{R}$ and
$$I_c := \begin{cases} [0, +\infty), & c \geq 0, \\ [0, -1/c], & c < 0. \end{cases}$$
For $\alpha \in \mathbb{R}$ and $k \in \mathbb{N}_0$, the binomial coefficients are defined as usual by
$$\binom{\alpha}{k} := \frac{\alpha (\alpha - 1) \cdots (\alpha - k + 1)}{k!}, \quad \binom{\alpha}{0} := 1.$$
Let $n > 0$ be a real number, $k \in \mathbb{N}_0$ and $x \in I_c$. Define
$$p_{n,k}^{[c]}(x) := (-1)^k \binom{-n/c}{k} (cx)^k (1 + cx)^{-\frac{n}{c} - k}, \quad c \neq 0, \qquad (6)$$
$$p_{n,k}^{[0]}(x) := \lim_{c \to 0} p_{n,k}^{[c]}(x) = e^{-nx} \frac{(nx)^k}{k!}. \qquad (7)$$
Then $\sum_{k=0}^{\infty} p_{n,k}^{[c]}(x) = 1$. Suppose that $n > c$ if $c \geq 0$, or $n = -cl$ with some $l \in \mathbb{N}$ if $c < 0$. With this notation, we consider the discrete probability distribution $\left(p_{n,k}^{[c]}(x)\right)_{k \geq 0}$. According to (1), the associated information potential, or index of coincidence, is
$$S_{n,c}(x) := \sum_{k=0}^{\infty} \left(p_{n,k}^{[c]}(x)\right)^2, \quad x \in I_c. \qquad (8)$$
The Rényi entropy and the Tsallis entropy corresponding to the same probability distribution are defined, respectively, by (see (3))
$$R_{n,c}(x) = -\log S_{n,c}(x) \qquad (9)$$
and
$$T_{n,c}(x) = 1 - S_{n,c}(x). \qquad (10)$$
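A hedged numerical sketch of the family (6)-(8) (our code; the infinite series is truncated at a parameter kmax, adequate for moderate values of $nx$) can be used to verify that the weights sum to 1 and to evaluate $S_{n,c}(x)$:

```python
import math

def binom_general(alpha, k):
    """Generalized binomial coefficient C(alpha, k) for real alpha."""
    out = 1.0
    for i in range(k):
        out *= (alpha - i) / (i + 1)
    return out

def p_nk(c, n, k, x):
    """Weights p^{[c]}_{n,k}(x) from (6); c = 0 is the Poisson case (7)."""
    if c == 0:
        return math.exp(-n * x) * (n * x)**k / math.factorial(k)
    return (-1)**k * binom_general(-n / c, k) * (c * x)**k * (1 + c * x)**(-n / c - k)

def S(c, n, x, kmax=150):
    """Information potential S_{n,c}(x) from (8), truncated at kmax."""
    return sum(p_nk(c, n, k, x)**2 for k in range(kmax + 1))

print(sum(p_nk(-1, 10, k, 0.3) for k in range(11)))  # binomial case: ~1.0
print(S(-1, 10, 0.3), S(0, 10, 0.3), S(1, 10, 0.3))
```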
For $c = -1$, (6) reduces to the binomial distribution, $p_{n,k}^{[-1]}(x) = \binom{n}{k} x^k (1-x)^{n-k}$, $k = 0, \ldots, n$, and (8) becomes
$$S_{n,-1}(x) = \sum_{k=0}^{n} \binom{n}{k}^2 x^{2k} (1-x)^{2(n-k)}. \qquad (11)$$
The case $c = 0$ corresponds to the Poisson distribution (see (7)), for which
$$S_{n,0}(x) = e^{-2nx} \sum_{k=0}^{\infty} \frac{(nx)^{2k}}{(k!)^2}. \qquad (12)$$
For $c = 1$, we have the negative binomial distribution, with
$$S_{n,1}(x) = \sum_{k=0}^{\infty} \binom{n+k-1}{k}^2 x^{2k} (1+x)^{-2(n+k)}. \qquad (13)$$
The binomial, Poisson, and negative binomial distributions correspond, respectively, to the classical Bernstein, Szász-Mirakyan, and Baskakov operators from approximation theory; all of them are of the form (4). In fact, the distribution $\left(p_{n,k}^{[c]}(x)\right)_{k \geq 0}$ is instrumental in the construction of the family of positive linear operators introduced by Baskakov in [18]; see also [19][20][21][22][23]. As a probability distribution, the family of functions $(p_{n,k}^{[c]})_{k=0,1,\ldots}$ was considered in [17,24]. The distribution
$$\left(\binom{n}{k} \frac{x^k}{(1+x)^n}\right)_{k=0,\ldots,n} \qquad (14)$$
corresponds to the Bleimann-Butzer-Hahn operators, while
$$\left(\binom{n+k}{k} x^k (1-x)^{n+1}\right)_{k \geq 0} \qquad (15)$$
is connected with the Meyer-König and Zeller operators. The information potentials and the entropies associated with all these distributions were studied in [17]; see also [25][26][27]. It should be mentioned that they satisfy Heun-type differential equations: see [17]. We continue this study here. To keep the same notation as in [17], let us return to (11)-(13) and denote
$$F_n(x) := S_{n,-1}(x), \quad G_n(x) := S_{n,1}(x), \quad K_n(x) := S_{n,0}(x).$$
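The index-of-coincidence interpretation can be checked directly: $F_n(x)$ equals the probability that two independent binomial variables with the same parameters take the same value. A short self-contained sketch (our code):

```python
import math

def F(n, x):
    """F_n(x) = S_{n,-1}(x), the explicit sum (11)."""
    return sum(math.comb(n, k)**2 * x**(2*k) * (1 - x)**(2*(n - k))
               for k in range(n + 1))

# F_n(x) as an index of coincidence: P(X = Y) for independent X, Y ~ B(n, x)
n, x = 6, 0.35
p = [math.comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n + 1)]
print(F(n, x), sum(q * q for q in p))  # the two values agree
```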
Moreover, the information potentials corresponding to (14) and (15) will be denoted, respectively, by
$$U_n(x) := \frac{1}{(1+x)^{2n}} \sum_{k=0}^{n} \binom{n}{k}^2 x^{2k} \qquad (16)$$
and
$$J_n(x) := (1-x)^{2(n+1)} \sum_{k=0}^{\infty} \binom{n+k}{k}^2 x^{2k}. \qquad (17)$$
In Section 2, we present several relations between the functions $F_n(x)$, $G_n(x)$, $U_n(x)$, $J_n(x)$, as well as between these functions and the Legendre polynomials. By using the three-term recurrence relation satisfied by the Legendre polynomials, we establish recurrence relations involving three consecutive terms of the sequences $(F_n(x))$, $(G_n(x))$, $(U_n(x))$, and $(J_n(x))$, respectively. We also recall some explicit expressions of these functions. Section 3 is devoted to inequalities between consecutive terms of the above sequences; in particular, we emphasize that, for fixed $x$, the four sequences are logarithmically convex and hence convex.
Other inequalities are presented in Section 4. All the inequalities can be used to get information about the Rényi entropies and Tsallis entropies connected with the corresponding probability distributions.
Section 5 contains new properties of the function $U_n(x)$ and a problem concerning its shape. Section 6 is devoted to some inequalities involving integrals of the form $\int_a^b \left(f'(x)\right)^2 dx$, in relation with certain combinatorial identities.
The information potential associated with the Durrmeyer density of probability is computed in Section 7. We recall a conjecture formulated in [24].
As already mentioned, all the results involving the information potential can be used to derive results about the Rényi and Tsallis entropies. For the sake of brevity, we will usually study only the information potential.

Inequalities for Information Potentials
In studying a sequence of special functions, not only are recurrence relations important, but also inequalities connecting successive terms; in particular, inequalities showing that the sequence is (logarithmically) convex or concave. This section is devoted to such inequalities involving the sequences (F n (x)), (G n (x)), (U n (x)), and (J n (x)).
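Before stating the inequalities, here is a quick numerical probe (our sketch, using the explicit formula (11)) of the logarithmic convexity of $(F_n(x))_{n \geq 0}$ for fixed $x$:

```python
import math

def F(n, x):
    """F_n(x) from (11)."""
    return sum(math.comb(n, k)**2 * x**(2*k) * (1 - x)**(2*(n - k))
               for k in range(n + 1))

# logarithmic convexity in n means F_{n+1}(x)^2 <= F_n(x) * F_{n+2}(x)
x = 0.3
for n in range(8):
    print(n, F(n + 1, x)**2 <= F(n, x) * F(n + 2, x))  # expect True
```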
Combining (37) with (9) and (10), we obtain:

Corollary 1. The Rényi entropy $R_n(x)$ and the Tsallis entropy $T_n(x)$ corresponding to the binomial distribution with parameters $n$ and $x$ satisfy the inequalities (42) and (43).

Theorem 3. The inequalities (44)-(52), involving $U_n$, $G_n$, and $J_n$, hold.

Proof. The proof is similar to that of Theorem 2, starting from the integral representations of $U_n$, $G_n$, and $J_n$ (see [17], (48), (58), (63)). These integral representations, together with the representation of $F_n(x)$ given by (39), are consequences of the important results of Elena Berdysheva ([19], Theorem 1).
From (44)-(52), we can derive inequalities similar to (42) and (43) for the entropies associated with the probability distributions corresponding to $U_n(x)$, $G_n(x)$, and $J_n(x)$.

Other Inequalities
Besides being of interest in their own right, the next Theorems 4 and 6 will be instrumental in establishing new lower and upper bounds for the information potentials $(F_n(x))_{n \geq 0}$, $(U_n(x))_{n \geq 0}$, $(G_n(x))_{n \geq 0}$, $(J_n(x))_{n \geq 0}$, and consequently for the associated Rényi and Tsallis entropies.
Let us return to the information potential (8). According to ([17], (10)), we have, for $c \neq 0$,
$$S_{n,c}(x) = \frac{1}{\pi} \int_0^{\pi} \left(1 + 4cx(1+cx)\sin^2\frac{t}{2}\right)^{-n/c} dt. \qquad (53)$$
Let $c < 0$. Using (53) and Chebyshev's inequality for synchronous functions, we can write
$$S_{n,c}(x) \geq \left(\frac{1}{\pi} \int_0^{\pi} \left(1 + 4cx(1+cx)\sin^2\frac{t}{2}\right) dt\right)^{-n/c} = \left(1 + 2cx(1+cx)\right)^{-n/c}.$$
For $c > 0$, we use Chebyshev's inequality for asynchronous functions and obtain the reverse inequality. So we have proved:

Theorem 4. If $c < 0$, then $S_{n,c}(x) \geq \left(1 + 2cx(1+cx)\right)^{-n/c}$; if $c > 0$, the inequality is reversed. In particular, for $c = -1$,
$$F_n(x) \geq \left(1 - 2x(1-x)\right)^n \geq |2x-1|^n, \quad x \in [0,1]. \qquad (59)$$
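For the reader's convenience, we recall the classical inequality invoked above; this is the standard statement, reproduced here only as a reference point.

```latex
% Chebyshev's integral inequality. If f, g : [a,b] -> R are synchronous
% (both nondecreasing or both nonincreasing), then
\[
  \frac{1}{b-a}\int_a^b f(t)\,g(t)\,dt \;\geq\;
  \left(\frac{1}{b-a}\int_a^b f(t)\,dt\right)
  \left(\frac{1}{b-a}\int_a^b g(t)\,dt\right).
\]
% For asynchronous f, g (one nondecreasing, the other nonincreasing)
% the inequality is reversed.
```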
Consider now the lower bound (63) for $F_n(x)$, namely
$$F_n(x) \geq |2x - 1|^n, \qquad (63)$$
and compare it with the first inequality in (59). According to ([17], (4.6), (4.2)),
$$F_n(x) = \sum_{j=0}^{n} c_{n,j} (2x-1)^{2j},$$
where (see also (87) and (84))
$$c_{n,j} := 4^{-n} \binom{2j}{j} \binom{2n-2j}{n-j}, \quad j = 0, \ldots, n.$$
Since $c_{n,j} > 0$, $\sum_{j=0}^{n} c_{n,j} = 1$ and $\sum_{j=0}^{n} j\, c_{n,j} = n/2$, using the weighted arithmetic mean-geometric mean inequality we have
$$F_n(x) = \sum_{j=0}^{n} c_{n,j} (2x-1)^{2j} \geq \prod_{j=0}^{n} |2x-1|^{2j c_{n,j}} = |2x-1|^n,$$
and this is (63). Clearly, the first inequality in (59) provides a lower bound for $F_n(x)$ which is better than the lower bound provided by (63), since $1 - 2x(1-x) = \frac{1 + (2x-1)^2}{2} \geq |2x-1|$.

Theorem 6.
The information potential satisfies, for all $c \in \mathbb{R}$,
$$S_{m+n,c}(x) \geq S_{m,c}(x)\, S_{n,c}(x). \qquad (64)$$
Proof. If $c \neq 0$, we can use (53) to get
$$S_{m+n,c}(x) = \frac{1}{\pi} \int_0^{\pi} \left(1 + 4cx(1+cx)\sin^2\frac{t}{2}\right)^{-m/c} \left(1 + 4cx(1+cx)\sin^2\frac{t}{2}\right)^{-n/c} dt.$$
Applying Chebyshev's inequality for synchronous functions to the two factors of the integrand, we obtain (64). For $c = 0$, we have (see [17], (13))
$$S_{n,0}(x) = \frac{1}{\pi} \int_0^{\pi} e^{-4nx \sin^2 \frac{t}{2}}\, dt.$$
With the same Chebyshev inequality, one obtains (64).
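Theorem 6 can be probed numerically. The following sketch (our code; the weights are generated by their ratio recurrence to avoid large intermediate numbers) tests $S_{m+n,c}(x) \geq S_{m,c}(x) S_{n,c}(x)$ for several values of $c$:

```python
import math

def S(c, n, x, kmax=1500):
    """Information potential S_{n,c}(x) from (8); the weights p^{[c]}_{n,k}(x)
    are built from the ratio p_{k+1}/p_k to avoid overflow."""
    p = math.exp(-n * x) if c == 0 else (1 + c * x)**(-n / c)   # k = 0 weight
    total = p * p
    for k in range(kmax):
        if c == 0:
            p *= n * x / (k + 1)
        else:
            p *= (n / c + k) / (k + 1) * c * x / (1 + c * x)
        total += p * p
    return total

# test of (64): S_{m+n,c}(x) >= S_{m,c}(x) * S_{n,c}(x)
m, n = 3, 4
for c, x in [(-1, 0.3), (0, 0.5), (1, 0.8)]:
    print(c, S(c, m + n, x) >= S(c, m, x) * S(c, n, x))  # expect True
```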
From Theorem 6, we derive:

Corollary 3. For the Rényi entropy $R_{n,c}(x)$ and the Tsallis entropy $T_{n,c}(x)$, we have
$$R_{m+n,c}(x) \leq R_{m,c}(x) + R_{n,c}(x) \qquad (65)$$
and
$$T_{m+n,c}(x) \leq T_{m,c}(x) + T_{n,c}(x). \qquad (66)$$

Remark 5. The inequalities (65) and (66) express the subadditivity of the sequences $(R_n(x))_{n \geq 0}$ and $(T_n(x))_{n \geq 0}$.

Remark 6.
From (64) with $c = -1$, we obtain
$$F_{m+n}(x) \geq F_m(x)\, F_n(x). \qquad (67)$$
Here is a probabilistic proof of this inequality. Let $X_m$, $X_n$, $Y_m$, $Y_n$ be independent binomial random variables with sizes $m$, $n$, $m$, $n$, respectively, and the same parameter $x \in [0, 1]$. Since the index of coincidence $F_n(x)$ is the probability that two independent copies coincide, and since $X_m + X_n$ and $Y_m + Y_n$ are binomial with size $m+n$ and parameter $x$, we have
$$F_{m+n}(x) = P(X_m + X_n = Y_m + Y_n) \geq P(X_m = Y_m, \; X_n = Y_n),$$
and consequently
$$F_{m+n}(x) \geq P(X_m = Y_m)\, P(X_n = Y_n) = F_m(x)\, F_n(x),$$
and this proves (67). It would be useful to have purely probabilistic proofs of other inequalities in this specific framework; they would facilitate a deeper understanding of the interplay between analytic and probabilistic proofs and results.
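The probabilistic argument above can also be illustrated by simulation; a Monte Carlo sketch (our code, sample sizes arbitrary):

```python
import random

def estimate(m, n, x, trials=100_000):
    """Monte Carlo estimates of P(X_m + X_n = Y_m + Y_n) and of
    P(X_m = Y_m) * P(X_n = Y_n), for independent binomial variables."""
    binom = lambda size: sum(random.random() < x for _ in range(size))
    hits_sum = hits_m = hits_n = 0
    for _ in range(trials):
        xm, xn, ym, yn = binom(m), binom(n), binom(m), binom(n)
        hits_sum += (xm + xn == ym + yn)
        hits_m += (xm == ym)
        hits_n += (xn == yn)
    return hits_sum / trials, (hits_m / trials) * (hits_n / trials)

lhs, rhs = estimate(3, 5, 0.4)
print(lhs >= rhs, lhs, rhs)  # illustrates F_{m+n}(x) >= F_m(x) F_n(x), cf. (67)
```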
Inequalities similar to (67) hold for $G_n(x)$ (apply (64) with $c = 1$) and for $U_n(x)$ and $J_n(x)$. Indeed, according to (19), one obtains for all $x \in [0, 1)$ an inequality which implies (68), the analogue of (67) for $U_n$. From (68), multiplying by $\frac{1-t}{(1+t)^2}$ and using (33), we get (69), the analogue for $J_n$. Let us remark that the inequality (69) is stronger than the similar inequalities for $F_n$, $G_n$ and $U_n$.
In particular,
$$F_n(x) \geq F_{n-1}(x)\, F_1(x) \geq \left(F_1(x)\right)^n, \quad n \geq 1. \qquad (70)$$
Proof. Starting from (67), it suffices to use induction on $n$.
Similar results hold for $G_n(x)$, $U_n(x)$, $J_n(x)$, but we omit the details. However, let us remark that $F_1(x) = 1 - 2x(1-x)$, and so the second inequality in (70) is precisely the first inequality in (59).
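A direct check of (70) (our sketch, again via (11)):

```python
import math

def F(n, x):
    """F_n(x) from (11)."""
    return sum(math.comb(n, k)**2 * x**(2*k) * (1 - x)**(2*(n - k))
               for k in range(n + 1))

# second inequality in (70): F_n(x) >= (F_1(x))^n = (1 - 2x(1-x))^n
x = 0.25
for n in range(1, 10):
    print(n, F(n, x) >= (1 - 2 * x * (1 - x))**n)  # expect True
```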

More about $U_n(t)$
This section contains some additional properties of the function $U_n$ defined initially by (16). Using the simple relation (33) connecting $J_n$ and $U_n$, one can easily derive new properties of the function $J_n$ given by (17).
Remark 8. $U_n(0) = \lim_{t \to +\infty} U_n(t) = 1$ (see (31)). These equalities, Theorem 7, and graphical experiments (see Figure 1) suggest that $U_n$ is convex on $[0, t_n]$ and concave on $[t_n, +\infty)$, for a suitable $t_n > 1$. It would be interesting to have a proof of this shape of $U_n(t)$, and to find the value of $t_n$; a rough numerical estimate is sketched after Remark 9. In order to compute $U_n(x)$, we have the explicit expressions (16) and (31), and the three-term recurrence relation (29). In what follows, we provide a two-term recurrence relation.

Remark 9.
A recurrence relation similar to (75), defining a sequence of Appell polynomials, was instrumental in ([31], Section 5) for studying the function $F_n$.
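Returning to the shape question raised in Remark 8, a rough numerical localization of the convex-to-concave switch point $t_n$ can be obtained from second differences of $U_n$; a sketch (our code, step sizes arbitrary):

```python
import math

def U(n, t):
    """U_n(t) from (16): the Bleimann-Butzer-Hahn information potential."""
    return sum(math.comb(n, k)**2 * t**(2 * k)
               for k in range(n + 1)) / (1 + t)**(2 * n)

def t_switch(n, hi=20.0, steps=20000, h=1e-3):
    """First grid point where the second difference of U_n changes sign."""
    prev = None
    for i in range(1, steps):
        t = hi * i / steps
        d2 = U(n, t + h) - 2 * U(n, t) + U(n, t - h)
        if prev is not None and prev > 0 > d2:
            return t
        prev = d2
    return None

print(t_switch(5))  # numerical estimate of t_5; per Remark 8, t_n > 1
```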

Inequalities for the Integral of the Squared Derivative
Integrals of the form $\int_a^b \left(f'(x)\right)^2 dx$ are important in several applications; see, e.g., ([33], Section 3.10). In this section, we present bounds for such integrals using the logarithmic convexity of the functions $F_n$, $G_n$, $K_n$. The results involve some combinatorial identities.

Theorem 9. The inequalities (77)-(79) are valid for $n = 0, 1, \ldots$.

Proof. Let us return to (71). Integrating by parts, we obtain (80). Remembering that $F_n(x) = S_{n,-1}(x)$ and using (11), we obtain (81). Now (77) is a consequence of (80) and (81). The logarithmic convexity of the functions $G_{n+1}$ and $K_n$ on $[0, \infty)$ was proved in [34]. Using $G_n(x) = S_{n,1}(x)$ and (13), it is easy to derive (82) and (83), and from (82) and (83) we obtain (78). The proof of (79) is similar and we omit it.
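Although the explicit bounds (77)-(79) are not reproduced here, integrals of this type are easy to approximate numerically; a quadrature sketch for $\int_0^1 (F_n'(x))^2 dx$ (our code):

```python
import math

def F(n, x):
    """F_n(x) from (11)."""
    return sum(math.comb(n, k)**2 * x**(2*k) * (1 - x)**(2*(n - k))
               for k in range(n + 1))

def int_sq_deriv(n, m=2000):
    """Midpoint-rule value of the integral of (F_n'(x))^2 over [0, 1],
    with F_n' approximated by central differences."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        d = (F(n, x + h / 2) - F(n, x - h / 2)) / h
        total += d * d * h
    return total

# for n = 1, F_1(x) = 1 - 2x(1-x) and the exact value is 4/3
print(int_sq_deriv(1), 4.0 / 3.0)
```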

Information Potential for the Durrmeyer Density of Probability
Consider the Durrmeyer operators
$$M_n f(x) := (n+1) \sum_{k=0}^{n} p_{n,k}(x) \int_0^1 p_{n,k}(t) f(t)\, dt, \quad x \in [0, 1],$$
where $p_{n,k}(x) = \binom{n}{k} x^k (1-x)^{n-k}$; see, e.g., [36]. They are of the form (5), with the Durrmeyer density of probability
$$p(t, x) = (n+1) \sum_{k=0}^{n} p_{n,k}(x)\, p_{n,k}(t), \quad t, x \in [0, 1].$$
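Assuming the classical Durrmeyer kernel above, the associated information potential $S(x) = \int_0^1 p^2(t, x)\, dt$ can be computed exactly via Beta integrals; a sketch (our code):

```python
import math

def durrmeyer_potential(n, x):
    """S(x) = integral over [0,1] of p(t,x)^2 for the Durrmeyer density,
    using int_0^1 t^a (1-t)^b dt = a! b! / (a+b+1)!."""
    b = [math.comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n + 1)]
    total = 0.0
    for k in range(n + 1):
        for l in range(n + 1):
            beta = (math.factorial(k + l) * math.factorial(2 * n - k - l)
                    / math.factorial(2 * n + 1))
            total += b[k] * b[l] * math.comb(n, k) * math.comb(n, l) * beta
    return (n + 1)**2 * total

for x in (0.1, 0.3, 0.5):
    print(x, durrmeyer_potential(8, x))  # S(x) >= 1, by Cauchy-Schwarz
```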
The conjecture formulated in [24] concerns this information potential; the following numerical and graphical experiments support it (see Table 1 and Figure 2).

Concluding Remarks

Besides the distributions associated with the classical Bernstein, Szász-Mirakyan, and Baskakov operators, we consider in our paper, from an analytic point of view, the distributions associated with the Bleimann-Butzer-Hahn, Meyer-König and Zeller, and Durrmeyer operators. Their study from a probabilistic perspective is deferred to a future paper. Another possible direction of further research is to investigate with our methods the distributions associated with other classical or more recent sequences of positive linear operators.
The information potentials $F_n$, $G_n$, $U_n$, and $J_n$ have strong relations with the Legendre polynomials. Quite naturally, the recurrence relations satisfied by these polynomials yield similar relations for the information potentials. It should be mentioned that the differential equation characterizing the Legendre polynomials was used in [17] in order to show that $F_n$, $G_n$, $U_n$ and $J_n$ satisfy Heun-type differential equations, and consequently to obtain bounds for them. Other bounds are obtained in this paper, starting from the important integral representations given in [19]. They can be compared with other bounds from the literature, and this is another possible topic for further research.
For fixed $n$, the convexity and even the logarithmic convexity of the function $F_n(x)$ were established in [17,27,31,32,34]. In this paper, we prove that, for fixed $x$, the sequence $(F_n(x))_{n \geq 0}$ is logarithmically convex. Similar results hold for the other information potentials, and they have consequences concerning the associated entropies. We think that this direction of research can be continued and developed.
Two conjectures, accompanied by graphical experiments supporting them, are mentioned in our paper.
Author Contributions: These authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.