Abstract
In this paper we give a new refinement of the Lah–Ribarič inequality and, using the same technique, a refinement of the Jensen inequality. Using these results, we obtain a refinement of the discrete Hölder inequality and refinements of some inequalities for discrete weighted power means and discrete weighted quasi-arithmetic means. We also give applications in information theory; namely, we give some interesting estimates for the discrete Csiszár divergence and for its important special cases.
Keywords:
Lah–Ribarič inequality; Jensen inequality; Hölder’s inequality; power mean; quasi-arithmetic mean; Csiszár divergence; Zipf–Mandelbrot law
MSC:
26D15; 94A15
1. Introduction
Research on classical inequalities, such as the Jensen and Hölder inequalities, has expanded greatly. These inequalities first appeared in discrete and integral forms, and many generalizations and improvements have since been proved (for instance, see [1,2]). More recently, they have proven to be very useful in information theory (for instance, see [3]).
Let I be an interval in $\mathbb{R}$ and $f\colon I\to\mathbb{R}$ a convex function. If $x=(x_1,\dots,x_n)$ is any n-tuple in $I^n$ and $p=(p_1,\dots,p_n)$ a nonnegative n-tuple such that $P_n=\sum_{i=1}^{n}p_i>0$, then the well-known Jensen inequality
$$ f\Big(\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i\Big)\le\frac{1}{P_n}\sum_{i=1}^{n}p_i f(x_i) \qquad (1) $$
holds (see [4,5] or, for example, [6] (p. 43)). If f is strictly convex, then (1) is strict unless all the $x_i$ with $p_i>0$ are equal.
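As a quick numerical illustration of (1) (this sketch is ours, not from the paper; the convex function, weights, and nodes are hypothetical), one can compare both sides of Jensen's inequality directly:

```python
# Numerical check of Jensen's inequality (1) for the convex function f(t) = t^2.
def jensen_sides(f, x, p):
    """Return (f(weighted mean of x), weighted mean of f(x)) for nonnegative weights p."""
    P = sum(p)
    mean_x = sum(pi * xi for pi, xi in zip(p, x)) / P
    return f(mean_x), sum(pi * f(xi) for pi, xi in zip(p, x)) / P

lhs, rhs = jensen_sides(lambda t: t * t, x=[0.5, 1.0, 3.0], p=[1.0, 2.0, 1.0])
assert lhs <= rhs  # f((1/P_n) * sum p_i x_i) <= (1/P_n) * sum p_i f(x_i)
```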
Jensen’s inequality is one of the most famous inequalities in convex analysis, and several other well-known inequalities (such as Hölder’s inequality, the A–G–H inequality, etc.) are its special cases. Besides mathematics, it has many applications in statistics, information theory, and engineering.
Strongly related to Jensen’s inequality is the Lah–Ribarič inequality (see [7]),
$$ \frac{1}{P_n}\sum_{i=1}^{n}p_i f(x_i)\le\frac{M-\bar{x}}{M-m}\,f(m)+\frac{\bar{x}-m}{M-m}\,f(M),\qquad \bar{x}=\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i, \qquad (2) $$
which holds when f is a convex function on $[m,M]$ with $m<M$, $p$ is as in (1), and $x$ is any n-tuple in $[m,M]^n$. If f is strictly convex, then (2) is strict unless $x_i\in\{m,M\}$ for all i with $p_i>0$.
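Similarly, the Lah–Ribarič bound (2) can be checked numerically; the following sketch (our illustration, with hypothetical data on [m, M] = [0, 3]) computes the weighted mean of f and the chord-based upper bound:

```python
# Numerical check of the Lah-Ribaric inequality (2) for f(t) = t^2 on [m, M].
def lah_ribaric_sides(f, x, p, m, M):
    """Return (weighted mean of f(x), Lah-Ribaric upper bound) for x_i in [m, M]."""
    P = sum(p)
    xbar = sum(pi * xi for pi, xi in zip(p, x)) / P
    mean_f = sum(pi * f(xi) for pi, xi in zip(p, x)) / P
    bound = (M - xbar) / (M - m) * f(m) + (xbar - m) / (M - m) * f(M)
    return mean_f, bound

mean_f, bound = lah_ribaric_sides(lambda t: t * t, x=[0.5, 1.0, 3.0], p=[1.0, 2.0, 1.0],
                                  m=0.0, M=3.0)
assert mean_f <= bound
```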
The Lah–Ribarič inequality has been extensively investigated, and the interested reader can find many related results in the recent literature as well as in monographs such as [6,8,9]. It is therefore of interest to find further refinements of this inequality.
Our main result is a refinement of inequality (2).
Using the same technique, we also give a refinement of inequality (1) (see [10]).
In addition, we deal with the notion of f-divergences, which measure the distance between two probability distributions. One of the most important is the Csiszár f-divergence, whose special cases include the Shannon entropy, Jeffrey’s distance, the Kullback–Leibler divergence, the Hellinger distance, and the Bhattacharyya distance. We deduce relations for these f-divergences.
Let us say a few words about the organization of the paper. In the following section we give a new refinement of the Lah–Ribarič inequality and state a known refinement of the Jensen inequality obtained by the same technique. Using the obtained results, we give a refinement of the famous Hölder inequality and some new refinements of inequalities for the weighted power means and quasi-arithmetic means. In addition, we give a historical remark regarding the Jensen–Boas inequality. In Section 3, we give the results for various f-divergences. These are further examined for the Zipf–Mandelbrot law.
2. New Refinements
The starting point of this consideration is the following lemma (see [11]).
Lemma 1.
Let f be a convex function on an interval I. If such that , then the inequality
holds for any .
The main result is a refinement of the Lah–Ribarič inequality (2). As we will see, its proof is based on the idea from the proof of the Jensen–Boas inequality.
Theorem 1.
Let be a convex function on , , is as in (1), be any n-tuple in and Let where for , , , for and , , for . Then
holds, where
Proof.
We have
Using the Lah–Ribarič inequality (2) for each of the subsets , we obtain
Using , and Lemma 1, we obtain
□
Remark 1.
If , the related term in the sum on the right-hand side of the first inequality in the proof of Theorem 1 remains unaltered (i.e., is equal to ).
Using the same technique, we obtain the following refinement of the Jensen inequality (1).
Theorem 2.
Let I be an interval in $\mathbb{R}$ and $f\colon I\to\mathbb{R}$ a convex function. Let $x=(x_1,\dots,x_n)$ be any n-tuple in $I^n$ and $p=(p_1,\dots,p_n)$ a nonnegative n-tuple such that $P_n=\sum_{i=1}^{n}p_i>0$. Let $\{J_1,\dots,J_k\}$ be a partition of $\{1,\dots,n\}$, where $P_{J_l}=\sum_{i\in J_l}p_i>0$ for $l=1,\dots,k$ and $\sum_{l=1}^{k}P_{J_l}=P_n$. Then
$$ f\Big(\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i\Big)\le\frac{1}{P_n}\sum_{l=1}^{k}P_{J_l}\,f\Big(\frac{1}{P_{J_l}}\sum_{i\in J_l}p_i x_i\Big)\le\frac{1}{P_n}\sum_{i=1}^{n}p_i f(x_i) \qquad (4) $$
holds.
Proof.
We have
Using Jensen’s inequality (1), we obtain
which is (4). □
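A small numerical sketch of this refinement (our illustration, assuming the partition form stated in Theorem 2; the data and partition below are hypothetical) shows the three-term chain in (4):

```python
# Sketch of the partition refinement of Jensen's inequality:
# f(global mean) <= (1/P_n) * sum_l P_{J_l} * f(mean over J_l) <= (1/P_n) * sum_i p_i f(x_i).
def refined_jensen(f, x, p, partition):
    """partition is a list of index lists J_1, ..., J_k covering {0, ..., n-1}."""
    P = sum(p)
    lhs = f(sum(pi * xi for pi, xi in zip(p, x)) / P)
    mid = sum(sum(p[i] for i in J) * f(sum(p[i] * x[i] for i in J) / sum(p[i] for i in J))
              for J in partition) / P
    rhs = sum(pi * f(xi) for pi, xi in zip(p, x)) / P
    return lhs, mid, rhs

lhs, mid, rhs = refined_jensen(lambda t: t * t, [0.5, 1.0, 3.0, 4.0], [1.0, 2.0, 1.0, 1.0],
                               partition=[[0, 1], [2, 3]])
assert lhs <= mid <= rhs
```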
The idea behind the proof of our main result (and of the refinement of the Jensen inequality) can also be found in another well-known result (see [6] (pp. 55–60)).
In Jensen’s inequality there is the condition that $p$ is a nonnegative n-tuple such that $P_n>0$. In 1919, Steffensen proved the same inequality (1) under slightly relaxed conditions (see [12]).
Theorem 3
(Jensen–Steffensen). If $f$ is a convex function, $x=(x_1,\dots,x_n)$ is a real monotonic n-tuple such that $x_i\in I$ for $i=1,\dots,n$, and $p=(p_1,\dots,p_n)$ is a real n-tuple such that
$$ 0\le P_k\le P_n\quad(k=1,\dots,n),\qquad P_n>0,\qquad\text{where } P_k=\sum_{i=1}^{k}p_i, $$
then inequality (1) holds.
One of many generalizations of the Jensen inequality is the Riemann–Stieltjes integral form of the Jensen inequality.
Theorem 4
(the Riemann–Stieltjes form of Jensen’s inequality). Let $f\colon I\to\mathbb{R}$ be a continuous convex function, where I is the range of the continuous function $g\colon[a,b]\to\mathbb{R}$. The inequality
$$ f\left(\frac{\int_a^b g(t)\,d\lambda(t)}{\int_a^b d\lambda(t)}\right)\le\frac{\int_a^b f(g(t))\,d\lambda(t)}{\int_a^b d\lambda(t)} \qquad (5) $$
holds provided that λ is increasing, bounded and $\lambda(a)\ne\lambda(b)$.
Analogously, the integral form of the Jensen–Steffensen inequality is given.
Theorem 5
(The Jensen–Steffensen inequality). If $g$ is continuous and monotonic (either increasing or decreasing) and λ is either continuous or of bounded variation satisfying
$$ \lambda(a)\le\lambda(t)\le\lambda(b)\quad\text{for all }t\in[a,b],\qquad \lambda(b)>\lambda(a), $$
then (5) holds.
In 1970, Boas gave the integral analogue of Jensen–Steffensen’s inequality with slightly different conditions.
Theorem 6
(the Jensen–Boas inequality). If λ is continuous or of bounded variation satisfying
$$ \lambda(a)\le\lambda(x_1)\le\lambda(y_1)\le\lambda(x_2)\le\cdots\le\lambda(y_{n-1})\le\lambda(x_n)\le\lambda(b),\qquad \lambda(b)>\lambda(a), $$
for all $x_k\in(y_{k-1},y_k)$ (with $y_0=a$, $y_n=b$), and if $g$ is continuous and monotonic (either increasing or decreasing) in each of the intervals $(y_{k-1},y_k)$, then inequality (5) holds.
In 1982, J. Pečarić gave the following proof of the Jensen–Boas inequality.
Proof.
If with the notation
we have
Using Jensen’s inequality (1), we obtain
Using Jensen–Steffensen’s inequality (5) on each subinterval , we obtain
If , for some j, then on and we can easily prove that the Jensen–Boas inequality is valid. □
If we look at the previous proof, we see that the technique is the same as for our main result and the refinement of the Jensen inequality.
By using Theorem 2, we obtain the following refinement of the discrete Hölder inequality (see [13,14]).
Corollary 1.
Let such that . Let , such that . Then:
Proof.
We use Theorem 2 with . Then and from (4), we obtain
For the function from (7), we obtain
Multiplying with , and raising to the power of , we obtain
which is (6). □
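For orientation, the outer (classical) Hölder inequality that Corollary 1 refines can be verified numerically; the following sketch (our illustration with hypothetical data; the refined middle term of (6) is not reproduced here) compares the two classical sides for conjugate exponents:

```python
# Numerical check of the classical discrete Holder inequality for conjugate exponents p, q.
def holder_sides(a, b, p, q):
    """Return (sum a_i b_i, (sum a_i^p)^(1/p) * (sum b_i^q)^(1/q)); assumes 1/p + 1/q = 1."""
    lhs = sum(ai * bi for ai, bi in zip(a, b))
    rhs = sum(ai ** p for ai in a) ** (1 / p) * sum(bi ** q for bi in b) ** (1 / q)
    return lhs, rhs

lhs, rhs = holder_sides(a=[1.0, 2.0, 3.0], b=[0.5, 1.5, 1.0], p=3.0, q=1.5)
assert lhs <= rhs
```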
Corollary 2.
Using the same conditions as in the previous corollary for , , , we obtain
Proof.
First for . We use Theorem 2 with . Then and from (4), we obtain
For the function , we obtain
Multiplying with , and then with , we obtain
which is (8).
If , then , and the same result follows from symmetry (see comments in Corollary 1). □
It is interesting to show how the previously obtained results impact the study of the weighted discrete power means and the weighted discrete quasi-arithmetic means.
Let $x=(x_1,\dots,x_n)$ be a positive n-tuple, $p=(p_1,\dots,p_n)$ a nonnegative n-tuple with $P_n=\sum_{i=1}^{n}p_i>0$, and $r\in\mathbb{R}$. The weighted discrete power mean of order r is defined as
$$ M_r(x;p)=\begin{cases}\Big(\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i^{r}\Big)^{1/r}, & r\ne 0,\\ \Big(\prod_{i=1}^{n}x_i^{p_i}\Big)^{1/P_n}, & r=0.\end{cases} $$
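The following short Python sketch (our illustration; the function name is ours) computes the weighted discrete power mean and checks its well-known monotonicity in the order r on sample data:

```python
import math

# Weighted discrete power mean M_r(x; p); r = 0 gives the weighted geometric mean.
def power_mean(x, p, r):
    P = sum(p)
    if r == 0:
        return math.exp(sum(pi * math.log(xi) for pi, xi in zip(p, x)) / P)
    return (sum(pi * xi ** r for pi, xi in zip(p, x)) / P) ** (1 / r)

x, p = [1.0, 2.0, 4.0], [1.0, 1.0, 2.0]
# Power means are nondecreasing in the order r.
assert power_mean(x, p, -1) <= power_mean(x, p, 0) <= power_mean(x, p, 1) <= power_mean(x, p, 2)
```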
Using Theorem 2, we obtain the following inequalities for the weighted discrete power means. Note that the left-hand and right-hand sides of both inequalities are the same; only the mixed means in the middle, which constitute the refinement, change.
Corollary 3.
Let , , , . Let such that . Then
where , , , , for .
Proof.
We use Theorem 2 with for , , , , . From (4), we obtain
Substituting with , and then raising to the power , we obtain
which is (9).
Similarly, we use Theorem 2 with for , , , . We obtain
Substituting with , and then raising to the power , inequality (10) easily follows. Other cases follow similarly. □
Let I be an interval in $\mathbb{R}$. Let $x=(x_1,\dots,x_n)\in I^n$ and let $p=(p_1,\dots,p_n)$ be a nonnegative n-tuple with $P_n=\sum_{i=1}^{n}p_i>0$. Then, for a strictly monotone continuous function $\varphi\colon I\to\mathbb{R}$, the discrete weighted quasi-arithmetic mean is defined as
$$ M_\varphi(x;p)=\varphi^{-1}\Big(\frac{1}{P_n}\sum_{i=1}^{n}p_i\,\varphi(x_i)\Big). $$
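A minimal sketch of this definition (our illustration; names are ours): choosing φ(t) = t recovers the weighted arithmetic mean, while φ(t) = log t recovers the weighted geometric mean.

```python
import math

# Discrete weighted quasi-arithmetic mean M_phi(x; p) = phi^{-1}((1/P_n) * sum p_i phi(x_i)).
def quasi_arithmetic_mean(x, p, phi, phi_inv):
    P = sum(p)
    return phi_inv(sum(pi * phi(xi) for pi, xi in zip(p, x)) / P)

x, p = [1.0, 2.0, 4.0], [1.0, 1.0, 2.0]
arithmetic = quasi_arithmetic_mean(x, p, lambda t: t, lambda t: t)  # phi(t) = t
geometric = quasi_arithmetic_mean(x, p, math.log, math.exp)         # phi(t) = log t
assert geometric <= arithmetic  # AM-GM as a comparison of two quasi-arithmetic means
```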
Using Theorem 2, we obtain the following inequalities for quasi-arithmetic means.
Corollary 4.
Let I be an interval in . Let , , , . Let be a strictly monotone continuous function such that convex. Let where for , and . Then
where , , , , for .
Proof.
Theorem 2 with and gives
□
3. Applications in Information Theory
In this section we give basic results concerning the discrete Csiszár f-divergence. In addition, bounds for the divergence of the Zipf–Mandelbrot law are obtained.
Let us denote the set of all probability densities by $\mathbb{P}$, i.e., $p=(p_1,\dots,p_n)\in\mathbb{P}$ if $p_i\ge 0$ for $i=1,\dots,n$ and $\sum_{i=1}^{n}p_i=1$.
In [15], Csiszár introduced the f-divergence functional as
$$ D_f(p,q)=\sum_{i=1}^{n}q_i\,f\Big(\frac{p_i}{q_i}\Big), $$
where $f\colon[0,\infty)\to(-\infty,+\infty]$ is a convex function, and it represents a “distance function” on the set of probability distributions $\mathbb{P}$.
In order to use nonnegative probability distributions in the f-divergence functional, we assume, as usual,
$$ f(0)=\lim_{t\to 0^{+}}f(t),\qquad 0\,f\Big(\frac{0}{0}\Big)=0,\qquad 0\,f\Big(\frac{a}{0}\Big)=\lim_{t\to 0^{+}}t\,f\Big(\frac{a}{t}\Big)=a\lim_{t\to\infty}\frac{f(t)}{t}\quad(a>0),
$$
and the following definition of a generalized f-divergence functional is given.
Definition 1
(the Csiszár f-divergence functional). Let $I\subseteq\mathbb{R}$ be an interval, and let $f\colon I\to\mathbb{R}$ be a function. Let $p=(p_1,\dots,p_n)$ be an n-tuple of real numbers and $q=(q_1,\dots,q_n)$ be an n-tuple of nonnegative real numbers such that $\frac{p_i}{q_i}\in I$ for every $i=1,\dots,n$. The Csiszár f-divergence functional is defined as
$$ D_f(p,q)=\sum_{i=1}^{n}q_i\,f\Big(\frac{p_i}{q_i}\Big). $$
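A direct computational sketch of this functional (our illustration; the function name is ours, and strictly positive $q_i$ are assumed so that the quotients are defined):

```python
import math

# Csiszar f-divergence functional: sum_i q_i * f(p_i / q_i).
def csiszar_divergence(f, p, q):
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
# With the convex function f(t) = t*log(t), this is the Kullback-Leibler divergence.
print(csiszar_divergence(lambda t: t * math.log(t), p, q))
```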
Theorem 7.
Let I be an interval in and a convex function. Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that for every . Let where for , , and . Then
holds.
Proof.
Using Theorem 2 with and , we obtain
which is (13). □
Corollary 5.
If in the previous theorem we take $p$ and $q$ to be probability distributions, we directly obtain the following result:
Theorem 8.
Let be a convex function on , . Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that . Let where for , , , for and , , for . Then
holds.
Proof.
Using Theorem 1 with and , we obtain
which is (15). □
Corollary 6.
If, in the previous theorem, we take $p$ and $q$ to be probability distributions, we directly obtain the following result:
If $p=(p_1,\dots,p_n)$ and $q=(q_1,\dots,q_n)$ are probability distributions, the Kullback–Leibler divergence, also called relative entropy or KL divergence, is defined as
$$ D_{KL}(p,q)=\sum_{i=1}^{n}p_i\log\frac{p_i}{q_i}. $$
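A short sketch (our illustration with hypothetical distributions) computes the Kullback–Leibler divergence directly and via the Csiszár functional with f(t) = t log t:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / q_i) (terms with p_i = 0 contribute 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
direct = kl_divergence(p, q)
via_csiszar = sum(qi * (pi / qi) * math.log(pi / qi) for pi, qi in zip(p, q))  # f(t) = t*log(t)
assert abs(direct - via_csiszar) < 1e-12
```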
The next corollary provides bounds for the Kullback–Leibler divergence of two probability distributions.
Corollary 7.
Let where for , and .
- Let and be n-tuples of nonnegative real numbers. Then
- Let and be probability distributions. Then
Proof.
Let $p$ and $q$ be n-tuples of nonnegative real numbers. Since the function $t\mapsto t\log t$ is convex, the first inequality follows from Theorem 7 by setting $f(t)=t\log t$.
The second inequality is a special case of the first inequality for probability distributions $p$ and $q$. □
Corollary 8.
Let where for , and , for .
- Let and be n-tuples of nonnegative real numbers. Let , , and , for . Then
- Let and be probability distributions. Let , , and , for . Then
Proof.
Let $p$ and $q$ be n-tuples of nonnegative real numbers. Since the function $t\mapsto t\log t$ is convex, the first inequality follows from Theorem 8 by setting $f(t)=t\log t$.
The second inequality is a special case of the first inequality for probability distributions $p$ and $q$. □
Now we deduce the relations for some more special cases of the Csiszár f-divergence.
Definition 2
(the Shannon entropy). For a positive probability distribution $p=(p_1,\dots,p_n)\in\mathbb{P}$, the discrete Shannon entropy is defined as
$$ H(p)=-\sum_{i=1}^{n}p_i\log p_i. $$
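A short sketch (our illustration) computes the discrete Shannon entropy, together with the well-known identity $H(p)=\log n-D_{KL}(p,u)$ for the uniform distribution u:

```python
import math

def shannon_entropy(p):
    """Discrete Shannon entropy -sum_i p_i * log(p_i) (terms with p_i = 0 contribute 0)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = [0.2, 0.5, 0.3]
n = len(p)
kl_to_uniform = sum(pi * math.log(pi * n) for pi in p if pi > 0)  # D_KL(p, uniform)
assert abs(shannon_entropy(p) - (math.log(n) - kl_to_uniform)) < 1e-12
```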
Corollary 9.
Let . Let where for , and . Then
Proof.
Using Theorem 7 with and , we obtain
For inequality (17) follows. □
Corollary 10.
Let , , such that . Let where for , , , for and , , for . Then
holds.
Proof.
Using Theorem 8 with , and , we obtain
and (17) easily follows. □
Definition 3
(Jeffrey’s distance). For $p,q\in\mathbb{P}$, the discrete Jeffrey distance is defined as
$$ D_J(p,q)=\sum_{i=1}^{n}(p_i-q_i)\log\frac{p_i}{q_i}. $$
Corollary 11.
Let . Let where for , and . Then
Proof.
Using Corollary 5 with , we obtain
and (18) easily follows. □
Corollary 12.
Let , , such that . Let where for , , , for and , , for . Then
holds.
Proof.
Using Corollary 6 with , we obtain
and (19) easily follows. □
Definition 4
(the Hellinger distance). For $p,q\in\mathbb{P}$, the discrete Hellinger distance is defined as
$$ h^{2}(p,q)=\frac{1}{2}\sum_{i=1}^{n}\big(\sqrt{p_i}-\sqrt{q_i}\big)^{2}. $$
Corollary 13.
Let . Let where for , and . Then
Proof.
Using Corollary 5 with (20) follows. □
Corollary 14.
Let , , such that . Let where for , , , for and , , for . Then
holds.
Proof.
Using Corollary 6 with (21) follows. □
Definition 5
(Bhattacharyya distance). For $p,q\in\mathbb{P}$, the discrete Bhattacharyya distance is defined as
$$ B(p,q)=\sum_{i=1}^{n}\sqrt{p_i q_i}. $$
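The three distances from Definitions 3–5 are straightforward to compute; the following sketch (our illustration with hypothetical distributions, using the formulas as recalled above) evaluates them directly:

```python
import math

def jeffrey_distance(p, q):
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def hellinger_distance_sq(p, q):
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

def bhattacharyya_coefficient(p, q):
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

p, q = [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]
print(jeffrey_distance(p, q), hellinger_distance_sq(p, q), bhattacharyya_coefficient(p, q))
```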
Corollary 15.
Let . Let where for , and . Then
Proof.
Using Corollary 5 with (22) follows. □
Corollary 16.
Let , , such that . Let where for , , , for and , , for . Then
holds.
Proof.
Using Corollary 6 with (23) follows. □
Now we derive the results of Theorems 7 and 8 for the Zipf–Mandelbrot law.
The Zipf–Mandelbrot law is a discrete probability distribution defined by the following probability mass function:
$$ f(i;N,q,s)=\frac{1}{(i+q)^{s}H_{N,q,s}},\qquad i=1,\dots,N, $$
where
$$ H_{N,q,s}=\sum_{j=1}^{N}\frac{1}{(j+q)^{s}} $$
is a generalization of the harmonic number and $N\in\mathbb{N}$, $q\ge 0$, and $s>0$ are parameters.
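A minimal sketch (our illustration; the parameter values are hypothetical) generates a Zipf–Mandelbrot probability mass function and checks that it sums to one:

```python
# Zipf-Mandelbrot law: f(i; N, q, s) = 1 / ((i + q)^s * H_{N,q,s}), i = 1, ..., N.
def zipf_mandelbrot_pmf(N, q, s):
    H = sum(1.0 / (j + q) ** s for j in range(1, N + 1))  # generalized harmonic number H_{N,q,s}
    return [1.0 / ((i + q) ** s * H) for i in range(1, N + 1)]

pmf = zipf_mandelbrot_pmf(N=10, q=1.5, s=1.2)
assert abs(sum(pmf) - 1.0) < 1e-12  # the probabilities sum to 1
```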
If we define as a Zipf–Mandelbrot law M-tuple, we have
where
and the Csiszár functional becomes
where , and the parameters are such that .
If and are both defined as Zipf–Mandelbrot law M-tuples, then the Csiszár functional becomes
where , and the parameters are such that .
Now, from Theorem 7, we have the following result.
Corollary 17.
Let I be an interval in and a convex function. Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that for every . Let where for , . Suppose are such that , . Then
holds.
Proof.
If we define as a Zipf–Mandelbrot law n-tuple with parameters , then from Theorem 7 it follows
which is (24). □
From Theorem 8 we have the following result.
Corollary 18.
Let be a convex function on , . Let be an n-tuple of real numbers. Suppose are such that . Let where for , , , and , , for . Then
holds.
Proof.
If we define as a Zipf–Mandelbrot law n-tuple with parameters , then from Theorem 8 it follows
which is (25). □
Now, from Theorem 7, we also have the following result.
Corollary 19.
Let I be an interval in and a convex function. Let where for , . Suppose are such that , . Then
holds.
Proof.
If we define as a Zipf–Mandelbrot law n-tuples with parameters , then from Theorem 7, we obtain (26). □
From Theorem 8, we have the following result.
Corollary 20.
Let be a convex function on , . Suppose are such that . Let where for , , , and , , for . Then
holds.
Proof.
If we define as a Zipf–Mandelbrot law n-tuples with parameters , then from Theorem 8, we obtain (27). □
Since the minimal value for is and its maximal value is , from the right-hand side of (24) and the left-hand side of (25), we obtain the following result.
Corollary 21.
Let be a convex function on , . Let be an n-tuple of real numbers. Suppose are such that . Let where for , , , and , , for . Then
holds.
4. Conclusions
In this paper we have obtained a refinement of the Lah–Ribarič inequality and a refinement of the Jensen inequality, which follow from applying the Lah–Ribarič inequality and the Jensen inequality on disjoint subsets of $\{1,\dots,n\}$.
Using these results, we have obtained a refinement of the discrete Hölder inequality and refinements of some inequalities for the discrete weighted power means and the discrete weighted quasi-arithmetic means. In addition, some interesting estimates for the discrete Csiszár divergence and for its important special cases are obtained.
It would be interesting to see whether this method can be used to obtain refinements of some other inequalities. In addition, one could try to use this method to refine the Jensen inequality and the Lah–Ribarič inequality for operators.
Author Contributions
All authors jointly worked on the results. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Dragomir, S.S.; Adil Khan, M.; Abathun, A. Refinement of Jensen’s integral inequality. Open Math. 2016, 14, 221–228. [Google Scholar] [CrossRef] [Green Version]
- Jessen, B. Bemaerkinger om konvekse Funktioner og Uligheder imellem Middelvaerdier. I. Mat. Tidsskr. B 1931, 17–29. [Google Scholar]
- Merhav, N. Reversing Jensen’s Inequality for Information-Theoretic Analyses. Information 2022, 13, 39. [Google Scholar] [CrossRef]
- Jensen, J.L.W.V. Om konvexe funktioner og uligheder mellem Middelvaerdier. Nyt. Tidsskr. Math. 1905, 16 B, 49–69. [Google Scholar]
- Nikolova, L.; Persson, L.E.; Varošanec, S. A new look at classical inequalities involving Banach lattice norms. J. Inequal. Appl. 2017, 2017, 302. [Google Scholar] [CrossRef] [PubMed]
- Pečarić, J.E.; Proschan, F.; Tong, Y.L. Convex Functions, Partial Orderings, and Statistical Applications; Mathematics in Science and Engineering, 187; Academic Press, Inc.: Boston, MA, USA, 1992; p. xiv+467. ISBN 0-12-549250-2. [Google Scholar]
- Lah, P.; Ribarič, M. Converse of Jensen’s inequality for convex functions. Univ. Beograd Publ. Elektrotehn. Fak. Ser. Mat. Fiz. 1973, 412–460, 201–205. [Google Scholar]
- Mitrinović, D.S.; Pečarić, J.E.; Fink, A.M. Classical and New Inequalities in Analysis; Mathematics and its Applications (East European Series), 61; Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 1993; p. xviii+740. ISBN 0-7923-2064-6. [Google Scholar]
- Andrić, M.; Pečarić, J. Lah–Ribarič type inequalities for (h, g;m)-convex functions. Rev. Real Acad. Cienc. Exactas Fis. Nat. Ser. A Mat. 2022, 116, 39. [Google Scholar] [CrossRef]
- Popescu, P.G.; Sluşanschi, E.I.; Iancu, V.; Pop, F. A New Upper Bound for Shannon Entropy. A Novel Approach in Modeling of Big Data Applications. Concurr. Comput. Pract. Exp. 2016, 28, 351–359. [Google Scholar] [CrossRef]
- Pečarić, J.; Perić, J. Refinements of the integral form of Jensen’s and the Lah–Ribarič inequalities and applications for Csiszár divergence. J. Inequal. Appl. 2020, 2020, 108. [Google Scholar] [CrossRef] [Green Version]
- Niculescu, C.P.; Persson, L.-E. Convex Functions and Their Applications. A Contemporary Approach; CMS Books in Mathematics; Springer: New York, NY, USA, 2005. [Google Scholar]
- Beckenbach, E.F.; Bellman, R. Inequalities; Springer: Berlin/Göttingen/Heidelberg, Germany, 1961. [Google Scholar]
- Hardy, G.H.; Littlewood, J.E.; Pólya, G. Inequalities; Cambridge Univ. Press: Cambridge, UK, 1934. [Google Scholar]
- Csiszár, I. Information-type measures of difference of probability functions and indirect observations. Studia Sci. Math. Hung. 1967, 2, 299–318. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).