Abstract
We give bounds on the difference between the weighted arithmetic mean and the weighted geometric mean. These imply refined Young inequalities and reverses of the Young inequality. We also study some properties of the difference between the weighted arithmetic mean and the weighted geometric mean. Applying the newly obtained inequalities, we show some results on the Tsallis divergence, the Rényi divergence, the Jeffreys–Tsallis divergence and the Jensen–Shannon–Tsallis divergence.
Keywords:
Young inequality; arithmetic mean; geometric mean; Heinz mean; Cartwright–Field inequality; Tsallis divergence; Rényi divergence; Jeffreys–Tsallis divergence; Jensen–Shannon–Tsallis divergence
MSC:
26D15; 26E60; 94A17
1. Introduction
The Young integral inequality is the source of many basic inequalities. Young [1] proved the following: suppose that $f:[0,\infty)\to[0,\infty)$ is an increasing continuous function such that $f(0)=0$ and $\lim_{t\to\infty}f(t)=\infty$. Then:
$$ab\le\int_0^a f(t)\,dt+\int_0^b f^{-1}(s)\,ds, \qquad (1)$$
with equality if $b=f(a)$. Such a gap is often used to define the Fenchel–Legendre divergence in information geometry [2,3]. For $f(t)=t^{p-1}$ with $p>1$ in inequality (1), we deduce the classical Young inequality:
$$ab\le\frac{a^p}{p}+\frac{b^q}{q}, \qquad (2)$$
for all $a,b\ge 0$ and $p,q>1$ with $\frac{1}{p}+\frac{1}{q}=1$. The equality occurs if and only if $a^p=b^q$.
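As a quick numerical illustration, the following minimal sketch (ours, with arbitrary sampling ranges) spot-checks the classical Young inequality $ab\le\frac{a^p}{p}+\frac{b^q}{q}$ for conjugate exponents on random positive inputs; the small relative tolerance only absorbs floating-point rounding.

```python
# A minimal numerical sketch (illustration only): spot-checking the classical
# Young inequality a*b <= a^p/p + b^q/q for conjugate exponents 1/p + 1/q = 1.
import random

random.seed(0)
for _ in range(10_000):
    a, b = random.uniform(0.01, 10.0), random.uniform(0.01, 10.0)
    p = random.uniform(1.01, 10.0)
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    assert a * b <= (a**p / p + b**q / q) * (1.0 + 1e-12)
print("Young's inequality held on all sampled triples (a, b, p).")
```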
Minguzzi [4] proved a reverse Young inequality in the following way:
for all and with .
The classical Young inequality (2) is rewritten as
by putting and . Putting again:
in the inequality (4), we obtain the famous Hölder inequality:
for and . Thus, the inequality (2) is often reformulated as
by putting (then ) in the inequality (4). It is notable that -divergence is related to the difference between the weighted arithmetic mean and the weighted geometric mean [5]. For , we deduce the inequality between the geometric mean and the arithmetic mean, . The Heinz mean ([6], Equation (3)) (see also [7]) is defined as and .
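As a numerical illustration (ours, assuming the standard definition of the Heinz mean recalled above), the following sketch checks the well-known interpolation property $\sqrt{ab}\le H_\nu(a,b)\le\frac{a+b}{2}$ on random inputs.

```python
# A minimal numerical sketch (illustration only): the Heinz mean
# H_v(a, b) = (a^v * b^(1-v) + a^(1-v) * b^v) / 2 lies between the geometric
# mean sqrt(a*b) (attained at v = 1/2) and the arithmetic mean (a + b)/2.
import random

random.seed(1)
for _ in range(10_000):
    a, b = random.uniform(0.01, 10.0), random.uniform(0.01, 10.0)
    v = random.uniform(0.0, 1.0)
    heinz = (a**v * b**(1 - v) + a**(1 - v) * b**v) / 2
    assert (a * b) ** 0.5 - 1e-9 <= heinz <= (a + b) / 2 + 1e-9
print("sqrt(ab) <= H_v(a,b) <= (a+b)/2 held on all sampled triples (a, b, v).")
```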
In particular, when we discuss the Young inequality, we will refer to the weighted form $a^p b^{1-p}\le pa+(1-p)b$ above. We consider the following expression:
which implies that and . We remark the following properties:
The Cartwright–Field inequality (see, e.g., [8]) is often written as follows:
$$\frac{p(1-p)}{2}\,\frac{(a-b)^2}{\max\{a,b\}}\le pa+(1-p)b-a^p b^{1-p}\le\frac{p(1-p)}{2}\,\frac{(a-b)^2}{\min\{a,b\}}, \qquad (7)$$
for $a,b>0$ and $p\in[0,1]$. This double inequality gives an improvement of the Young inequality and, at the same time, gives a reverse inequality for the Young inequality.
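The following minimal sketch (ours, under the form of the Cartwright–Field inequality stated above) spot-checks numerically that the gap $pa+(1-p)b-a^pb^{1-p}$ sits between the two quadratic bounds.

```python
# A minimal numerical sketch (illustration only): spot-checking the
# Cartwright–Field double inequality on random positive inputs a, b
# and weights p in (0, 1).
import random

random.seed(2)
for _ in range(10_000):
    a, b = random.uniform(0.01, 10.0), random.uniform(0.01, 10.0)
    p = random.uniform(0.0, 1.0)
    gap = p * a + (1 - p) * b - a**p * b**(1 - p)
    lower = 0.5 * p * (1 - p) * (a - b) ** 2 / max(a, b)
    upper = 0.5 * p * (1 - p) * (a - b) ** 2 / min(a, b)
    assert lower - 1e-9 <= gap <= upper + 1e-9
print("Cartwright–Field bounds held on all sampled triples (a, b, p).")
```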
Kober proved in [9] a general result related to an improvement of the inequality between arithmetic and geometric means, which for implies the inequality:
where , and . This inequality was rediscovered by Kittaneh and Manasrah in [10] (see also [11]).
Finally, we found, in [12], another improvement of the Young inequality and a reverse inequality, given as
where , and with . It is remarkable that the inequalities (9) give a further refinement of (8), since and .
In [13], we also presented two inequalities which give two different reverse inequalities for the Young inequality:
and:
where , . See ([14], Chapter 2) for recent advances on refinements and reverses of the Young inequality.
The α-divergence is related to the difference between the weighted arithmetic mean and the weighted geometric mean [5]. We mention that the gap is used in information geometry to define the Fenchel–Legendre divergence [2,3]. We give bounds on the difference between the weighted arithmetic mean and the weighted geometric mean. These imply refined Young inequalities and reverses of the Young inequality. We also study some properties of the difference between the weighted arithmetic mean and the weighted geometric mean. Applying the newly obtained inequalities, we show some results on the Tsallis divergence, the Rényi divergence, the Jeffreys–Tsallis divergence and the Jensen–Shannon–Tsallis divergence [15,16]. The parametric Jensen–Shannon divergence can be used to detect unusual data, and it can also be used to analyze fire experiments [17].
2. Main Results
We give estimates on and also study the properties of . We first give the following estimates of .
Theorem 1.
For and , we have:
where and .
Proof.
Theorem 2.
For and , we have:
Proof.
Then, we take the function defined by . By simple calculations, we have:
Thus, the function f is concave, so we can apply the Hermite–Hadamard inequality [18] (for a concave function $f$ on an interval $[u,v]$: $\frac{f(u)+f(v)}{2}\le\frac{1}{v-u}\int_u^v f(t)\,dt\le f\!\left(\frac{u+v}{2}\right)$):
The left-hand side of the inequalities above shows:
Since the function is increasing, we have:
Integrating the above inequality with respect to t from 1 to x, we obtain:
which implies:
□
Theorem 3.
For and , we have:
Proof.
We give two different proofs (I) and (II).
- (I)
- For or , we obtain equality in the relation from the statement. Thus, we assume and . It is easy to see that . Using the Lagrange theorem, there exists and between a and b such that . However, we have the inequality . Therefore, we deduce the inequality of the statement.
- (II)
- Using the Cartwright–Field inequality, we have: and if we replace p by , we deduce: for and . By summing up these inequalities, we prove the inequality of the statement:
□
Remark 1.
(i) From the proof of Theorem 3, we obtain , from which we deduce an estimate for the Heinz mean:
(ii) Since and , we have , which is in fact the inequality (3) given by Minguzzi.
Theorem 4.
Let and .
(i) For or , we have .
(ii) For or , we have .
Proof.
For or , we obtain equality in the relations from the statement. Thus, we assume and . However, we have:
We consider the function defined by . We calculate the derivatives of f, thus we have:
For and , we have , so the function is increasing; hence we obtain , which implies that the function f is increasing, so we have , which means that . For , we find that . For and , we have , so the function is increasing; hence we obtain , which implies that the function f is decreasing, so we have , which means that . For , we find that . The inequality in (ii) follows in an analogous way. □
Remark 2.
From (i) in Theorem 4, for and , we have , so we obtain:
which is just the left-hand side of the Cartwright–Field inequality:
Therefore, it is quite natural to ask whether the following inequality:
holds for a general case and . However, this inequality does not hold in general. We set the function:
Then, we have , and also , .
Theorem 5.
For and , we have:
where .
Proof.
For or or , we have equality. We assume and . If , then using Theorem 2, we have:
Using the Lagrange theorem, we obtain , where . For , we deduce , which means that . If and we replace p by , then Theorem 2 implies:
Using the Lagrange theorem, we obtain , where . For , we deduce , which means that . Taking into account the above considerations, we prove the statement. □
Corollary 1.
For and , we have:
where is given in Theorem 5.
Proof.
For or or , we have the equality. We assume and . If in inequality (18), we replace by , we deduce:
Consequently, we prove the inequalities of the statement. □
Theorem 6.
For and , we have:
Proof.
For or or , we have equality in the relation from the statement. We assume and . We consider the function defined by , . For , we have , which implies that f is increasing, so we deduce . For , we have , which implies that f is decreasing, so we obtain . Therefore, we find the following inequality:
Multiplying the above inequality by , we have:
which is equivalent to the inequality:
for all and . Therefore, if we take in the above inequality and after some calculations, we deduce the inequality of the statement. □
Corollary 2.
For and , we have:
Proof.
For or or , we have the equality. We assume and . If in inequality (21), we exchange a with b, we deduce:
However, , so we have:
Consequently, we prove the inequality of the statement. □
3. Applications to Some Divergences
The Tsallis divergence (e.g., [19,20]) is defined for two probability distributions and with and for all as
The Rényi divergence (e.g., [21]) is also denoted by
We see (e.g., in [22]) that:
It is also known that:
where is the standard divergence (KL information, relative entropy). The Jeffreys divergence (see [22,23]) is defined by and the Jensen–Shannon divergence [15,16] is defined by
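For readers who wish to experiment numerically, here is a minimal sketch (ours, using the standard discrete forms of these classical divergences; the function names are our own).

```python
# A minimal sketch (standard discrete forms, illustration only) of the
# Kullback–Leibler, Jeffreys and Jensen–Shannon divergences for strictly
# positive probability vectors.
import numpy as np

def kl_divergence(a, b):
    """D(a||b) = sum_j a_j * log(a_j / b_j)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(a * np.log(a / b)))

def jeffreys_divergence(a, b):
    """Symmetrized KL divergence: D(a||b) + D(b||a)."""
    return kl_divergence(a, b) + kl_divergence(b, a)

def jensen_shannon_divergence(a, b):
    """(D(a||m) + D(b||m)) / 2 with the mixture m = (a + b) / 2."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    m = (a + b) / 2.0
    return 0.5 * (kl_divergence(a, m) + kl_divergence(b, m))
```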
In [24], the Jeffreys and the Jensen–Shannon divergences are extended to biparametric forms. In [23], Furuichi and Mitroi generalize these divergences to the Jeffreys–Tsallis divergence, which is given by and to the Jensen–Shannon–Tsallis divergence, which is defined as
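Analogously, the following sketch (ours; the formulas below are the commonly used forms of these quantities, and the exact expressions in the paper may differ in normalization) implements the Tsallis-type divergences discussed in this section; as q tends to 1, the Tsallis and Rényi divergences both tend to the Kullback–Leibler divergence.

```python
# A minimal sketch (assumed standard forms, illustration only) of the Tsallis
# divergence, the Rényi divergence, and the Jeffreys–Tsallis and
# Jensen–Shannon–Tsallis divergences built from them.
import numpy as np

def tsallis_divergence(a, b, q):
    """D_q(a||b) = (1 - sum_j a_j^q * b_j^(1-q)) / (1 - q), for q != 1."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float((1.0 - np.sum(a**q * b**(1.0 - q))) / (1.0 - q))

def renyi_divergence(a, b, q):
    """R_q(a||b) = log(sum_j a_j^q * b_j^(1-q)) / (q - 1), for q != 1."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.log(np.sum(a**q * b**(1.0 - q))) / (q - 1.0))

def jeffreys_tsallis(a, b, q):
    """Symmetrized Tsallis divergence: D_q(a||b) + D_q(b||a)."""
    return tsallis_divergence(a, b, q) + tsallis_divergence(b, a, q)

def jensen_shannon_tsallis(a, b, q):
    """(D_q(a||m) + D_q(b||m)) / 2 with the mixture m = (a + b) / 2."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    m = (a + b) / 2.0
    return 0.5 * (tsallis_divergence(a, m, q) + tsallis_divergence(b, m, q))

# Example with two strictly positive probability distributions.
a = np.array([0.2, 0.5, 0.3])
b = np.array([0.4, 0.4, 0.2])
print(jeffreys_tsallis(a, b, 0.7), jensen_shannon_tsallis(a, b, 0.7))
```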
Several properties of divergences can be extended to the operator-theoretic setting [25].
For the Tsallis divergence, we have the following relations.
Theorem 7.
For two probability distributions and with and for all , we have:
Proof.
Remark 3.
(ii) From (23), we have:
where we used the inequality for all . Thus, we deduce the inequalities:
and:
Combining (25) with Theorem 7, we therefore have the following result for the Rényi divergence:
We give the relation between the Jeffreys–Tsallis divergence and the Jensen–Shannon–Tsallis divergence:
Theorem 8.
For two probability distributions and with and for all , we have:
where with .
Proof.
We consider the function defined by , which is concave for . Therefore, we have , which implies the following inequalities:
From the definition of the Tsallis divergence, we deduce the inequality:
which is equivalent to the relation of the statement. For the case of , the function is convex in . Similarly, we have the statement, taking into account that . □
Remark 4.
In the limit of in (26), we then obtain:
We give bounds on the Jeffreys–Tsallis divergence by using the refined Young inequality given in Theorem 1. In [26], we found the Bhattacharyya coefficient defined as
which is a measure of the amount of overlapping between two distributions. This can be expressed in terms of the Hellinger distance between the probability distributions and , which is given by
where the Hellinger distance ([26,27]) is a metric and is defined by
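A minimal computational sketch (ours, assuming the common normalization $h(a,b)=\frac{1}{\sqrt{2}}\big(\sum_j(\sqrt{a_j}-\sqrt{b_j})^2\big)^{1/2}$, under which $B(a,b)=1-h(a,b)^2$ for probability distributions) follows.

```python
# A minimal sketch (assumed standard definitions, illustration only):
# the Bhattacharyya coefficient B(a,b) = sum_j sqrt(a_j * b_j) and the
# Hellinger distance h(a,b) = ||sqrt(a) - sqrt(b)||_2 / sqrt(2); with this
# normalization B(a,b) = 1 - h(a,b)^2 for probability distributions a, b.
import numpy as np

def bhattacharyya_coefficient(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.sqrt(a * b)))

def hellinger_distance(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(np.sqrt(a) - np.sqrt(b)) / np.sqrt(2.0))

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.4, 0.4, 0.2])
assert abs(bhattacharyya_coefficient(a, b) - (1.0 - hellinger_distance(a, b) ** 2)) < 1e-12
```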
Theorem 9.
For two probability distributions and with and for all , and , we have:
where and .
Proof.
For , we obtain the equality. Now, we consider . Using Theorem 1 for and , , we deduce:
where . If we replace q by and take into account that and , then we have:
Taking the sum on , we find the inequalities:
which is equivalent to the inequalities in the statement. □
Remark 5.
We give further bounds on the Jeffreys–Tsallis divergence by using Theorem 5 and Corollary 2:
Theorem 10.
For two probability distributions and with and for all , and , we have:
where is given in Theorem 5.
Proof.
Putting , and in (20), we deduce:
and:
Taking into account that:
and by taking the sum on , we have:
Thus, we prove the lower bounds of . To prove the upper bound of , we put , and in inequality (22). Then, we deduce:
By taking the sum on , we find:
Consequently, we prove the inequalities of the statement. □
We also give further bounds on the Jeffreys–Tsallis divergence by using the Cartwright–Field inequality given in (7).
Theorem 11.
For two probability distributions and with and for all , and , we have:
Proof.
For , we have the equality. We assume . By direct calculations, we have:
Using inequality (7), we deduce:
and:
From the above inequalities, we have the statement, by summing on . □
It is quite natural to extend the Jensen–Shannon–Tsallis divergence to the following form:
where . We call this the v-weighted Jensen–Shannon–Tsallis divergence. For , we find that , which is the Jensen–Shannon–Tsallis divergence. For this quantity , we can obtain the following result in a way similar to the proof of Theorem 11.
Proposition 1.
For two probability distributions and with and for all , and , we have:
Proof.
We calculate as
Using inequality (7), we deduce:
and:
Multiplying the above inequalities by v and , respectively, and then taking the sum on , we obtain the statement. □
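As a computational companion to Proposition 1, here is a sketch (ours, assuming the natural weighted form $v\,D_q(a\|m)+(1-v)\,D_q(b\|m)$ with the mixture $m=va+(1-v)b$, which may differ from the authors' exact expression) of the v-weighted Jensen–Shannon–Tsallis divergence; at $v=1/2$ it reduces to the Jensen–Shannon–Tsallis divergence.

```python
# A minimal sketch (assumed weighted form, illustration only) of the
# v-weighted Jensen–Shannon–Tsallis divergence
#   JST_q^v(a||b) = v * D_q(a||m) + (1 - v) * D_q(b||m),  m = v*a + (1 - v)*b,
# which reduces to the Jensen–Shannon–Tsallis divergence at v = 1/2.
import numpy as np

def tsallis_divergence(a, b, q):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float((1.0 - np.sum(a**q * b**(1.0 - q))) / (1.0 - q))

def weighted_jensen_shannon_tsallis(a, b, q, v):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    m = v * a + (1.0 - v) * b
    return v * tsallis_divergence(a, m, q) + (1.0 - v) * tsallis_divergence(b, m, q)

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.4, 0.4, 0.2])
print(weighted_jensen_shannon_tsallis(a, b, q=0.7, v=0.3))
```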
4. Conclusions
We obtained new inequalities which improve the classical Young inequality by analytical calculations with known inequalities. We also obtained some bounds on the Jeffreys–Tsallis divergence and the Jensen–Shannon–Tsallis divergence. At this point, we do not clearly know whether the obtained bounds will play any role in information theory. However, if one aims to find the meaning of the parameter q in divergences based on the Tsallis divergence, then we may state that almost all theorems (except for Theorem 8) hold for . In the first author's previous studies [19,28], some results related to the Tsallis divergence (relative entropy) are still true for , while some results related to the Tsallis entropy are still true for . In this paper, we treated Tsallis-type divergences, so it is shown that almost all results are true for . This insight may give a rough meaning of the parameter q.
Since our results in Section 3 are based on the inequalities in Section 2, we summarize here the tightness of the inequalities obtained in Section 2. The double inequality (12) is a counterpart of the double inequality (9) for . Therefore, they cannot be compared with each other in terms of tightness, since the conditions are different. The double inequality (12) was used to obtain Theorem 9. The double inequality (15) is essentially the Cartwright–Field inequality itself, and it was used to obtain Theorem 7 as the first result in Section 3. The results in Theorem 4 are mathematical properties of . The inequalities given in (18) give an improvement of the left-hand side of the inequality (7) for the case , and we obtained Theorem 10 by (18). We obtained the upper bound of as a counterpart of (18) for a general . This was used to prove Corollary 2, which was then used to prove Theorem 10. However, we found that the upper bound of given in (22) is not tighter than the one in (15).
Finally, Theorem 8 can be obtained from the convexity/concavity of the function . This study will be continued in order to obtain much sharper bounds. We extend the Jensen–Shannon–Tsallis divergence to the following:
and we call this the v-weighted Jensen–Shannon–Tsallis divergence. For , we find that , which is the Jensen–Shannon–Tsallis divergence. For this quantity, as an information-theoretic divergence measure , we obtained several characterizations.
Author Contributions
Conceptualization, S.F. and N.M.; investigation, S.F. and N.M.; writing—original draft preparation, S.F. and N.M.; writing—review and editing, S.F.; funding acquisition, S.F. and N.M. All authors have read and agreed to the published version of the manuscript.
Funding
The author (S.F.) was partially supported by JSPS KAKENHI Grant Number 21K03341.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors would like to thank the referees for their careful and insightful comments to improve our manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Young, W.H. On classes of summable functions and their Fourier series. Proc. R. Soc. Lond. Ser. A 1912, 87, 225–229.
- Blondel, M.; Martins, A.F.T.; Niculae, V. Learning with Fenchel-Young Losses. J. Mach. Learn. Res. 2020, 21, 1–69.
- Nielsen, F. On Geodesic Triangles with Right Angles in a Dually Flat Space. In Progress in Information Geometry: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 153–190.
- Minguzzi, E. An equivalent form of Young’s inequality with upper bound. Appl. Anal. Discrete Math. 2008, 2, 213–216.
- Nielsen, F. The α-divergences associated with a pair of strictly comparable quasi-arithmetic means. arXiv 2020, arXiv:2001.09660.
- Bhatia, R. Interpolating the arithmetic–geometric mean inequality and its operator version. Linear Alg. Appl. 2006, 413, 355–363.
- Furuichi, S.; Ghaemi, M.B.; Gharakhanlu, N. Generalized reverse Young and Heinz inequalities. Bull. Malays. Math. Sci. Soc. 2019, 42, 267–284.
- Cartwright, D.I.; Field, M.J. A refinement of the arithmetic mean-geometric mean inequality. Proc. Am. Math. Soc. 1978, 71, 36–38.
- Kober, H. On the arithmetic and geometric means and Hölder inequality. Proc. Am. Math. Soc. 1958, 9, 452–459.
- Kittaneh, F.; Manasrah, Y. Improved Young and Heinz inequalities for matrices. J. Math. Anal. Appl. 2010, 361, 262–269.
- Bobylev, N.A.; Krasnoselsky, M.A. Extremum Analysis (Degenerate Cases); Institute of Control Sciences: Moscow, Russia, 1981; 52p. (In Russian)
- Minculete, N. A refinement of the Kittaneh–Manasrah inequality. Creat. Math. Inform. 2011, 20, 157–162.
- Furuichi, S.; Minculete, N. Alternative reverse inequalities for Young’s inequality. J. Math. Inequal. 2011, 5, 595–600.
- Furuichi, S.; Moradi, H.R. Advances in Mathematical Inequalities; De Gruyter: Berlin, Germany, 2020.
- Lin, J. Divergence measures based on the Shannon entropy. IEEE Trans. Inform. Theory 1991, 37, 145–151.
- Sibson, R. Information radius. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 1969, 14, 149–160.
- Mitroi-Symeonidis, F.C.; Anghel, I.; Minculete, N. Parametric Jensen-Shannon Statistical Complexity and Its Applications on Full-Scale Compartment Fire Data. Symmetry 2020, 12, 22.
- Niculescu, C.P.; Persson, L.-E. Convex Functions and Their Applications, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2018.
- Furuichi, S.; Yanagi, K.; Kuriyama, K. Fundamental properties of Tsallis relative entropy. J. Math. Phys. 2004, 45, 4868–4877.
- Tsallis, C. Generalized entropy-based criterion for consistent testing. Phys. Rev. E 1998, 58, 1442–1445.
- Aczél, J.; Daróczy, Z. On Measures of Information and Their Characterizations; Academic Press: Cambridge, MA, USA, 1975.
- Furuichi, S.; Minculete, N. Inequalities related to some types of entropies and divergences. Physica A 2019, 532, 121907.
- Furuichi, S.; Mitroi, F.-C. Mathematical inequalities for some divergences. Physica A 2012, 391, 388–400.
- Mitroi, F.C.; Minculete, N. Mathematical inequalities for biparametric extended information measures. J. Math. Inequal. 2013, 7, 63–71.
- Moradi, H.R.; Furuichi, S.; Minculete, N. Estimates for Tsallis relative operator entropy. Math. Inequal. Appl. 2017, 20, 1079–1088.
- Lovričević, N.; Pečarić, D.; Pečarić, J. Zipf-Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities. J. Inequal. Appl. 2018, 2018, 36.
- Van Erven, T.; Harremöes, P. Rényi Divergence and Kullback–Leibler Divergence. IEEE Trans. Inf. Theory 2014, 60, 3797–3820.
- Furuichi, S. Information theoretical properties of Tsallis entropies. J. Math. Phys. 2006, 47, 023302.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).