Abstract
The Khinchin–Shannon generalized inequalities for entropy measures in Information Theory are a paradigm which can be used to test the synergy of the distributions of probabilities of occurrence in physical systems. The rich algebraic structure associated with the introduction of escort probabilities seems to be essential for deriving these inequalities for the two-parameter Sharma–Mittal set of entropy measures. We also emphasize the derivation of these inequalities for the special cases of the one-parameter Havrda–Charvat, Rényi and Landsberg–Vedral entropy measures.
1. Introduction
In the present contribution we derive the Generalized Khinchin–Shannon inequalities (GKS) [1,2] associated with entropy measures of the Sharma–Mittal (SM) set [3]. We stress that the derivations to be presented here are a tentative way of implementing ideas from the literature on interdisciplinary topics of Statistical Mechanics and Information Theory [4,5,6]. The algebraic structure of the escort probability distributions seems to be essential in these derivations, contrary to the intuitive derivation of the usual Khinchin–Shannon inequalities for the Gibbs–Shannon entropy measures. We start in Section 2 with the construction of a generic probabilistic space whose elements, the probabilities of occurrence, are arranged in blocks of m rows and n columns. We then introduce the definitions of simple, joint, conditional and marginal probabilities through the use of Bayes’ law. In Section 3, we make use of the assumption of concavity in order to unveil the synergy of the distribution of values of Gibbs–Shannon entropy measures [2]. In Section 4, we present the same development for the SM set of entropy measures, after introducing the concept of escort probabilities. We then specialize the derivations to the Havrda–Charvat, Rényi and Landsberg–Vedral entropies [7,8,9]. A detailed study is then undertaken in this section to treat the eventual ordering between the probabilities of occurrence and their associated escort probabilities. This is enough for deriving the GKS inequalities for the SM entropy measures. In Section 5, we present a proposal for an information measure associated with SM entropies and we derive its related inequalities [10]. At this point we stress once more the upsurge of the synergy effect in the comparison of the information obtained from the entropy calculated with joint probabilities of occurrence and the entropies corresponding to simple probabilities. In Section 6, we present an alternative derivation of the GKS inequalities based on Hölder inequalities [11]. These provide, in association with Bayes’ law, the same assumptions of concavity used in Section 3 and Section 4 and a consequent identical derivation of the GKS inequalities given in Section 4.
2. The Probability Space. Probabilities of Occurrence
We consider that the data can be represented in two-dimensional arrays of m rows and n columns. We then have blocks of data on which to undertake the statistical analysis. The joint probabilities of occurrence of a set of t variables, one associated with each of t chosen columns, are given by
where m is the number of rows of the corresponding subarray of the array, and the numerator is the number of occurrences of the given set of values. The values assumed by the variables in these columns are respectively given by:
or,
There are then objects of t columns each, and if the variables take on a finite set of values, then we will have a number W of components for each of these objects.
Since:
we can write:
In the study of distributions of nucleotide bases or of distributions of amino acids in proteins, the corresponding values of W follow from the four-letter and twenty-letter alphabets, respectively.
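Since the display equations of this section are not reproduced above, a minimal sketch of the frequency-based definitions assumed in what follows may help; the notation (column indices $j_1,\ldots,j_t$, values $a_j$, counts $n_{j_1\ldots j_t}$) is illustrative only.

```latex
% Joint probability of occurrence as a relative frequency (assumed standard form):
p_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t}) \;=\;
  \frac{n_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t})}{m},
% where n_{j_1...j_t} counts the rows of the m x t subarray displaying the
% t-sequence (a_{j_1},...,a_{j_t}). Normalization over the W possible t-sequences:
\sum_{a_{j_1},\ldots,a_{j_t}} p_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t}) \;=\; 1 .
```

Under this reading, W counts the possible t-sequences, so that the four-letter and twenty-letter alphabets would give W = 4^t and W = 20^t, respectively.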
Bayes’ law for the probabilities of occurrence of Equation (1) is written as:
where the conditional factor stands for the probability of occurrence of the values associated with the variables in the other columns, given a priori the value associated with the jth column. This also means that:
The marginal probabilities related to the joint distribution are then given by
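As the corresponding equations are not shown above, a hedged sketch of Bayes’ law and of the marginal probabilities, written with the conditioning column taken as $j_t$ purely for illustration, would read:

```latex
% Bayes' law (Equation (6) in the text), conditioning here on column j_t:
p_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t}) \;=\;
  p_{j_1 \ldots j_t}\!\left(a_{j_1},\ldots,a_{j_{t-1}} \,\middle|\, a_{j_t}\right)\,
  p_{j_t}(a_{j_t}),
\qquad
\sum_{a_{j_1},\ldots,a_{j_{t-1}}}
  p_{j_1 \ldots j_t}\!\left(a_{j_1},\ldots,a_{j_{t-1}} \,\middle|\, a_{j_t}\right) \;=\; 1 .

% Marginal probability of a single column, obtained by summing the joint
% probabilities over the values of all the other columns:
p_{j_k}(a_{j_k}) \;=\;
  \sum_{\{a_{j_l}\},\; l \neq k} p_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t}) .
```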
3. The Assumption of Concavity and the Synergy of Gibbs–Shannon Entropy Measures
A concave function of several variables should satisfy the following inequality:
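Equation (10) is not reproduced above; the standard concavity requirement assumed in what follows, for convex weights $\lambda_k$, is:

```latex
F\!\left(\sum_{k} \lambda_k \,\vec{x}_k\right) \;\ge\; \sum_{k} \lambda_k\, F(\vec{x}_k),
\qquad \lambda_k \ge 0, \quad \sum_{k} \lambda_k = 1 .
```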
We shall apply Equation (10) to the Gibbs–Shannon entropies:
where Equation (12) stands for the definition of the Gibbs–Shannon entropy related to the conditional probabilities. It is a measure of the uncertainty [2] in the distribution of probabilities of the columns when we have previous information on the distribution of the conditioning column.
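Since Equations (11) and (12) are not displayed above, the standard forms assumed here for the Gibbs–Shannon entropy of a single column and for its conditional counterpart are sketched below; the notation is illustrative only.

```latex
% Gibbs-Shannon entropy of column j (Equation (11)):
S_j \;=\; -\sum_{a_j} p_j(a_j)\,\log p_j(a_j),

% Conditional entropy of the columns j_1,...,j_{t-1}, given the value a_{j_t}
% of the conditioning column (assumed reading of Equation (12)):
S_{j_1 \ldots j_{t-1}}(a_{j_t}) \;=\;
  -\sum_{a_{j_1},\ldots,a_{j_{t-1}}}
    p\!\left(a_{j_1},\ldots,a_{j_{t-1}} \,\middle|\, a_{j_t}\right)
    \log p\!\left(a_{j_1},\ldots,a_{j_{t-1}} \,\middle|\, a_{j_t}\right).
```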
We now use the correspondences:
and we then have:
and
This means that the uncertainty of the distribution on the columns cannot be increased when we have previous information on the distribution of the conditioning column.
From Equations (13) and (19), we then write:
and by iteration we get the Khinchin–Shannon inequality for the Gibbs–Shannon entropy measure:
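In the notation sketched above, Equations (20) and (21) would take the standard subadditivity form:

```latex
% Equation (20): conditioning on one column cannot increase the uncertainty
S_{j_1 \ldots j_t} \;\le\; S_{j_1 \ldots j_{t-1}} + S_{j_t},
% and, iterating over the remaining columns, the Khinchin-Shannon inequality (Equation (21)):
S_{j_1 \ldots j_t} \;\le\; \sum_{k=1}^{t} S_{j_k}.
```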
The usual meaning given to Equation (21) is that the minimum of the information to be obtained from the analysis of the joint probabilities of a set of t columns is given by the sum of the information contributions associated with the t columns when these are considered as independent [1,2,10]. This is also seen as an aspect of the synergy [12,13] of the distribution of probabilities of occurrence.
4. The Assumption of Concavity and the Synergy of Sharma–Mittal (SM) Entropy Measures. The GKS Inequalities
We shall now use the assumption of concavity given by Equation (10) on Sharma–Mittal (SM) entropy measures:
where,
and r, s are non-dimensional parameters.
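The two-parameter Sharma–Mittal form of Equation (22) is not displayed above; the standard form assumed in this rewrite, written for the joint distribution of the chosen columns, is:

```latex
(S_{rs})_{j_1 \ldots j_t} \;=\;
  \frac{1}{1-s}
  \left[
    \left( \sum_{a_{j_1},\ldots,a_{j_t}}
      \bigl(p_{j_1 \ldots j_t}(a_{j_1},\ldots,a_{j_t})\bigr)^{r}
    \right)^{\!\frac{1-s}{1-r}} - 1
  \right],
\qquad r \neq 1,\; s \neq 1 .
```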
Analogously to Equation (12), we also introduce the “conditional entropy measure”
where
and stands for the escort probability
We have in general:
The inverse transformations are given by:
with
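Since Equations (26)–(29) are not reproduced above, the escort probabilities and their inverse transformation are assumed here in their standard form, written for a single column j (the extension to joint distributions is immediate):

```latex
% Escort probability of order r:
P_j(a_j) \;=\; \frac{\bigl(p_j(a_j)\bigr)^{r}}{\sum_{b_j} \bigl(p_j(b_j)\bigr)^{r}},
\qquad
% Inverse transformation, recovering the original probabilities:
p_j(a_j) \;=\; \frac{\bigl(P_j(a_j)\bigr)^{1/r}}{\sum_{b_j} \bigl(P_j(b_j)\bigr)^{1/r}} .
```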
The range of variation of the parameters r, s of the Sharma–Mittal entropies, Equation (22), should be derived from a requirement of strict concavity. In order to do so, let us remember that for each set of t columns (a subarray) there are m rows of t values each (t-sequences). We now denote these t-sequences by:
A sufficient requirement for strict concavity is the negative definiteness of the quadratic form associated with the Hessian matrix [14], whose elements are given by:
We then consider the m leading principal submatrices along the diagonal of the Hessian matrix. Their determinants should be negative or positive according to whether their order is odd or even [15], respectively:
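The alternating-sign condition on the leading principal minors is the standard Sylvester-type criterion for negative definiteness; writing $H_k$ for the $k \times k$ leading principal submatrix of the Hessian, it reads:

```latex
(-1)^{k}\,\det H_k \;>\; 0, \qquad k = 1, 2, \ldots, m,
% i.e., det H_1 < 0, det H_2 > 0, det H_3 < 0, and so on.
```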
We then have generally:
This completes the proof.
We are now ready to use the concavity assumption, Equation (10), to derive the GKS inequalities. In order to do so, we make the correspondences:
We can then write:
and
With the correspondences above, Equation (10) will turn into:
An additional piece of information should be taken into consideration before we derive the GKS inequalities:
In each column of a block, there will be values such that
and
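Inequalities (48) and (49) are not displayed above; however, the partition of each column into two sets of values follows directly from the definition of the escort probabilities, since comparing $P_j(a_j)$ with $p_j(a_j)$ singles out a threshold value. A minimal sketch of this comparison:

```latex
P_j(a_j) \;\ge\; p_j(a_j)
\;\Longleftrightarrow\;
\bigl(p_j(a_j)\bigr)^{r-1} \;\ge\; \sum_{b_j} \bigl(p_j(b_j)\bigr)^{r}
\;\Longleftrightarrow\;
p_j(a_j) \;\le\;
  \left( \sum_{b_j} \bigl(p_j(b_j)\bigr)^{r} \right)^{\!\frac{1}{r-1}}
\quad (0 < r < 1),
% with the reversed inequality on the complementary set of values;
% for r > 1 the direction of the last equivalence is reversed.
```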
After multiplying inequalities (48) and (49) by and , respectively, and summing up in and , respectively, we get:
and
From Equations (48) and (49), any sum over the values of a column can be partitioned into sums over the two sets of values:
After applying Bayes’ law, Equation (6), to the first term on the left-hand side of Equations (53) and (54), we get:
where
and we have, , , according to Equations (48) and (49), respectively.
Since and , we have trivially that:
and
The set of inequalities, Equations (61), (64) and (69), or
and the set of inequalities, Equations (61), (65) and (70), or
can be arranged as the chains of inequalities
and
respectively.
The inequality which is common to the two chains above can be written as:
From the definition of the escort probabilities, Equations (26) and (27), we can write the right-hand side of Equation (75) as:
We then get by iteration,
Equation (78) corresponds to the Generalized Khinchin–Shannon inequalities (GKS) derived here for Sharma–Mittal entropies.
The same words written after Equation (21) could also be written here for the Sharma–Mittal entropy measures as far as the aspect of synergy is concerned. We will introduce a proposal for an information measure to stress this aspect in the next section.
The Havrda–Charvat, Rényi and Landsberg–Vedral entropies are easily obtained by taking convenient limits in Equation (24):
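With the Sharma–Mittal form assumed above, the standard limiting cases are (writing $p$ generically for the distribution at hand):

```latex
% Havrda-Charvat (Tsallis-type) entropy: limit s -> r
S_r^{HC} \;=\; \frac{1}{1-r}\left( \sum_{a} p(a)^{r} - 1 \right),
% Renyi entropy: limit s -> 1
S_r^{R} \;=\; \frac{1}{1-r}\,\log \sum_{a} p(a)^{r},
% Landsberg-Vedral entropy: limit s -> 2 - r
S_r^{LV} \;=\; \frac{1}{1-r}\left( 1 - \frac{1}{\sum_{a} p(a)^{r}} \right).
```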
The Gibbs–Shannon entropy measure, Equation (11), is included in all these entropies through:
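As a quick numerical illustration of the definitions assumed in this section (not code from the original paper), the following sketch computes the Sharma–Mittal entropy of a discrete distribution, its escort probabilities, and checks the Havrda–Charvat and Rényi limits, as well as the fact that the Gibbs–Shannon measure is approached when both r and s tend to 1. All function and variable names are illustrative only.

```python
import numpy as np

def escort(p, r):
    """Escort distribution of order r: P_j = p_j**r / sum_k p_k**r."""
    w = p ** r
    return w / w.sum()

def sharma_mittal(p, r, s):
    """Two-parameter Sharma-Mittal entropy (standard form assumed here)."""
    z = np.sum(p ** r)
    return (z ** ((1.0 - s) / (1.0 - r)) - 1.0) / (1.0 - s)

def havrda_charvat(p, r):
    """One-parameter Havrda-Charvat (Tsallis-type) entropy: limit s -> r."""
    return (np.sum(p ** r) - 1.0) / (1.0 - r)

def renyi(p, r):
    """Renyi entropy: limit s -> 1 of the Sharma-Mittal measure."""
    return np.log(np.sum(p ** r)) / (1.0 - r)

if __name__ == "__main__":
    p = np.array([0.5, 0.3, 0.15, 0.05])   # illustrative single-column distribution
    r = 0.7
    # For r < 1 the escort probabilities re-weight the distribution towards uniformity.
    print("escort probabilities:", escort(p, r))
    # The limits s -> r and s -> 1 should be approached smoothly by the two-parameter measure.
    print("SM(r, s ~ r) vs Havrda-Charvat:", sharma_mittal(p, r, r + 1e-6), havrda_charvat(p, r))
    print("SM(r, s ~ 1) vs Renyi:         ", sharma_mittal(p, r, 1.0 - 1e-6), renyi(p, r))
    # Gibbs-Shannon limit: both parameters close to 1.
    print("SM(r ~ 1, s ~ 1) vs Gibbs-Shannon:",
          sharma_mittal(p, 1.0 - 1e-6, 1.0 - 1e-6), -np.sum(p * np.log(p)))
```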
5. An Information Measure Proposal Associated to Sharma–Mittal Entropy Measures
We are looking for a proposal of an information measure which fulfills the requirement of a clear interpretation of the upsurge of synergy in a probability distribution and is supported by the usual idea of entropy as a measure of uncertainty.
For the Sharma–Mittal set of entropy measures the proposal for the associated information measure would be:
where the quantities involved are given by Equations (22) and (23). We then have, from Equation (93):
The meaning of Equation (95) is that the minimum of the information associated with t columns of probabilities of occurrence is given by the sum of the information contributions associated with each column. This corresponds to the expression of the synergy of the distribution of probabilities of occurrence which we have derived in the previous section.
It seems worthwhile to derive yet another result which unveils once more the fundamental aspect of synergy of the distribution of probabilities of occurrence. From Equation (93), we have:
and we then write from the GKS inequalities, Equation (78):
We then get:
Equation (100) corresponds to another result which originates from the synergy of the distribution of probabilities of occurrence. It can be stated as follows: the minimum of the rate of information increase with decreasing entropy in the probability distribution for sets of t columns is given by the product of the rates of information increase pertaining to each of the t columns.
6. The Use of Hölder’s Inequality for an Alternative Derivation of the GKS Inequalities
We firstly note that:
and we now introduce Hölder’s inequality [11]:
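The form of Hölder’s inequality assumed for this derivation is the standard one for conjugate exponents; for non-negative sequences $a_k$, $b_k$ and $\lambda \in (0,1)$:

```latex
\sum_{k} a_k\, b_k \;\le\;
\left( \sum_{k} a_k^{1/\lambda} \right)^{\!\lambda}
\left( \sum_{k} b_k^{1/(1-\lambda)} \right)^{\!1-\lambda},
% equivalently, with p = 1/lambda and q = 1/(1-lambda), so that 1/p + 1/q = 1;
% the inequality is reversed when one of the exponents lies in (0,1).
```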
We can also write:
or
We now make the correspondences:
We take the s-power of both sides of Equation (101) and, after using Equations (104) and (105) with the appropriate parameter choices, we get:
After summing over the remaining index, we then have:
7. Concluding Remarks
It should be stressed that the introduction of escort probabilities has been effective in the construction of generalized entropy measures. These can be used for the classification of databases in terms of their clustering, as driven by their intrinsic synergy and the resulting formation of more complex structures such as families and clans.
A fundamental aim would be the derivation of a dynamical theory able to describe the process of formation of these structures: a theory based on the evolution of the entropy values of databases, which we hope could be realized by methods introduced in the exhaustive study of Fokker–Planck equations.
Some introductory results on this promising line of research have already been published [16], and a forthcoming comprehensive review will summarize all of them.
Author Contributions
Conceptualization, R.P.M. and S.C.d.A.N.; methodology, R.P.M. and S.C.d.A.N.; formal analysis, R.P.M. and S.C.d.A.N.; writing—original draft preparation, R.P.M.; writing—review and editing, R.P.M. and S.C.d.A.N.; visualization, R.P.M. and S.C.d.A.N.; supervision, R.P.M.; project administration, R.P.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| GKS | Generalized Khinchin–Shannon |
| SM | Sharma–Mittal |
References
- Mondaini, R.P.; de Albuquerque Neto, S.C. Khinchin–Shannon Generalized Inequalities for “Non-additive” Entropy Measures. In Trends in Biomathematics 2; Mondaini, R.P., Ed.; Springer International Publishing: Cham, Switzerland, 2019; pp. 177–190.
- Khinchin, A.I. Mathematical Foundations of Information Theory; Dover Publications: New York, NY, USA, 1957.
- Sharma, B.D.; Mittal, D.P. New Non-additive Measures of Entropy for Discrete Probability Distributions. J. Math. Sci. 1975, 10, 28–40.
- Volkenstein, M.V. Entropy and Information; Birkhäuser: Basel, Switzerland, 2009.
- Beck, C. Generalized Information and Entropy Measures in Physics. Contemp. Phys. 2009, 50, 495–510.
- Lavenda, B.H. A New Perspective on Thermodynamics; Springer Science+Business Media: New York, NY, USA, 2010.
- Havrda, J.; Charvat, F. Quantification Method of Classification Processes. Concept of Structural α-entropy. Kybernetika 1967, 3, 30–35.
- Rényi, A. On Measures of Entropy and Information. In Contributions to the Theory of Statistics, Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1960; Neyman, J., Ed.; University of California Press: Berkeley, CA, USA, 1961; Volume 1, pp. 547–561.
- Landsberg, P.T.; Vedral, V. Distributions and Channel Capacities in Generalized Statistical Mechanics. Phys. Lett. A 1997, 224, 326–330.
- Mondaini, R.P.; de Albuquerque Neto, S.C. The Statistical Analysis of Protein Domain Family Distributions via Jaccard Entropy Measures. In Trends in Biomathematics 3; Mondaini, R.P., Ed.; Springer International Publishing: Cham, Switzerland, 2020; pp. 169–207.
- Hardy, G.H.; Littlewood, J.E.; Pólya, G. Inequalities; Cambridge University Press: London, UK, 1934.
- Ay, N.; Olbrich, E.; Bertschinger, N.; Jost, J. A Geometric Approach to Complexity. Chaos 2011, 21, 037103.
- Olbrich, E.; Bertschinger, N.; Rauh, J. Information Decomposition and Synergy. Entropy 2015, 17, 3501–3517.
- Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: New York, NY, USA, 2009.
- Marsden, J.E.; Tromba, A. Vector Calculus; W. H. Freeman and Company Publishers: New York, NY, USA, 2012.
- Mondaini, R.P.; de Albuquerque Neto, S.C. A Jaccard-like Symbol and its Usefulness in the Derivation of Amino Acid Distributions in Protein Domain Families. In Trends in Biomathematics 4; Mondaini, R.P., Ed.; Springer International Publishing: Cham, Switzerland, 2021; pp. 201–220.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).