Proceeding Paper

Comparing the Zeta Distributions with the Pareto Distributions from the Viewpoint of Information Theory and Information Geometry: Discrete versus Continuous Exponential Families of Power Laws †

Sony Computer Science Laboratories Inc., Tokyo 141-0022, Japan
Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
Phys. Sci. Forum 2022, 5(1), 2; https://doi.org/10.3390/psf2022005002
Published: 31 October 2022

Abstract

We consider the zeta distributions, which are discrete power law distributions that can be interpreted as the counterparts of the continuous Pareto distributions with a unit scale. The family of zeta distributions forms a discrete exponential family with normalizing constants expressed using the Riemann zeta function. We present several information-theoretic measures between zeta distributions, study their underlying information geometry, and compare the results with their continuous counterparts, the Pareto distributions.

1. Introduction

Zeta distributions [1,2] are parametric discrete distributions with probability mass functions indexed by a scalar parameter $s \in (1,\infty)$ whose support is the set of positive integers $\mathbb{N}$:
$$p_s(x) = \Pr[X = x] \propto \frac{1}{x^s}, \quad x \in \mathcal{X} = \mathbb{N} = \{1, 2, \ldots\}.$$
The normalizing function $\zeta(s)$ of the zeta distributions $p_s(x) = \frac{1}{\zeta(s)}\frac{1}{x^s}$, chosen such that $\sum_{x \in \mathbb{N}} p_s(x) = 1$, is the real Riemann zeta function [3,4,5]:
$$\zeta(s) = \sum_{i=1}^{\infty} \frac{1}{i^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots, \quad s > 1.$$
The set of zeta distributions $\mathcal{Z} = \{p_s(x) : s \in (1,\infty)\}$ forms a discrete exponential family [6,7] with natural parameter $\theta(s) = s$ lying in the natural parameter space $\Theta = (1,\infty)$, sufficient statistic $t(x) = -\log x$, and cumulant function (or log-normalizer) $F(\theta) = \log \zeta(\theta)$. Therefore, it follows from the theory of exponential families [7] that $\log \zeta(\theta)$ is a strictly convex and real analytic function (see Figure 1). Thus, the pmf of the zeta distributions can be rewritten in the canonical form of exponential families as:
$$p_s(x) = \exp\big(\theta(s)\, t(x) - F(\theta(s))\big).$$
The characteristic function is thus $\phi_s(t) = \frac{\zeta(s + it)}{\zeta(s)}$.
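The canonical form above can be checked numerically against the direct normalized power law. The sketch below uses a plain truncated series for $\zeta$; the function names are ours, not from the paper:

```python
import math

def zeta_fn(s, terms=100_000):
    """Truncated series for the real Riemann zeta function, s > 1."""
    return sum(1.0 / i**s for i in range(1, terms + 1))

def pmf_direct(x, s):
    # p_s(x) = 1 / (x^s * zeta(s))
    return 1.0 / (x**s * zeta_fn(s))

def pmf_canonical(x, s):
    # Canonical form exp(theta * t(x) - F(theta)) with theta = s,
    # t(x) = -log x and F(theta) = log zeta(theta).
    theta, t, F = s, -math.log(x), math.log(zeta_fn(s))
    return math.exp(theta * t - F)

assert abs(zeta_fn(4) - math.pi**4 / 90) < 1e-9   # zeta(4) = pi^4 / 90
assert abs(pmf_direct(3, 4) - pmf_canonical(3, 4)) < 1e-12
```

The two writings agree up to floating-point rounding, since the canonical form simply exponentiates $-s \log x - \log \zeta(s)$.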
Thus, a zeta distribution $p_s(x)$ can be interpreted as the discrete equivalent of a Pareto distribution $q_s(x)$ of scale $1$ and shape $s-1$, with probability density function $q_s(x) = \frac{s-1}{x^s}$ for $x > 1$ (see Table 1).
The values of the zeta function at many positive odd integers are known to be irrational [8,9,10], and the zeta function can be calculated at positive even integers using the Bernoulli numbers $B_{2n}$ [11]: $\zeta(2n) = \frac{(-1)^{n+1} B_{2n} (2\pi)^{2n}}{2\,(2n)!}$ for $n \in \mathbb{N}$. The zeta function can be computed fast [12] and to high precision [13]. The derivatives of the zeta function have also been studied [12,14].
The zeta distributions are related to the Zipf distributions [15] $p_{s,N}(x) \propto \frac{1}{x^s}$ for $x \in \{1, \ldots, N\}$ and to the Zipf–Mandelbrot distributions [16,17] $p_{s,q,N}(x) \propto \frac{1}{(x+q)^s}$ for $x \in \{1, \ldots, N\}$, which play an important role in quantitative linguistics; see [6] for more details. The Zipf and Zipf–Mandelbrot distributions both have finite support and can be interpreted as truncated zeta distributions (a right truncation for the Zipf distributions, and both left and right truncations for the Zipf–Mandelbrot distributions) with normalizing constants that can be approximated using properties of the zeta function [18]. Left-only truncations of the zeta distributions are called Hurwitz zeta distributions [19]. Similarly, truncated Pareto distributions are used in applications [20]. Notice that the truncated distributions of an exponential family with a fixed truncation support form another exponential family [21]. The zeta distributions are infinitely divisible [19,22]: for any number $n$, a random variable following a zeta distribution can be expressed as the sum of $n$ independent and identically distributed random variables. In applications, it is important to quantitatively discriminate between zeta distributions (see, for example, [23,24] or [25]). Mixtures of zeta distributions have also been used to model social networks [26]. In general, products of exponential families yield other exponential families; the products of $d$ zeta distributions form an exponential family called the zeta-star distributions [22].
In this paper, we first study various information-theoretic measures between zeta distributions by considering them as a discrete exponential family [7]: we consider the $\alpha$-divergences [27] between zeta distributions in Section 2 and study their limits, the oriented Kullback–Leibler divergences obtained when $\alpha \to 1$ and $\alpha \to 0$, in Section 3. We then compare these results with the counterpart results obtained for the continuous exponential family of Pareto distributions in Section 4. Finally, we conclude in Section 5.

2. Amari’s α -Divergences and Sharma–Mittal Divergences

To measure the dissimilarity between two zeta distributions $p_{s_1}$ and $p_{s_2}$, one can use the $\alpha$-divergences [27], defined for a real $\alpha \in (0,1)$ as follows:
$$D_\alpha[p_{s_1} : p_{s_2}] := \frac{1}{\alpha(1-\alpha)}\left(1 - I_\alpha[p_{s_1} : p_{s_2}]\right) = D_{1-\alpha}[p_{s_2} : p_{s_1}],$$
where
$$I_\alpha[p_1 : p_2] := \sum_{x=1}^{\infty} p_1(x)^\alpha\, p_2(x)^{1-\alpha} = I_{1-\alpha}[p_2 : p_1]$$
is the $\alpha$-Bhattacharyya coefficient (a similarity measure, also called an affinity coefficient).
It follows from [28] that the skewed Bhattacharyya coefficient amounts to a skewed Jensen divergence between the natural parameters of the exponential family $\mathcal{E}$:
$$I_\alpha[p_{s_1} : p_{s_2}] = \exp\big(-J_{F,\alpha}(s_1 : s_2)\big),$$
where $J_{F,\alpha}$ is the skewed Jensen divergence induced by the strictly convex and smooth function $F(\theta)$:
$$J_{F,\alpha}(s_1 : s_2) := \alpha F(s_1) + (1-\alpha) F(s_2) - F(\alpha s_1 + (1-\alpha) s_2) \geq 0$$
$$= \log \frac{\zeta(s_1)^\alpha\, \zeta(s_2)^{1-\alpha}}{\zeta(\alpha s_1 + (1-\alpha) s_2)}.$$
Thus, we have the α -divergences between two zeta distributions p s 1 and p s 2 available in closed form.
Theorem 1
($\alpha$-divergences between two zeta distributions). The $\alpha$-divergence for $\alpha \in (0,1)$ between two zeta distributions $p_{s_1}$ and $p_{s_2}$ is:
$$D_\alpha[p_{s_1} : p_{s_2}] = \frac{1}{\alpha(1-\alpha)}\left(1 - \frac{\zeta(\alpha s_1 + (1-\alpha) s_2)}{\zeta(s_1)^\alpha\, \zeta(s_2)^{1-\alpha}}\right).$$
It follows that when s 1 , s 2 , and α s 1 + ( 1 α ) s 2 are all positive even integers, we can evaluate exactly the α -divergences between p s 1 and p s 2 .
Example 1.
Consider $s_1 = 4$ and $s_2 = 12$ with $\alpha = \frac{1}{2}$, so that $\alpha s_1 + (1-\alpha) s_2 = 8$. Using the formula [11] $\zeta(2n) = \frac{(-1)^{n+1} B_{2n} (2\pi)^{2n}}{2\,(2n)!}$, $n \in \mathbb{N}$, where $B_{2n}$ denotes the Bernoulli numbers, the zeta function can be calculated exactly at $4$, $8$ and $12$: $\zeta(4) = \frac{\pi^4}{90}$, $\zeta(8) = \frac{\pi^8}{9450}$, and $\zeta(12) = \frac{691\pi^{12}}{638512875}$. The $\alpha$-divergence for $\alpha = \frac{1}{2}$ is twice the squared Hellinger distance: $D_{\frac{1}{2}}[p_{s_1} : p_{s_2}] = 2\sum_{i=1}^{\infty} \big(\sqrt{p_{s_1}(i)} - \sqrt{p_{s_2}(i)}\big)^2$. Thus, we find the exact value $D_{\frac{1}{2}}[p_4 : p_{12}] = 4\left(1 - 3\sqrt{\frac{715}{6910}}\right) \approx 0.139929$.
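The closed form of Theorem 1 can be checked against Example 1 with a short numerical sketch (a truncated $\zeta$ series; the helper names are ours):

```python
import math

def zeta_fn(s, terms=100_000):
    # Truncated series for the real Riemann zeta function, s > 1.
    return sum(1.0 / i**s for i in range(1, terms + 1))

def alpha_divergence(s1, s2, alpha):
    # Theorem 1: closed-form alpha-divergence between two zeta distributions.
    num = zeta_fn(alpha * s1 + (1 - alpha) * s2)
    den = zeta_fn(s1)**alpha * zeta_fn(s2)**(1 - alpha)
    return (1.0 - num / den) / (alpha * (1 - alpha))

# Example 1: D_{1/2}[p_4 : p_12] = 4 (1 - 3 sqrt(715/6910)) ~ 0.139929
exact = 4 * (1 - 3 * math.sqrt(715 / 6910))
assert abs(alpha_divergence(4, 12, 0.5) - exact) < 1e-9
```

The reference-duality $D_\alpha[p_{s_1} : p_{s_2}] = D_{1-\alpha}[p_{s_2} : p_{s_1}]$ also holds numerically for this implementation.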
Let us report another example where the squared Hellinger divergence is expressed using the zeta function:
Example 2.
We consider $s_1 = 3$, $s_2 = 7$ and $\alpha = \frac{1}{2}$, so that $\alpha s_1 + (1-\alpha) s_2 = 5$. Then, we have $D_{\frac{1}{2}}[p_3 : p_7] = 4\left(1 - \frac{\zeta(5)}{\sqrt{\zeta(3)\,\zeta(7)}}\right) \approx 0.23261$.
Since $\lim_{\alpha \to 1} D_\alpha[p_{s_1} : p_{s_2}] = D_{\mathrm{KL}}[p_{s_1} : p_{s_2}]$ is the Kullback–Leibler divergence (KLD) [27],
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] := \sum_{i=1}^{\infty} p_{s_1}(i) \log \frac{p_{s_1}(i)}{p_{s_2}(i)},$$
we can approximate the KLD by $D_{1-\epsilon}[p_{s_1} : p_{s_2}]$ for a small value of $\epsilon$ (say, $\epsilon = 10^{-3}$) using fast methods to compute the zeta function [12].
Corollary 1
(Approximation of the Kullback–Leibler divergence). The Kullback–Leibler divergence between two zeta distributions $p_{s_1}$ and $p_{s_2}$ can be approximated for small values $\epsilon > 0$ by
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] \approx D_{1-\epsilon}[p_{s_1} : p_{s_2}] = \frac{1}{\epsilon(1-\epsilon)}\left(1 - \frac{\zeta((1-\epsilon) s_1 + \epsilon s_2)}{\zeta(s_1)^{1-\epsilon}\, \zeta(s_2)^{\epsilon}}\right).$$
Example 3.
We let $1-\epsilon = 0.99$, $0.999$, $0.9999$, and $0.99999$, and find the following numerical approximations:
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] \approx 0.473 \;(1-\epsilon = 0.99), \quad 0.482 \;(1-\epsilon = 0.999), \quad 0.483 \;(1-\epsilon = 0.9999), \quad 0.483 \;(1-\epsilon = 0.99999).$$
We can also calculate the KLD $D_{\mathrm{KL}}[p_{s_1}^{\mathcal{X}_1} : p_{s_2}^{\mathcal{X}_2}]$ between two truncated zeta distributions with nested supports $\mathcal{X}_1 \subseteq \mathcal{X}_2$; see [21]. A truncated zeta distribution on the support $\{a, a+1, \ldots, b\} \subset \mathbb{N}$ (with $b > a$) has pmf $p_s^{a,b}(x) = \frac{p_s(x)}{\Phi_s(b) - \Phi_s(a-1)}$, where $\Phi_s(u)$ is the cumulative distribution function $\Phi_s(u) = \sum_{x \in \{1, \ldots, u\}} p_s(x) = \frac{1}{\zeta(s)} \sum_{x \in \{1, \ldots, u\}} \frac{1}{x^s}$.
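A truncated zeta pmf can equivalently be normalized directly over its finite support, which sidesteps any CDF bookkeeping; a minimal sketch (our own helper, not from the paper):

```python
def truncated_zeta_pmf(x, s, a, b):
    # pmf proportional to x^{-s} on the finite support {a, ..., b}.
    if not (a <= x <= b):
        return 0.0
    Z = sum(1.0 / i**s for i in range(a, b + 1))  # = zeta(s) * (Phi_s(b) - Phi_s(a-1))
    return (1.0 / x**s) / Z

# The pmf sums to one over its support.
total = sum(truncated_zeta_pmf(x, 2.0, 3, 50) for x in range(3, 51))
assert abs(total - 1.0) < 1e-12
```

With $a = 1$ this recovers a (right-truncated) Zipf distribution.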
The Chernoff information [29] is defined by $C[p_1, p_2] = -\log \min_{\alpha \in (0,1)} I_\alpha[p_1 : p_2]$. The unique optimal value $\alpha^*$ maximizing the Chernoff $\alpha$-divergences $C_\alpha[p_1, p_2] = -\log I_\alpha[p_1 : p_2]$ is called the Chernoff exponent [29], owing to its role in bounding the probability of error in Bayesian hypothesis testing. When both pdfs or pmfs belong to the same exponential family, we have [29]
$$C[p_{\theta_1}, p_{\theta_2}] = J_{F,\alpha^*}(\theta_1 : \theta_2) = B_F\big(\theta_1 : (\theta_1\theta_2)_{\alpha^*}\big) = B_F\big(\theta_2 : (\theta_1\theta_2)_{\alpha^*}\big),$$
where $B_F$ denotes the Bregman divergence (corresponding to the KLD) and $(\theta_1\theta_2)_{\alpha^*} = \alpha^* \theta_1 + (1-\alpha^*) \theta_2$. For a uni-order exponential family such as the zeta distributions, a closed-form formula for the optimal Chernoff exponent $\alpha^*$ is reported in [29]:
$$\alpha^* = \frac{(F')^{-1}\left(\frac{F(\theta_2) - F(\theta_1)}{\theta_2 - \theta_1}\right) - \theta_2}{\theta_1 - \theta_2}.$$
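The defining property of the Chernoff exponent, that the two Bregman divergences to the interpolated parameter coincide at $\alpha^*$, can be verified numerically for the zeta family. The sketch below finds $\alpha^*$ by ternary search on the Jensen divergence (which is strictly concave in $\alpha$) rather than by inverting $F'$; all helper names are ours:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def zeta_fn(s, terms=10_000):
    # Truncated series for the real Riemann zeta function, s > 1.
    return sum(1.0 / i**s for i in range(1, terms + 1))

def F(theta):
    return math.log(zeta_fn(theta))

def jensen(alpha, t1, t2):
    # Skewed Jensen divergence J_{F,alpha}(t1 : t2).
    return alpha * F(t1) + (1 - alpha) * F(t2) - F(alpha * t1 + (1 - alpha) * t2)

def bregman(t_from, t_to, h=1e-6):
    # B_F(t_from : t_to), with a central-difference derivative of F.
    Fp = (F(t_to + h) - F(t_to - h)) / (2 * h)
    return F(t_from) - F(t_to) - (t_from - t_to) * Fp

t1, t2 = 2.0, 6.0
lo, hi = 1e-6, 1 - 1e-6
for _ in range(100):  # ternary search: J_{F,alpha} is strictly concave in alpha
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if jensen(m1, t1, t2) < jensen(m2, t1, t2):
        lo = m1
    else:
        hi = m2
alpha_star = (lo + hi) / 2
t_star = alpha_star * t1 + (1 - alpha_star) * t2
# At alpha*, both Bregman divergences to the interpolated parameter agree
# and equal the Chernoff information.
assert abs(bregman(t1, t_star) - bregman(t2, t_star)) < 1e-6
```

Ternary search is valid here because $\frac{\mathrm{d}^2}{\mathrm{d}\alpha^2} J_{F,\alpha}(\theta_1:\theta_2) = -(\theta_1-\theta_2)^2 F''(\theta_\alpha) < 0$.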
The Sharma–Mittal divergences [30] between two densities $p$ and $q$ form a biparametric family of relative entropies defined by
$$D_{\alpha,\beta}[p : q] = \frac{1}{\beta - 1}\left(\left(\int p(x)^\alpha\, q(x)^{1-\alpha}\, \mathrm{d}x\right)^{\frac{1-\beta}{1-\alpha}} - 1\right), \quad \alpha > 0,\ \alpha \neq 1,\ \beta \neq 1.$$
The Sharma–Mittal divergences are induced by the Sharma–Mittal entropies, which unify the extensive Rényi entropies with the non-extensive Tsallis entropies [30]. They include the Rényi divergences ($\beta \to 1$), the Tsallis divergences ($\beta \to \alpha$), and, in the limit case $\alpha, \beta \to 1$, the Kullback–Leibler divergence [31]. When both densities $p = p_{\theta_1}$ and $q = p_{\theta_2}$ belong to the same exponential family, we have the following closed-form formula [31]:
$$D_{\alpha,\beta}[p_{\theta_1} : p_{\theta_2}] = \frac{1}{\beta - 1}\left(e^{-\frac{1-\beta}{1-\alpha} J_{F,\alpha}(\theta_1 : \theta_2)} - 1\right).$$
Thus, we get the following theorem:
Theorem 2.
For $\alpha > 0$, $\alpha \neq 1$, $\beta \neq 1$, the Sharma–Mittal divergence between two zeta distributions $p_{s_1}$ and $p_{s_2}$ is
$$D_{\alpha,\beta}[p_{s_1} : p_{s_2}] = \frac{1}{\beta - 1}\left(\left(\frac{\zeta(\alpha s_1 + (1-\alpha) s_2)}{\zeta(s_1)^\alpha\, \zeta(s_2)^{1-\alpha}}\right)^{\frac{1-\beta}{1-\alpha}} - 1\right).$$
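As a quick check of Theorem 2, the $\beta \to 1$ limit of the Sharma–Mittal divergence should recover the Rényi divergence $J_{F,\alpha}/(1-\alpha)$; a numerical sketch with assumed helper names:

```python
import math

def zeta_fn(s, terms=100_000):
    # Truncated series for the real Riemann zeta function, s > 1.
    return sum(1.0 / i**s for i in range(1, terms + 1))

def sharma_mittal(s1, s2, alpha, beta):
    # Theorem 2: closed-form Sharma-Mittal divergence between zeta distributions.
    ratio = zeta_fn(alpha * s1 + (1 - alpha) * s2) / (
        zeta_fn(s1)**alpha * zeta_fn(s2)**(1 - alpha))
    return (ratio**((1 - beta) / (1 - alpha)) - 1) / (beta - 1)

def renyi(s1, s2, alpha):
    # Renyi divergence J_{F,alpha}(s1 : s2) / (1 - alpha) for the zeta family.
    J = (alpha * math.log(zeta_fn(s1)) + (1 - alpha) * math.log(zeta_fn(s2))
         - math.log(zeta_fn(alpha * s1 + (1 - alpha) * s2)))
    return J / (1 - alpha)

# beta -> 1 recovers the Renyi divergence.
assert abs(sharma_mittal(3, 7, 0.5, 1 + 1e-7) - renyi(3, 7, 0.5)) < 1e-6
```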

3. The Kullback–Leibler Divergence between Two Zeta Distributions

It is well known that the KLD between two probability mass functions of an exponential family amounts to a reverse Bregman divergence induced by the cumulant function [32]: $D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] = B_F^*(\theta_1 : \theta_2) := B_F(\theta_2 : \theta_1)$ (with $\theta_1 = s_1$ and $\theta_2 = s_2$). Furthermore, this Bregman divergence amounts to a Fenchel–Young divergence [33], so that we have
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] = B_F(\theta_2 : \theta_1) = F(\theta(s_2)) + F^*(\eta(s_1)) - \theta(s_2)\, \eta(s_1),$$
where $F^*(\eta)$ denotes the Legendre convex conjugate of $F$, $\theta(s) = s$, and $\eta(s) = F'(\theta(s)) = E_{p_s}[t(x)] = E_{p_s}[-\log x]$; see [7]. Moreover, the convex conjugate $F^*(\eta(s))$ corresponds to the negentropy [34]: $F^*(\eta(s)) = -H[p_s]$, where the entropy of a zeta distribution $p_s$ is defined by:
$$H[p_s] := \sum_{i=1}^{\infty} p_s(i) \log \frac{1}{p_s(i)}.$$
Using the fact that $\sum_{i=1}^{\infty} p_s(i) = 1 = \sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)}$, we can express the entropy as follows:
$$H[p_s] = \sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)} \log i^s + \log(\zeta(s)) \sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)} = \sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)} \log\big(i^s \zeta(s)\big).$$
Since $F(\theta) = \log \zeta(\theta)$, we have $\eta(\theta) = F'(\theta) = \frac{\zeta'(\theta)}{\zeta(\theta)}$. The function $\frac{\zeta'(\theta)}{\zeta(\theta)}$ has been tabulated in [35] (page 400). Notice that the maximum likelihood estimator [7] from $n$ independently and identically distributed observations $x_1, \ldots, x_n$ is $\hat{\eta} = \frac{1}{n} \sum_{i=1}^n t(x_i)$. Thus we have:
$$\hat{\eta} = \frac{\zeta'(\hat{\theta})}{\zeta(\hat{\theta})} = -\frac{1}{n} \sum_{i=1}^n \log x_i.$$
The inverse of the zeta function $\zeta^{-1}(\cdot)$ has been studied in [36].
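The MLE equation $\hat{\eta} = \zeta'(\hat{\theta})/\zeta(\hat{\theta}) = -\frac{1}{n}\sum_i \log x_i$ can be solved numerically, e.g. by bisection, since $\eta(\theta)$ is strictly increasing on $(1,\infty)$ (its derivative is the Fisher information). A sketch with truncated series; the helper names are ours:

```python
import math

def zeta_fn(s, terms=10_000):
    # Truncated series for the real Riemann zeta function, s > 1.
    return sum(1.0 / i**s for i in range(1, terms + 1))

def zeta_prime(s, terms=10_000):
    # zeta'(s) = -sum log(i) / i^s
    return -sum(math.log(i) / i**s for i in range(1, terms + 1))

def mle_shape(samples):
    # Solve eta(theta) = zeta'(theta)/zeta(theta) = -(1/n) sum log x_i by
    # bisection; eta increases from -inf (theta -> 1+) to 0- (theta -> inf).
    target = -sum(math.log(x) for x in samples) / len(samples)
    lo, hi = 1.0 + 1e-6, 50.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if zeta_prime(mid) / zeta_fn(mid) < target:
            lo = mid   # eta(mid) too small: increase theta
        else:
            hi = mid
    return 0.5 * (lo + hi)

s_hat = mle_shape([1, 1, 1, 2, 1, 3, 1, 1, 2, 1])
assert 1.0 < s_hat < 50.0
```

Samples concentrated on small integers yield a large estimated exponent; heavier-tailed samples pull the estimate toward 1.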
Proposition 1
(KLD between zeta distributions). The Kullback–Leibler divergence between two zeta distributions can be written as:
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] = \log(\zeta(s_2)) - H[p_{s_1}] + s_2\, E_{p_{s_1}}[\log x] = \log(\zeta(s_2)) - \sum_{i=1}^{\infty} \frac{1}{i^{s_1} \zeta(s_1)} \log\big(i^{s_1} \zeta(s_1)\big) - s_2\, \frac{\zeta'(s_1)}{\zeta(s_1)}.$$
Moreover, the logarithmic derivative of the zeta function can be expressed using the von Mangoldt function [37] (page 1850) for $\theta > 1$:
$$\eta(\theta) = \frac{\zeta'(\theta)}{\zeta(\theta)} = -\sum_{i=1}^{\infty} \frac{\Lambda(i)}{i^\theta},$$
where $\Lambda(i) = \log p$ if $i = p^k$ for some prime $p$ and integer $k \geq 1$, and $\Lambda(i) = 0$ otherwise. Notice that the zeta function can be calculated using the Euler product formula: $\zeta(\theta) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-\theta}}$.
Theorem 3.
The Kullback–Leibler divergence between two zeta distributions can be expressed using the real zeta function $\zeta$ and the von Mangoldt function $\Lambda$ as:
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] = \log(\zeta(s_2)) - \sum_{i=1}^{\infty} \frac{1}{i^{s_1} \zeta(s_1)} \log\big(i^{s_1} \zeta(s_1)\big) + s_2 \sum_{i=1}^{\infty} \frac{\Lambda(i)}{i^{s_1}}.$$
Example 4.
Consider $s_1 = 4$ and $s_2 = 12$. Letting $1-\epsilon = 0.9999$ and using Corollary 1, we obtain
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] \approx D_{1-\epsilon}[p_{s_1} : p_{s_2}] = 0.430479743738878.$$
Let us now calculate the KLD using Theorem 3; we obtain $\log(\zeta(s_2)) = \log \frac{691\pi^{12}}{638512875}$, $H[p_{s_1}] \approx 0.3337829096182664$ (using 100 terms), and $\eta(s_1) \approx -0.06366938697034288$ (using 100 terms), so that we have
$$D_{\mathrm{KL}}[p_{s_1} : p_{s_2}] = \log(\zeta(s_2)) - \sum_{i=1}^{\infty} \frac{1}{i^{s_1} \zeta(s_1)} \log\big(i^{s_1} \zeta(s_1)\big) + s_2 \sum_{i=1}^{\infty} \frac{\Lambda(i)}{i^{s_1}} \approx 0.430495790304827.$$
It is well known that the KLD between two arbitrarily close zeta distributions $p_s$ and $p_{s+\mathrm{d}s}$ amounts to half of the quadratic distance induced by the Fisher information:
$$D_{\mathrm{KL}}[p_s : p_{s+\mathrm{d}s}] \approx \frac{1}{2}\, I(s)\, \mathrm{d}s^2,$$
where
$$I(s) = E_{p_s}\big[\big((\log p_s(x))'\big)^2\big] = -E_{p_s}\big[(\log p_s(x))''\big],$$
and where the first-order and second-order derivatives are taken with respect to the parameter $s$. Thus, for uni-order exponential families, the Fisher information is
$$I(s) = -E_{p_s}\big[(\log p_s(x))''\big] = (\log \zeta(s))'' = \frac{\zeta''(s)\, \zeta(s) - \zeta'(s)^2}{\zeta^2(s)}.$$
This second-order derivative $(\log \zeta(s))''$ has been studied in [38]. We have
$$I(s) = \sum_{n=1}^{\infty} \frac{\Lambda(n) \log(n)}{n^s},$$
where $\Lambda(n)$ is the von Mangoldt function.
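The von Mangoldt series for $I(s)$ can be cross-checked against a finite-difference second derivative of $\log \zeta$; a sketch (helper names assumed, not from the paper):

```python
import math

def zeta_fn(s, terms=10_000):
    # Truncated series for the real Riemann zeta function, s > 1.
    return sum(1.0 / i**s for i in range(1, terms + 1))

def von_mangoldt(n):
    # Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0.
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # n is prime

def fisher_information(s, terms=10_000):
    # I(s) = sum_n Lambda(n) log(n) / n^s = (log zeta)''(s)
    return sum(von_mangoldt(n) * math.log(n) / n**s for n in range(2, terms + 1))

# Central-difference second derivative of log zeta at s = 3.
s, h = 3.0, 1e-3
numeric = (math.log(zeta_fn(s + h)) - 2 * math.log(zeta_fn(s))
           + math.log(zeta_fn(s - h))) / h**2
assert abs(fisher_information(s) - numeric) < 1e-4
```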

4. Comparison of the Zeta Family with a Pareto Subfamily

The zeta distribution is also called the “pure power-law distribution” in the literature [2].
We can compute the $\alpha$-divergences between two Pareto distributions $q_{s_1}$ and $q_{s_2}$ with fixed scale $1$ and respective shapes $s_1 - 1$ and $s_2 - 1$. The Pareto density is $q_s(x) = \frac{s-1}{x^s}$ for $x \in \mathcal{X} = (1, \infty)$. The family of such Pareto distributions forms a continuous exponential family with natural parameter $\theta = s$, sufficient statistic $t(x) = -\log(x)$, and convex cumulant function $F(\theta) = -\log(\theta - 1)$ for $\theta \in \Theta = (1, \infty)$. Thus we have [28]:
$$I_\alpha[q_{s_1} : q_{s_2}] = \int q_{s_1}(x)^\alpha\, q_{s_2}(x)^{1-\alpha}\, \mathrm{d}x = \exp\big(-J_{F,\alpha}(\theta_1 : \theta_2)\big) = \frac{(s_1 - 1)^\alpha\, (s_2 - 1)^{1-\alpha}}{\alpha s_1 + (1-\alpha) s_2 - 1},$$
and we obtain the following closed form for the $\alpha$-divergences between two Pareto distributions $q_{s_1}$ and $q_{s_2}$:
$$D_\alpha[q_{s_1} : q_{s_2}] = \frac{1}{\alpha(1-\alpha)}\left(1 - \frac{(s_1 - 1)^\alpha\, (s_2 - 1)^{1-\alpha}}{\alpha s_1 + (1-\alpha) s_2 - 1}\right).$$
The moment parameter is $\eta(\theta) = F'(\theta) = -\frac{1}{\theta - 1}$, so that $\theta(\eta) = 1 - \frac{1}{\eta}$ and $F^*(\eta) = \theta(\eta)\, \eta - F(\theta(\eta)) = \eta - 1 - \log(-\eta)$. It follows that the KLD is
$$D_{\mathrm{KL}}[q_{s_1} : q_{s_2}] = B_F(\theta_2 : \theta_1) = \log \frac{s_1 - 1}{s_2 - 1} + \frac{s_2 - s_1}{s_1 - 1}.$$
The differential entropy of the Pareto distribution $q_s$ is
$$h[q_s] = -\int_1^{\infty} q_s(x) \log q_s(x)\, \mathrm{d}x = -F^*(\eta(s)),$$
with $\eta(s) = -\frac{1}{s-1}$. We find that
$$h[q_s] = 1 + \frac{1}{s-1} - \log(s-1).$$
Example 5.
For comparison, we calculate the KLD between two Pareto distributions with parameters $s_1 = 4$ and $s_2 = 12$. We find
$$D_{\mathrm{KL}}[q_{s_1} : q_{s_2}] = \log \frac{3}{11} + \frac{8}{3} \approx 1.367383682536406.$$
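The closed form of Example 5 can be double-checked by Monte Carlo, sampling from $q_{s_1}$ by inverse transform ($X = U^{-1/(s-1)}$ for $U$ uniform on $(0,1)$); the helper names are ours:

```python
import math
import random

def kl_pareto(s1, s2):
    # Closed-form KLD between unit-scale Pareto densities q_s(x) = (s-1) x^{-s}.
    return math.log((s1 - 1) / (s2 - 1)) + (s2 - s1) / (s1 - 1)

def kl_pareto_mc(s1, s2, n=200_000, seed=0):
    # Monte Carlo estimate of E_{q_{s1}}[log(q_{s1}(X) / q_{s2}(X))].
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.random() ** (-1.0 / (s1 - 1))  # inverse-transform sampling
        total += math.log((s1 - 1) / (s2 - 1)) + (s2 - s1) * math.log(x)
    return total / n

exact = math.log(3 / 11) + 8 / 3   # Example 5: s1 = 4, s2 = 12
assert abs(kl_pareto(4, 12) - exact) < 1e-12
assert abs(kl_pareto_mc(4, 12) - exact) < 0.05
```

Comparing with Example 4, the continuous Pareto KLD ($\approx 1.3674$) is substantially larger than the zeta KLD ($\approx 0.4305$) for the same parameters $(s_1, s_2) = (4, 12)$.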

5. Conclusions

Table 1 compares the discrete exponential family of zeta distributions with the continuous exponential family of Pareto distributions with fixed scale 1.
In general, it is interesting to consider discrete counterparts of continuous exponential families. For example, the discrete Gaussian distributions (or discrete normal distributions), defined as maximum entropy distributions, have been studied in [39,40]. The log-normalizer (or cumulant function) of the discrete Gaussian distributions is related to the Riemann theta function [41]. Given a prescribed sufficient statistic $t(x)$, we may define the continuous exponential family with respect to the Lebesgue measure $\mu$ as the set of probability density functions $p(x)$ maximizing the differential entropy under the moment constraint $E_p[t(x)] = \eta$. The corresponding discrete exponential family is obtained from the probability mass functions maximizing the Shannon entropy under the same moment constraint $E_p[t(x)] = \eta$.
Additional material is available online at https://franknielsen.github.io/ZetaParetoExpFam/index.html (accessed on 18 October 2022).

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Kotz, S.; Balakrishnan, N.; Read, C.; Vidakovic, B. Encyclopedia of Statistical Sciences; Wiley: Hoboken, NJ, USA, 2005; Volume 15.
  2. Goldstein, M.L.; Morris, S.A.; Yen, G.G. Problems with fitting to the power-law distribution. Eur. Phys. J. Condens. Matter Complex Syst. 2004, 41, 255–258.
  3. Titchmarsh, E.C.; Heath-Brown, D.R.; Titchmarsh, E.C.T. The Theory of the Riemann Zeta-Function; Oxford University Press: Oxford, UK, 1986.
  4. Tempesta, P. Group entropies, correlation laws, and zeta functions. Phys. Rev. E 2011, 84, 021121.
  5. Iwaniec, H. Lectures on the Riemann Zeta Function; American Mathematical Society: Providence, RI, USA, 2014; Volume 62.
  6. Nielsen, F. A note on some information-theoretic divergences between Zeta distributions. arXiv 2021, arXiv:2104.10548.
  7. Barndorff-Nielsen, O. Information and Exponential Families in Statistical Theory; John Wiley & Sons: Hoboken, NJ, USA, 2014.
  8. Apéry, R. Irrationalité de ζ(2) et ζ(3). Astérisque 1979, 61, 1.
  9. Rivoal, T. La fonction zêta de Riemann prend une infinité de valeurs irrationnelles aux entiers impairs. C. R. de l'Académie des Sci.-Ser. I-Math. 2000, 331, 267–270.
  10. Fischler, S.; Sprang, J.; Zudilin, W. Many odd zeta values are irrational. Compos. Math. 2019, 155, 938–952.
  11. Graham, R.L.; Knuth, D.E.; Patashnik, O. Concrete Mathematics: A Foundation for Computer Science; Addison-Wesley Professional: Boston, MA, USA, 1994.
  12. Hiary, G.A. Fast methods to compute the Riemann zeta function. Ann. Math. 2011, 174, 891–946.
  13. Johansson, F. Rigorous high-precision computation of the Hurwitz zeta function and its derivatives. Numer. Algorithms 2015, 69, 253–270.
  14. Yildirim, C.Y. A Note on ζ′′(s) and ζ′′′(s). Proc. Am. Math. Soc. 1996, 124, 2311–2314.
  15. Powers, D.M. Applications and explanations of Zipf's law. In New Methods in Language Processing and Computational Natural Language Learning; ACL Anthology: Cambridge, MA, USA, 1998.
  16. Mandelbrot, B. Information Theory and Psycholinguistics: A Theory of Word Frequencies; Readings in Mathematical Social Sciences; MIT Press: Cambridge, MA, USA, 1966.
  17. Lovričević, N.; Pečarić, D.; Pečarić, J. Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities. J. Inequalities Appl. 2018, 2018, 36.
  18. Naldi, M. Approximation of the truncated Zeta distribution and Zipf's law. arXiv 2015, arXiv:1511.01480.
  19. Hu, C.Y.; Iksanov, A.M.; Lin, G.D.; Zakusylo, O.K. The Hurwitz zeta distribution. Aust. N. Z. J. Stat. 2006, 48, 1–6.
  20. Deluca, A.; Corral, Á. Fitting and goodness-of-fit test of non-truncated and truncated power-law distributions. Acta Geophys. 2013, 61, 1351–1394.
  21. Nielsen, F. Statistical Divergences between Densities of Truncated Exponential Families with Nested Supports: Duo Bregman and Duo Jensen Divergences. Entropy 2022, 24, 421.
  22. Saito, S.; Tanaka, T. A note on infinite divisibility of zeta distributions. Appl. Math. Sci. 2012, 6, 1455–1461.
  23. Wang, T.; Zhang, W.; Maunder, R.G.; Hanzo, L. Near-capacity joint source and channel coding of symbol values from an infinite source set using Elias gamma error correction codes. IEEE Trans. Commun. 2013, 62, 280–292.
  24. Oosawa, T.; Matsuda, T. SQL injection attack detection method using the approximation function of zeta distribution. In Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), San Diego, CA, USA, 5–8 October 2014; pp. 819–824.
  25. Doray, L.G.; Luong, A. Quadratic distance estimators for the zeta family. Insur. Math. Econ. 1995, 16, 255–260.
  26. Jung, H.; Phoa, F.K.H. A Mixture Model of Truncated Zeta Distributions with Applications to Scientific Collaboration Networks. Entropy 2021, 23, 502.
  27. Cichocki, A.; Amari, S.-i. Families of alpha-, beta- and gamma-divergences: Flexible and robust measures of similarities. Entropy 2010, 12, 1532–1568.
  28. Nielsen, F.; Boltz, S. The Burbea–Rao and Bhattacharyya centroids. IEEE Trans. Inf. Theory 2011, 57, 5455–5466.
  29. Nielsen, F. An information-geometric characterization of Chernoff information. IEEE Signal Process. Lett. 2013, 20, 269–272.
  30. Sharma, B.D.; Mittal, D.P. New non-additive measures of entropy for discrete probability distributions. J. Math. Sci. 1975, 10, 28–40.
  31. Nielsen, F.; Nock, R. A closed-form expression for the Sharma–Mittal entropy of exponential families. J. Phys. A Math. Theor. 2011, 45, 032003.
  32. Azoury, K.S.; Warmuth, M.K. Relative loss bounds for on-line density estimation with the exponential family of distributions. Mach. Learn. 2001, 43, 211–246.
  33. Nielsen, F. On geodesic triangles with right angles in a dually flat space. In Progress in Information Geometry; Springer: Berlin/Heidelberg, Germany, 2021; pp. 153–190.
  34. Nielsen, F.; Nock, R. Entropies and cross-entropies of exponential families. In Proceedings of the 2010 IEEE International Conference on Image Processing, Washington, DC, USA, 11–12 November 2010; pp. 3621–3624.
  35. Walther, A. Anschauliches zur Riemannschen Zetafunktion. Acta Math. 1926, 48, 393–400.
  36. Kawalec, A. The inverse Riemann zeta function. arXiv 2021, arXiv:2106.06915.
  37. Weisstein, E.W. CRC Concise Encyclopedia of Mathematics; CRC Press: Boca Raton, FL, USA, 2002.
  38. Stopple, J. Notes on log(ζ(s))′′. Rocky Mt. J. Math. 2016, 46, 1701–1715.
  39. Agostini, D.; Améndola, C. Discrete Gaussian distributions via theta functions. SIAM J. Appl. Algebra Geom. 2019, 3, 1–30.
  40. Nielsen, F. The Kullback–Leibler Divergence Between Lattice Gaussian Distributions. J. Indian Inst. Sci. 2022, 1–12.
  41. Deconinck, B.; Heil, M.; Bobenko, A.; Van Hoeij, M.; Schmies, M. Computing Riemann theta functions. Math. Comput. 2004, 73, 1417–1442.
Figure 1. Plot of $F(\theta) = \log \zeta(\theta)$, a strictly convex and analytic function.
Table 1. Comparisons between the zeta family and the Pareto subfamily, both univariate uni-order exponential families $\exp(\theta t(x) - F(\theta))$. The function $\zeta(s)$ is the real Riemann zeta function.

| Quantity | Zeta distribution (discrete EF) | Pareto distribution (continuous EF) |
|---|---|---|
| PMF/PDF | $p_s(x) = \frac{1}{x^s \zeta(s)}$ | $q_s(x) = \frac{s-1}{x^s}$ |
| Support $\mathcal{X}$ | $\mathbb{N} = \{1, 2, \ldots\}$ | $(1, \infty)$ |
| Natural parameter $\theta$ | $s$ | $s$ |
| Cumulant $F(\theta)$ | $\log \zeta(\theta)$ | $-\log(\theta - 1)$ |
| Sufficient statistic $t(x)$ | $-\log x$ | $-\log x$ |
| Moment parameter $\eta$ | $\frac{\zeta'(\theta)}{\zeta(\theta)}$ | $-\frac{1}{s-1}$ |
| Conjugate $F^*(\eta)$ | $-H[p_s] = -\sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)} \log(i^s \zeta(s))$ | $\eta - 1 - \log(-\eta)$ |
| Maximum likelihood estimator | $\hat{\eta} = \frac{\zeta'(\hat{\theta})}{\zeta(\hat{\theta})} = -\frac{1}{n}\sum_{i=1}^n \log x_i$ | $\hat{s} = 1 + \frac{n}{\sum_{i=1}^n \log x_i}$ |
| Fisher information | $\sum_{i=1}^{\infty} \frac{\Lambda(i) \log(i)}{i^s}$ | $\frac{1}{(s-1)^2}$ |
| Entropy $-F^*(\eta(s))$ | $\sum_{i=1}^{\infty} \frac{1}{i^s \zeta(s)} \log(i^s \zeta(s))$ | $1 + \frac{1}{s-1} - \log(s-1)$ |
| Bhattacharyya coefficient $I_\alpha$ | $\frac{\zeta(\alpha s_1 + (1-\alpha) s_2)}{\zeta(s_1)^\alpha \zeta(s_2)^{1-\alpha}}$ | $\frac{(s_1-1)^\alpha (s_2-1)^{1-\alpha}}{\alpha s_1 + (1-\alpha) s_2 - 1}$ |
| Kullback–Leibler divergence | $\log(\zeta(s_2)) - \sum_{i=1}^{\infty} \frac{1}{i^{s_1} \zeta(s_1)} \log(i^{s_1} \zeta(s_1)) - s_2 \frac{\zeta'(s_1)}{\zeta(s_1)}$ | $\log \frac{s_1-1}{s_2-1} + \frac{s_2-s_1}{s_1-1}$ |

Nielsen, F. Comparing the Zeta Distributions with the Pareto Distributions from the Viewpoint of Information Theory and Information Geometry: Discrete versus Continuous Exponential Families of Power Laws. Phys. Sci. Forum 2022, 5, 2. https://doi.org/10.3390/psf2022005002
