Maximum Entropy Criterion for Moment Indeterminacy of Probability Densities

We deal with absolutely continuous probability distributions all of whose positive integer-order moments are finite. It is well known that any such distribution is either uniquely determined by its moments (M-determinate) or non-unique (M-indeterminate). In this paper, we follow the maximum entropy approach and establish a new criterion for the M-indeterminacy of distributions on the positive half-line (Stieltjes case). Useful corollaries are derived for M-indeterminate distributions on the whole real line (Hamburger case). We show how the maximum entropy is related to the symmetry property and to M-indeterminacy.


Introduction
When studying probability distributions, one of the challenging questions we arrive at comes from the classical moment problem. The question is whether or not a probability distribution is uniquely determined by the sequence of all its moments, assuming they are finite. The answer can be given for the distribution itself; equivalently, for the associated random variable X; its distribution function F; the density f = F′; or the bounded positive measure µ = µ_F induced by F. Thus, if the answer is positive, we call the distribution (also X, F, f, µ) M-determinate; otherwise, we call it M-indeterminate. (Here, 'M' stands for 'Moment'.) It is well known that if µ is M-indeterminate, then there are infinitely many absolutely continuous distributions, infinitely many purely discrete distributions, and infinitely many singular distributions, all having the same moments as µ.
It is important, from both the theoretical and the applied points of view, to have criteria at hand allowing us to specify/identify the determinacy or indeterminacy property of a distribution. The best is to work with conditions belonging to the group of 'checkable conditions'; comments and references are given at the end of our paper. There is another group of 'non-checkable conditions'. Here belong the well-known classical necessary and sufficient conditions for the (in)determinacy of µ in terms of the limits of the smallest eigenvalues of sequences of Hankel matrices. Our recent review paper [1] describes the whole spectrum, called 'a bunch', of the fundamental results, old and recent. The reader will find in [1] details about the great contributions of T. Stieltjes, H. Hamburger, N. Akhiezer, M. Krein, C. Berg, K. Schmüdgen, M. Putinar, B. Simon, and others. Their works are widely known.
Developments over the last few decades have shown the efficiency of involving the Principle of Maximum Entropy; see, for example, [2-4]. We also use the terms 'maximum entropy approach' and 'maximum entropy method'. For 'maximum entropy', we write the traditional 'MaxEnt'.
The idea of the MaxEnt method consists in selecting the distribution which possesses maximum uncertainty and, at the same time, fulfills the restrictions imposed by the known information.
In general, it is more delicate to deal with M-indeterminate distributions, since we need, for example, to know how to find, describe and work with an infinite family of distributions all having the same moments. In any case, MaxEnt may help to shed light on this 'dark tunnel'.
In this paper, we follow the generally accepted terminology and notations as used in probability theory. We write X ∼ F for a random variable X whose distribution function is F, with f = F′ being the density, and specify the range U of values of X, the support of F, which is assumed to be unbounded. Only in this case can the 'interesting' property of M-indeterminacy appear. We work with the moment sequence {m_k}_{k=0}^∞. For X an absolutely continuous random variable with strictly positive density f, we are looking for conditions, or criteria, guaranteeing the M-determinacy or M-indeterminacy of µ. We use the entropy (also called 'differential entropy'), which is denoted by h_f and defined as follows:

h_f = − ∫_U f(x) ln f(x) dx.

The idea is very natural: we start with the n-truncated moment set {m_k}_{k=0}^n and, based on it, we find the MaxEnt approximant f_n of f and study the limit of the entropy h_{f_n} of f_n as n → ∞.
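As a small numerical illustration of the definition of h_f (our own sketch; the lognormal example is not taken from the text above, but it is the classic M-indeterminate law on the half-line), the differential entropy of the standard lognormal density is finite and matches its known closed form:

```python
import math

def lognormal_pdf(x, mu=0.0, sigma=1.0):
    # density of the lognormal law on (0, inf) -- a classic M-indeterminate distribution
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def differential_entropy(pdf, a, b, n=200_000):
    # h_f = -integral of f(x) ln f(x) over [a, b], via the midpoint rule
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        fx = pdf(a + (i + 0.5) * h)
        if fx > 0.0:
            total -= fx * math.log(fx) * h
    return total

h_numeric = differential_entropy(lognormal_pdf, 1e-9, 60.0)
h_exact = 0.5 * math.log(2.0 * math.pi * math.e)  # mu + (1/2) ln(2*pi*e*sigma^2), with mu=0, sigma=1
print(h_numeric, h_exact)
```

The truncation bound 60.0 and the grid size are arbitrary choices for the sketch; the tail contribution beyond them is below the displayed tolerance.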
There is a remarkable fact, namely, that there are only two possibilities for the 'value' of the limit lim_{n→∞} h_{f_n}: either it is a finite number, or it is 'equal' to −∞. Depending on this limit, we decide that f is M-determinate or M-indeterminate.
It is relevant to mention one of the results proved in ([5], Theorem 1): if an absolutely continuous distribution F with density f is M-determinate, then the sequence of MaxEnt approximants converges in entropy to f. One of our goals in this paper is to involve additional arguments allowing us to show that such a result on entropy convergence can be extended to the case of M-indeterminate distributions.
The remainder of the paper is organized as follows. In Section 2, we briefly recall what we need about Hankel matrices and introduce the MaxEnt setup. In Section 3, we calculate the entropy of densities with a given n-truncated moment set, for fixed n, and also for the entire moment sequence {m_k}_{k=0}^∞. In Section 4, we provide an M-indeterminacy MaxEnt criterion in the Stieltjes case. In Section 5, we present corollaries related to the M-indeterminacy in the Hamburger case. We also discuss the following question: among a family of infinitely many densities all with the same moments, which density has the largest entropy?

Basics of Hankel Matrices and the MaxEnt Setup
When we say that {m_k}_{k=0}^∞ with m_0 = 1 is a moment sequence, it always means that there is a probability measure µ = µ_F 'behind' it. Thus, think of a random variable X defined on an underlying probability space (Ω, F, P), taking values in a set U ⊂ R. If F is its distribution function, F(x) := P[X ≤ x], x ∈ R, then µ = µ_F is the positive Borel measure induced by F. We write just µ.
A basic assumption is that E[|X|^k] < ∞ for all k = 1, 2, . . . . Thus, well defined are the moments

m_k := E[X^k] = ∫_U x^k dF(x), k = 0, 1, 2, . . . .

In what follows, we involve and use the entropy of the strictly positive density f under the constraint of knowing only the n-truncated moment set {m_k}_{k=0}^n. We will see that the MaxEnt formalism allows us to study in parallel both the Hamburger and the Stieltjes cases; hence, we assume that the distributions and their densities have support U = R or U = R_+.
Let us consider the Stieltjes case. For a density f with n-truncated moment set {m_k}_{k=0}^n, there is a density, say f_n, satisfying two properties: (a) the 'first' n moments of f_n are exactly {m_k}_{k=0}^n; (b) f_n maximizes the Shannon entropy.
It is well known, see [2], that

f_n(x) = exp(− ∑_{k=0}^n λ_k x^k), x ∈ U,

where (λ_0, . . . , λ_n) are the Lagrange multipliers satisfying the constraints

∫_U x^k f_n(x) dx = m_k, k = 0, 1, . . . , n.

In this case, we use the simple notation h_{f_n} for the entropy of f_n, remembering that h_{f_n} depends on the moments {m_k}_{k=0}^n. It is easy to see that

h_{f_n} = − ∫_U f_n(x) ln f_n(x) dx = ∑_{k=0}^n λ_k m_k.

We would like the sequence of approximants {f_n} and the entropy sequence {h_{f_n}} to be well defined for any n = 1, 2, . . . . It may happen, see ([6], Theorem 1), that for given f and {m_k}_{k=0}^n, the desired density f_n does not exist, in which case the quantity h_{f_n} is meaningless. However, in the cited paper, the following useful relation is established (the class D_n is defined at the end of this section): sup_{f ∈ D_n} h_f = h_{f_{n−1}}, even if the MaxEnt approach does not apply. Since the entropy is monotone and non-increasing as n increases, the latter equality enables us to simply set h_{f_n} = h_{f_{n−1}}, thus filling the 'gap' left by the non-existing densities f_n. This justifies the assumption made in the sequel, without loss of generality, that all entries of the monotone non-increasing entropy sequence {h_{f_n}}_{n=1}^∞ are well defined.
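For n = 1 in the Stieltjes case, the MaxEnt density can be written in closed form: with only m_1 prescribed, f_1(x) = exp(−λ_0 − λ_1 x) is the exponential density with mean m_1, and h_{f_1} = λ_0 + λ_1 m_1 = 1 + ln m_1. The following sketch (ours, with m_1 = 2 chosen arbitrarily) verifies the constraints and this entropy identity numerically:

```python
import math

m1 = 2.0                 # the single prescribed moment (Stieltjes case, n = 1)
lam1 = 1.0 / m1          # Lagrange multiplier of x in f_1(x) = exp(-lam0 - lam1*x)
lam0 = math.log(m1)      # normalization: exp(-lam0) = 1/m1 makes f_1 integrate to 1

def f1(x):
    # MaxEnt density for the constraints {m_0 = 1, m_1}: exponential law with mean m1
    return math.exp(-lam0 - lam1 * x)

# Midpoint-rule checks of the constraints and of the identity h_{f_n} = sum_k lam_k * m_k
step, upper = 1e-3, 80.0
xs = [(i + 0.5) * step for i in range(int(upper / step))]
mass    = sum(f1(x) for x in xs) * step
mean    = sum(x * f1(x) for x in xs) * step
entropy = -sum(f1(x) * math.log(f1(x)) for x in xs) * step

print(mass, mean, entropy, 1.0 + math.log(m1))  # entropy = lam0 + lam1*m1 = 1 + ln(m1)
```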
In the non-symmetric Hamburger case, once the n-truncated moment set {m_k}_{k=0}^n for even n is assigned, the positivity of the Hankel determinant D_{n,0} = D_{n,0}(m_0, . . . , m_n) guarantees the existence of a MaxEnt solution, see ([6], Appendix A). As a consequence, the entire entropy sequence {h_{f_n}}_{n=1}^∞ is defined. In the symmetric Hamburger problem, the existence of the MaxEnt density is guaranteed under conditions similar to those in the Stieltjes case. Now suppose that {m_k}_{k=0}^{n+1} is a moment set for which we 'keep fixed' (unchanged) the moments {m_0, m_1, . . . , m_n}, while we treat the moment m_{n+1} as 'varying continuously'. Letting b := m_{n+1}, the (n + 1)-truncated moment set can be written as {m_0, m_1, . . . , m_n, b}. Moreover, the existence conditions for a solution of the moment problem require the Hankel determinants to be positive. This is guaranteed by imposing a lower bound on b, say b^-_{p,n+1}, which is the unique real number at which the corresponding Hankel determinant vanishes. As well, due to the MaxEnt machinery, see ([6], Appendix A), in the Stieltjes case, an upper value b^+_{n+1} of b has to be considered. Recall that we deal with the truncated (n + 1)-moment set in which the parameter b stands for the (n + 1)th moment. In the Stieltjes case, we introduce the class D_n of all densities on R_+ whose first n moments are exactly {m_k}_{k=0}^n. Similar notions can be introduced also in the Hamburger case, just replacing R_+ with R.
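The positivity conditions on the Hankel determinants can be checked numerically. The sketch below is our own illustration, assuming the standard conventions H_{n,p} = (m_{i+j+p})_{i,j=0}^n and D_{n,p} = det H_{n,p}, and using the moments m_k = k! of the Exp(1) law as a test case:

```python
import math

def det(mat):
    # cofactor expansion; fine for the small Hankel matrices used here
    n = len(mat)
    if n == 1:
        return mat[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * mat[0][j] * det(minor)
    return total

def hankel(moments, n, p=0):
    # H_{n,p} = (m_{i+j+p}), 0 <= i, j <= n  (p = 0: plain; p = 1: shifted, Stieltjes case)
    return [[moments[i + j + p] for j in range(n + 1)] for i in range(n + 1)]

# Moments of the Exp(1) distribution: m_k = k!
m = [math.factorial(k) for k in range(12)]

D0 = [det(hankel(m, n, 0)) for n in range(5)]  # determinants D_{n,0}
D1 = [det(hankel(m, n, 1)) for n in range(5)]  # shifted determinants D_{n,1}
print(D0, D1)  # all strictly positive: the truncated Stieltjes problems are solvable
```

Since integer arithmetic is exact here, the positivity check is not affected by rounding.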
The class D_n is a convex set for each n, and then D_∞ = ∩_{n=1}^∞ D_n is also convex. We know that in the M-indeterminate case, D_∞ contains 'infinitely many' densities, all being solutions of the same moment problem.
For both the Hamburger and Stieltjes cases, we need to recall a few known facts which will be essentially used later.

Fact 1. We are going to work with the moment sequence {m_k}_{k=0}^∞ whose underlying density f has entropy h_f such that either h_f is finite, or h_f = −∞. To be precise, distributions with h_f = +∞ are not allowed. The reason for this is that once {m_k}_{k=0}^∞ is assigned, the 'option' h_f = +∞ is not feasible, since it is well known in the MaxEnt setup that h_f ≤ h_{f_2}, where h_{f_2} = (1/2) ln(2πe(m_2 − (m_1)^2)) is finite because of Lyapunov's inequality m_2 − (m_1)^2 ≥ 0 (Hamburger case), and h_f ≤ h_{f_1} = 1 + ln m_1 is finite for every m_1 > 0 (Stieltjes case).

Fact 2. Once the moment set {m_k}_{k=0}^n is given and f_n is the corresponding MaxEnt density, the entropy sequence {h_{f_n}}_{n=1}^∞ is monotone non-increasing, and its limit is either finite or −∞.
Fact 3. For consistency between the differential entropy of a continuous random variable and the entropy of its discretization, the differential entropy of any discrete measure, i.e., one comparable with a set of Dirac deltas, is assumed to be −∞; see ([3], pp. 247-249).

Fact 4. If the density f is bounded, this is sufficient to eliminate the option h_f = −∞.
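The two entropy bounds in Fact 1 can be checked numerically. The following sketch is our own test case, assuming the Gamma(2,1) density f(x) = x e^{−x} on (0, ∞), for which m_1 = 2, m_2 = 6 and h_f = 1 + γ (γ is the Euler-Mascheroni constant):

```python
import math

def f(x):
    # Gamma(2,1) density on (0, inf): a test case with m1 = 2, m2 = 6
    return x * math.exp(-x)

step, upper = 1e-3, 80.0
xs = [(i + 0.5) * step for i in range(int(upper / step))]
m1  = sum(x * f(x) for x in xs) * step
m2  = sum(x * x * f(x) for x in xs) * step
h_f = -sum(f(x) * math.log(f(x)) for x in xs) * step

bound_S = 1.0 + math.log(m1)                                      # entropy of the MaxEnt density f_1
bound_H = 0.5 * math.log(2.0 * math.pi * math.e * (m2 - m1 ** 2)) # entropy of the MaxEnt density f_2
print(h_f, bound_S, bound_H)  # h_f is below both bounds
```

Both inequalities are strict here, since the Gamma(2,1) law is neither exponential nor Gaussian.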

Entropy of Densities Which Are M-Indeterminate
The MaxEnt formalism allows us to treat both the Hamburger and Stieltjes cases in a similar way. For the sake of brevity, we confine ourselves mainly to discussions of the Stieltjes case. All arguments can then be easily extended to the Hamburger case. This possibility is one of the advantages of involving the MaxEnt machinery.

Entropy of Densities from the Class D n
We start with the formulation and the proof of the following result.
Theorem 1. Suppose that {m_k}_{k=0}^∞, m_0 = 1, is the full moment sequence of a given density f. For fixed n, based on the n-truncated moment set {m_k}_{k=0}^n, we consider f_n, the MaxEnt approximant of f, and let h_{f_n} be the entropy of f_n. Then, there are infinitely many densities g ∈ D_n whose entropies h_g span an interval, namely,

h_g ∈ (−∞, h_{f_n}].    (1)

Proof. We provide arguments in both cases, Stieltjes and Hamburger.
(a) Stieltjes case. Preliminarily, for fixed n, let us consider f_n and the upper bound b^+_{n+1} of its (n + 1)th order moment. It was mentioned that, in general, b^+_{n+1} is different from m_{n+1}. Our goal is to specify the range of values of the entropy h_g, where g is an arbitrary density from the class D_n. For this, we introduce a suitable subclass E_n ⊂ D_n, consisting of the MaxEnt densities f_{n+1} obtained from the constraints {m_0, m_1, . . . , m_n, b} as b ranges over its admissible interval. Notice that we rely here on the specific truncated moment set (m_0, m_1, . . . , m_n, b^-_{p,n+1}) ∈ ∂(m(D_{n+1})), the boundary of the moment space. Equivalently, the elements of E_n are MaxEnt densities constrained by (m_1, . . . , m_n, b); they belong to D_n and, primarily, they all have analytically tractable entropy. The latter property enables us to calculate the entropy of all g ∈ D_n by evaluating the entropy of all g ∈ E_n.
Let us consider f_{n+1} for b varying in the interval (b^-_{p,n+1}, b^+_{n+1}] and calculate the values spanned by the entropy h_{f_{n+1}}.
Subcase a1. If b = b^+_{n+1}, the right-end point, it is easy to verify that f_{n+1} has a Lagrange multiplier λ_{n+1} = 0, and hence f_{n+1} coincides with f_n; hence,

h_{f_{n+1}} = h_{f_n}.    (2)

Subcase a2. If b is 'close' to the left-end point, i.e., b → b^-_{p,n+1}, we look at the Hankel determinants D_{n,0} and D_{n,1} and see that either D_{n,0} → 0 or D_{n,1} → 0. This implies that the underlying measure µ is discrete; see, for example, ([7], Theorem 1.3, p. 6). Therefore, by Fact 3, the entropy quantity h_{f_{n+1}} is approaching −∞:

h_{f_{n+1}} → −∞ as b → b^-_{p,n+1}.    (3)

It remains to mention an essential property of the entropy h_{f_{n+1}} as a function of the variable b: referring to (Equation (2.73), p. 47), or to the arguments below, we have that h_{f_{n+1}} is monotone increasing with respect to b ∈ (b^-_{p,n+1}, b^+_{n+1}].

(b) Hamburger case. The arguments here are similar to those above. We need to replace E_n by an analogous class based on f_{n+2}, with analogous meaning of all notations; here, b^-_{0,n+2} and b^+_{n+2} bound the interval over which b, now standing for the (n + 2)th moment, varies. In such a case, it is easy to see that the corresponding Lagrange multiplier vanishes at the right-end point, and hence, just as above, we conclude that f_{n+2} satisfies the entropy relation h_{f_{n+2}} = h_{f_n}. Joining together (2) and (3) (with the obvious extension to the Hamburger case) with f_{n+1} or f_{n+2}, and referring to the monotone increase of the entropy with respect to b, we conclude that indeed there are infinitely many densities f_{n+1}, f_{n+2} ∈ E_n whose entropies span the interval in (1), with this property holding for all g ∈ D_n. Theorem 1 is proved.

Entropy of Densities from the Class D ∞
Among the well-known properties of Shannon's entropy, we use its concavity as a functional, which implies that the entropy of all densities g ∈ D ∞ can be calculated.
We start with a Stieltjes moment sequence {m_k}_{k=0}^∞ and calculate the entropy sequence {h_{f_n}}_{n=1}^∞, which is monotone, non-increasing and convergent. Similarly, for a Hamburger moment sequence {m_k}_{k=0}^∞, we calculate the entropy sequence {h_{f_n}}_{n=2}^∞, which is also monotone, non-increasing and convergent.
Let us show first that there exists only one density, say f* ∈ D_∞, such that f* has the largest entropy, i.e.,

h_{f*} := max_{g ∈ D_∞} h_g.

Indeed, the set of solutions to the S-indeterminate moment problem includes infinitely many densities, previously grouped in the convex set D_∞. On the other hand, the continuous entropy functional h : g ↦ h_g = − ∫_U g(x) ln g(x) dx is strictly concave and, over the convex set D_∞, h_g attains its maximum value. Hence, the optimization problem of maximizing h_g over g ∈ D_∞ has indeed a unique solution f* such that h_{f*} = max_{g ∈ D_∞} h_g.
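The uniqueness step can be written out in one line; this is the standard strict-concavity argument, included here as a sketch with the notation used above:

```latex
% If two distinct maximizers g_1, g_2 \in D_\infty both attained
% M := \max_{g \in D_\infty} h_g, strict concavity of h would give
h_{\frac{1}{2}(g_1 + g_2)} \;>\; \tfrac{1}{2}\, h_{g_1} + \tfrac{1}{2}\, h_{g_2} \;=\; M,
\qquad \tfrac{1}{2}(g_1 + g_2) \in D_\infty \ \text{(convexity)},
% contradicting the maximality of M; hence the maximizer f^* is unique.
```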
Relying on Theorem 1, we are ready to calculate the entropy of all moment-equivalent densities g ∈ D_∞. We keep in mind that all densities in the class D_∞ have support R_+ in the Stieltjes case and R in the Hamburger case.
Theorem 2. Suppose that {m_k}_{k=0}^∞ is the full Stieltjes moment sequence of a density f and it is known that f is M-indeterminate. We use f_n and h_{f_n} as before. Then, there are infinitely many densities g ∈ D_∞ whose entropies h_g span an interval, namely,

h_g ∈ (−∞, h_{f*}].

Proof. Note first that each g ∈ D_∞ satisfies g ∈ ∩_{n=1}^∞ D_n; hence, according to Theorem 1, g has entropy h_g ≤ h_{f_n} for every n. Combining this with the existence of the maximal-entropy density f* ∈ D_∞ completes the proof.
We use below, for example, 'S-determinate' or 'H-determinate', meaning that a density is on R_+ or on R, respectively, and is M-determinate there. This similarly applies to S-indeterminacy and H-indeterminacy.
In general, it is not easy to establish the S-determinacy, and hence the S-indeterminacy, through known criteria based on necessary and sufficient conditions. The existence of the density f* with the largest entropy, see Section 3.2, indicates that there is some similarity between the M-determinate and M-indeterminate cases. Consequently, since initially a finite set of moments is involved, the technique of density reconstruction through the MaxEnt approach can be applied without distinguishing these two cases.
With f, {m_1, . . . , m_n}, f_n, h_{f_n} all as above, we give here some details. First, if f is S-determinate and H-determinate, the sequence of approximants {f_n}_{n=1}^∞ converges in entropy to the unique underlying density f, see ([5], Section 3); that is,

lim_{n→∞} h_{f_n} = h_f,

all the involved quantities being finite. However, from the procedure used, relying on the geometrical meaning of Theorem 2.19, p. 72 in [7], it is immediate to deduce that the statement of convergence in entropy extends equally to the case where h_f = −∞. Second, if f is S-indeterminate, the entropy sequence {h_{f_n}}_{n=1}^∞ is monotone non-increasing and hence convergent with lower bound h* = h_{f*}, i.e.,

lim_{n→∞} h_{f_n} ≥ h_{f*}.
It is useful to mention that Theorem 2 and the above comments completely agree with the rationale of the MaxEnt approach: when all known information has been taken into account, the system with maximum entropy is the most probable state, because it is the one in which the least amount of extra information has been assumed.
Moreover, Theorem 2 justifies the approach of reconstructing the density f starting from a finite set of moments and passing to the full moment sequence, regardless of the M-determinacy or M-indeterminacy of f. In any case, that issue is not really of great practical significance. In fact, a full set of moments will 'never' be available; hence, for practical purposes, we deal only with finite n, which is perhaps 'big enough'. Nevertheless, f_n is a valuable approximation of f*, since both f_n and f* have the same first n moments and h_{f_n} > h_{f*}. This fact also corresponds well to the MaxEnt rationale. Thus, the question of moment (in)determinacy of the density f is not essential for the procedure we follow.

Stieltjes Case: MaxEnt Criterion for M-Indeterminacy
We deal with a random variable X on R_+, X ∼ F, f = F′, with all moments {m_k}_{k=1}^∞ finite. Recall that m_0 = 1. We mentioned in the Introduction one fundamental fact: if the distribution F is M-indeterminate, then there are infinitely many distributions of any kind, all having the same moments as F.
Recall that a Stieltjes moment sequence {m_k}_{k=1}^∞ can also be considered a Hamburger moment sequence, i.e., as coming from a random variable on R. We always have to distinguish between M-determinacy and M-indeterminacy by specifying whether it is in the sense of Stieltjes or in the sense of Hamburger. We use below the obvious terms S-determinate, S-indeterminate, H-determinate and H-indeterminate, in their short forms S-det, S-indet, H-det and H-indet. Let us list the possibilities for the distribution F: if F is H-det, then F is also S-det; if F is H-indet, then either F is also S-indet or, and it may look a little 'surprising', F is S-det.
Thus, we have three cases; they will be discussed below. Relying on the results in Section 3, we now provide a MaxEnt criterion for M-indeterminacy in the Stieltjes case: the Stieltjes moment sequence {m_k}_{k=1}^∞ corresponds to the moments of infinitely many distributions on R_+; equivalently, the distribution F is M-indeterminate, if and only if the relation stated in Theorem 3 holds. Proof. First, we recall the well-known result according to which, if n is odd, the estimator f

Besides the above sources, we can also refer to the notion of a Stieltjes class, S(f, p), introduced for any M-indeterminate distribution, see [11]. Recall that

S(f, p) = { f_ε(x) = f(x)[1 + ε p(x)], x ∈ R_+ : ε ∈ [−1, 1] },

where f = F′ is the density of the M-indeterminate distribution F, and p(x), x ∈ R_+, called a 'perturbation function', is a sign-varying function with norm ||p|| = 1 satisfying the 'vanishing moments' property,

∫ x^k f(x) p(x) dx = 0, k = 0, 1, 2, . . . .

Another related recent work is [12].
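The vanishing-moments property can be illustrated numerically. The sketch below is our own example, assuming the classic choice f = standard lognormal density and p(x) = sin(2π ln x) (Heyde's construction, not one of the examples above); after the substitution x = e^t, each moment integral of f·p reduces to a Gaussian integral that vanishes for integer k:

```python
import math

def integrand(t, k):
    # substitution x = e^t turns the k-th moment of f(x)*sin(2*pi*ln x), f the standard
    # lognormal density, into the integral over R of e^{k t} * phi(t) * sin(2*pi*t),
    # where phi is the standard normal density
    return math.exp(k * t - 0.5 * t * t) * math.sin(2.0 * math.pi * t) / math.sqrt(2.0 * math.pi)

def perturbed_moment(k, a=-12.0, b=16.0, n=200_000):
    # midpoint rule; the integrand is smooth with Gaussian decay, so [a, b] captures it
    h = (b - a) / n
    return sum(integrand(a + (i + 0.5) * h, k) for i in range(n)) * h

vals = [perturbed_moment(k) for k in range(4)]
print(vals)  # all close to 0: f(x)[1 + eps*sin(2*pi*ln x)] has the lognormal moments
```

The tolerances below are relative to the scale e^{k²/2} of the k-th lognormal moment.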
It turns out that the MaxEnt technique enables us to make a further refinement of what we know about the symmetric solutions, and also about the measures, which are M-indeterminate.

Theorem 4. Suppose that F is an arbitrary distribution on R with finite moments and moment sequence {m_0, 0, m_2, 0, m_4, . . .} (Hamburger case), and that F is M-indeterminate. Then, the density f*, see Section 3.2, with the largest entropy, is symmetric.
Proof. Consider an arbitrary non-symmetric density g = (g(x), x ∈ R) ∈ D_∞. It is easy to verify that its reflection ḡ = (g(−x), x ∈ R) is such that ḡ ∈ D_∞. Moreover, g and ḡ have the same entropy, i.e., h_g = h_ḡ. Consider the densities g* and ḡ* for which the entropies h_g and h_ḡ are maximal. They are both in the set D_∞. Combining the above general statement with the uniqueness of the MaxEnt density g*, it follows that g* ≡ ḡ*; hence, g* is symmetric.

Brief Conclusions
In this paper, we establish a new criterion for the M-indeterminacy of a probability density on the positive half-line (Stieltjes case) by involving the MaxEnt approach. Interesting corollaries are derived for probability densities on the whole real line (Hamburger case). The obtained results are new, and they can be considered a valuable addition to the results based on the two groups of conditions, called 'checkable' and 'uncheckable', for either the M-determinacy or M-indeterminacy of distributions.
The recent review paper [1] contains a comprehensive description of significant results based on 'uncheckable conditions', including two illustrations of how to use this kind of condition as an indication of a specific property of a distribution in terms of its moments. However, from the applied point of view, most useful are the results involving 'checkable conditions'. The reader is referred to the following sources: [10,13,14]. The property 'M-indeterminacy', besides its non-triviality as a mathematical phenomenon, arises in important applied areas. Among them are atmospheric studies, gravity theory and quantum mechanics; see, for example, [15-17]. The involvement of the MaxEnt technique may lead to challenging theoretical problems; however, the answers, when available, would shed additional light on the analysis of applied problems.

Theorem 3 (Main result). Let f be a probability density with all moments finite. Denote by m := {m_k}_{k=1}^∞ its full moment sequence and by m_n := {m_k}_{k=1}^n its nth truncated set. If m is considered as a Stieltjes moment sequence, we write f^(S)_n for the MaxEnt approximant of f based on m_n. Similarly, f^(H)_n stands for the MaxEnt approximant of f based on m_n when considering m as a Hamburger moment sequence. For the entropy, we use the notations h_{f^(S)_n} and h_{f^(H)_n}.

k = 1, 2, . . . , and also the moment sequence {m_k}_{k=0}^∞. If U = R, we say that {m_k}_{k=0}^∞ is a Hamburger moment sequence, while for U = R_+, {m_k}_{k=0}^∞ is a Stieltjes moment sequence. For any moment sequence {m_k}_{k=0}^∞, we define a few infinite sequences of Hankel matrices, namely, {H_n}_{n=1}^∞ and {H_{n,p}}_{n=1}^∞, and their determinants, as follows:

H_n := (m_{i+j})_{i,j=0}^n, H_{n,p} := (m_{i+j+p})_{i,j=0}^n, D_n := det H_n, D_{n,p} := det H_{n,p}.

Suppose that F is a distribution function on R with moment sequence {m_0, 0, m_2, 0, m_4, . . .} (Hamburger case). Then, if F is M-indeterminate, both symmetric and non-symmetric solutions exist.