Lower Bounds on Multivariate Higher Order Derivatives of Differential Entropy

This paper studies the properties of the derivatives of the differential entropy $H(X_t)$ in Costa’s entropy power inequality. For real-valued random variables, Cheng and Geng conjectured that for $m \ge 1$, $(-1)^{m+1}(d^m/dt^m)H(X_t) \ge 0$, while McKean conjectured a stronger statement, whereby $(-1)^{m+1}(d^m/dt^m)H(X_t) \ge (-1)^{m+1}(d^m/dt^m)H(X_{Gt})$. Here, we study the higher-dimensional analogues of these conjectures. In particular, we study the veracity of the following two statements: $C_1(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t) \ge 0$, where $n$ indicates that $X_t$ is a random vector taking values in $\mathbb{R}^n$, and similarly, $C_2(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t) \ge (-1)^{m+1}(d^m/dt^m)H(X_{Gt}) \ge 0$. In this paper, we prove some new multivariate cases: $C_1(3,i)$, $i = 2, 3, 4$. Motivated by our results, we further propose a weaker version of McKean’s conjecture, $C_3(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t) \ge (-1)^{m+1}\frac{1}{n}(d^m/dt^m)H(X_{Gt})$, which is implied by $C_2(m,n)$ and implies $C_1(m,n)$. We prove some multivariate cases of this conjecture under the log-concave condition: $C_3(3,i)$, $i = 2, 3, 4$, and $C_3(4,2)$. A systematic procedure to prove $C_s(m,n)$ is proposed based on symbolic computation and semidefinite programming, and all the new results mentioned above are explicitly and strictly proved using this procedure.

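Let $X$ be an $n$-dimensional random vector with finite variance and probability density function $p(x)$. For $t > 0$, define $X_t \triangleq X + Z_t$, where $Z_t \sim N_n(0, tI)$ is an independent Gaussian random vector with covariance matrix $tI$. The probability density of $X_t$ is
$$p_t(x_t) = \frac{1}{(2\pi t)^{n/2}} \int_{\mathbb{R}^n} p(x) \exp\left(-\frac{\|x_t - x\|^2}{2t}\right) dx. \qquad (1)$$
Thus, the heat equation holds for $p_t(x_t)$, i.e.,
$$\frac{dp_t}{dt} = \frac{1}{2}\nabla^2 p_t. \qquad (2)$$
The differential entropy of $X_t$ is defined as
$$H(X_t) = -\int_{\mathbb{R}^n} p_t(x_t) \log p_t(x_t)\, dx_t. \qquad (3)$$
Conjecture 1. The first derivative of $H(X_t)$ (i.e., half the Fisher information, by de Bruijn's identity) is completely monotone in $t$, that is,
$$C_1(m,n):\quad (-1)^{m+1}\frac{d^m}{dt^m}H(X_t) \ge 0 \quad \text{for } m \ge 1. \qquad (4)$$
Costa's EPI implies $C_1(1,n)$ and $C_1(2,n)$ [12], and Cheng-Geng proved $C_1(3,1)$ and $C_1(4,1)$ [16].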
Let $X_G \sim N_n(\mu, \sigma^2 I)$ be an $n$-dimensional Gaussian random vector and let $X_{Gt} \triangleq X_G + Z_t$ be the Gaussian $X_t$. McKean [17] proved that $X_{Gt}$ achieves the minimum of $(d/dt)H(X_t)$ and $-(d^2/dt^2)H(X_t)$ subject to $\mathrm{Var}(X_t) = \sigma^2 + t$, and conjectured the general case:
Conjecture 2. The following inequality holds subject to $\mathrm{Var}(X_t) = \sigma^2 + t$:
$$C_2(m,n):\quad (-1)^{m+1}\frac{d^m}{dt^m}H(X_t) \ge (-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt}) \ge 0. \qquad (5)$$
McKean proved $C_2(1,1)$ and $C_2(2,1)$ [17]. Zhang-Anantharam-Geng [18] proved $C_2(3,1)$, $C_2(4,1)$, and $C_2(5,1)$ if the probability density function of $X_t$ is log-concave. Note that $C_2(1,n)$ and $C_2(2,n)$ are immediate consequences of the entropy power inequality and Costa's concavity of entropy power result [12], respectively. In this paper, we notice that in the multivariate case, Conjecture 2 might not be true for $m > 2$ even under the log-concave condition, which motivates us to propose the following weaker conjecture:
Conjecture 3. The following inequality holds subject to $\mathrm{Var}(X_t) = \sigma^2 + t$:
$$C_3(m,n):\quad (-1)^{m+1}\frac{d^m}{dt^m}H(X_t) \ge (-1)^{m+1}\frac{1}{n}\frac{d^m}{dt^m}H(X_{Gt}). \qquad (6)$$
We see that Conjecture 3 coincides with Conjecture 2 for $n = 1$ (the univariate case). Additionally, Conjecture 2 implies Conjecture 3, and Conjecture 3 implies Conjecture 1. The three conjectures give different lower bounds for $(-1)^{m+1}(d^m/dt^m)H(X_t)$. The authors of [14,16] proved some cases of Conjecture 1 by writing the left-hand side of Conjecture 1 as a sum of squares and hence determining its sign. We provide a systematic way to explore this idea using symbolic computation and semidefinite programming and prove several new results in the multivariate cases.
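The chain of implications between the three conjectures can be spelled out in two lines. The following is a sketch in which $A_m$ and $G_m$ are shorthand introduced here (they are not notation from the paper); it uses only $n \ge 1$ and $G_m \ge 0$, the latter being part of the statement of $C_2(m,n)$ in the abstract.

```latex
\begin{align*}
A_m &\triangleq (-1)^{m+1}\frac{d^m}{dt^m}H(X_t), \qquad
G_m \triangleq (-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt}) \ge 0,\\
C_2(m,n):\ A_m \ge G_m
  &\;\Longrightarrow\; A_m \ge G_m \ge \tfrac{1}{n}\,G_m
  \quad (\text{since } n \ge 1 \text{ and } G_m \ge 0),
  \text{ which is } C_3(m,n),\\
C_3(m,n):\ A_m \ge \tfrac{1}{n}\,G_m
  &\;\Longrightarrow\; A_m \ge \tfrac{1}{n}\,G_m \ge 0,
  \text{ which is } C_1(m,n).
\end{align*}
```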

Remark 1. The authors in
Our procedure for proving $C_s(m,n)$ consists of three main ingredients. First, a systematic method is proposed to compute the constraints $R_i$, $i = 1, \dots, N_1$, that are satisfied by $p_t(x_t)$ and its derivatives. The condition that $p_t$ is log-concave can also be reduced to a set of constraints, i.e., $\widehat{R}_j$, $j = 1, \dots, N_2$. Second, based on symbolic computation, the proof of $C_s(m,n)$ is reduced to the following problem: find $e_i \in \mathbb{R}$ and $Q_j$ such that
$$E - \sum_{i=1}^{N_1} e_i R_i - \sum_{j=1}^{N_2} Q_j \widehat{R}_j = S, \qquad (7)$$
where $E$, $Q_j$, and $S$ are polynomials in $p_t$ and its derivatives such that $E$ represents the conjecture, $Q_j \ge 0$, and $S$ is a sum of squares (SOS). Third, problem (7) can be solved with semidefinite programming (SDP) [19,20]. Note that from Equation (7), we can give an explicit and strict proof for $C_s(m,n)$.
In Table 1, we give the data for computing the SOS representation (7) using the Matlab software in Appendix A of [21], where Vars is the number of variables, and $N_1$ and $N_2$ are the numbers of constraints in (7).
Table 1. Data in computing the SOS with symbolic computation and SDP.
The procedure is inspired by the work of [12,14,16,18] and uses basic ideas introduced therein. The specific contributions of this paper are as follows:
(1) Based on symbolic computation and semidefinite programming, $C_s(m,n)$ can be automatically verified with the aid of the software systems Maple and Matlab, and analytical proofs for $C_s(m,n)$ can also be efficiently produced.
(2) The new concept of differentially homogeneous polynomials is introduced and used to reduce the computational complexity. Compared with the pure SDP-based approach (such as [18]), the computational efficiency of our procedure is, in general, much higher. See Procedure 2 for details.
(3) The results in [16,18] are generalized from the univariate cases to the multivariate cases (new results). This is the first attempt at the multivariate higher-order cases of the conjectures.
(4) In comparison to the literature (such as [12,15,16,18]), the constraints (integral or log-concave) considered in this paper are more general.

Proof Procedure
In this section, we provide a general procedure to prove $C_s(m,n)$ for specific values of $s$, $m$, and $n$.

Some Notations
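To simplify the notation, we use $p_t$ to denote $p_t(x_t)$ in the rest of the paper. Denote
$$\mathcal{P}_n = \left\{ \frac{\partial^h p_t}{\partial^{h_1} x_{1,t} \cdots \partial^{h_n} x_{n,t}} : h = \sum_{i=1}^n h_i,\ h_i \in \mathbb{N} \right\}$$
to be the set of all derivatives of $p_t$ with respect to the differential operators $\frac{\partial}{\partial x_{i,t}}$, $i = 1, \dots, n$ (with $h = 0$ giving $p_t$ itself), and $\mathbb{R}[\mathcal{P}_n]$ to be the set of polynomials in $\mathcal{P}_n$ with coefficients in $\mathbb{R}$. For $v \in \mathcal{P}_n$, let $\mathrm{ord}(v)$ be the order of $v$. For a monomial $\prod_{i=1}^r v_i^{d_i}$ with $v_i \in \mathcal{P}_n$, its degree, order, and total order are defined as $\sum_{i=1}^r d_i$, $\max_{i} \mathrm{ord}(v_i)$, and $\sum_{i=1}^r d_i\,\mathrm{ord}(v_i)$, respectively. A polynomial in $\mathbb{R}[\mathcal{P}_n]$ is called a $k$th-order differentially homogeneous polynomial, or simply a $k$th-order differential form, if all its monomials have a degree of $k$ and a total order of $k$. Let $M_{k,n}$ be the set of all monomials which have a degree of $k$ and a total order of $k$. Then, the set of $k$th-order differential forms is an $\mathbb{R}$-linear vector space generated by $M_{k,n}$, which is denoted as $\mathrm{Span}_{\mathbb{R}}(M_{k,n})$.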
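The definition of $M_{k,n}$ can be made concrete by enumerating it. The sketch below is an illustration only, not the authors' Maple code: a derivative is encoded as its multi-index $(h_1, \dots, h_n)$, with the zero multi-index standing for $p_t$ itself, and a monomial as a tuple of $k$ such multi-indices. The function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def derivatives(n, max_order):
    """Multi-indices (h_1, ..., h_n) of all derivatives of p_t with order <= max_order.

    The zero multi-index stands for p_t itself (order 0).
    """
    return [h for h in product(range(max_order + 1), repeat=n) if sum(h) <= max_order]


def differential_monomials(k, n):
    """Monomials of degree k and total order k, i.e. the set M_{k,n}.

    A monomial is stored as a tuple of k multi-indices (its factors,
    repeated according to their exponents).
    """
    derivs = derivatives(n, k)
    return [
        mono
        for mono in combinations_with_replacement(derivs, k)
        if sum(sum(h) for h in mono) == k
    ]


if __name__ == "__main__":
    for k, n in [(3, 1), (3, 3), (3, 4), (4, 2)]:
        print(f"|M_{{{k},{n}}}| =", len(differential_monomials(k, n)))
```

Under this encoding, the sizes of $M_{3,3}$, $M_{3,4}$, and $M_{4,2}$ come out as 38, 80, and 33, which agrees with the numbers of new variables reported in the proofs of the individual cases.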
We will use Gaussian elimination in $\mathrm{Span}_{\mathbb{R}}(M_{k,n})$ by treating the monomials as variables. Unless mentioned otherwise, we always use the lexicographic order for the monomials defined below. Consider two distinct derivatives $v_1 = \frac{\partial^h p_t}{\partial^{h_1} x_{1,t} \cdots \partial^{h_n} x_{n,t}}$ and $v_2 = \frac{\partial^s p_t}{\partial^{s_1} x_{1,t} \cdots \partial^{s_n} x_{n,t}}$. We say $v_1 > v_2$ if $h > s$, or if $h = s$ and there exists an $l$ such that $h_l > s_l$ and $h_j = s_j$ for $j = l + 1, \dots, n$.
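Consider two distinct monomials $m_1 = \prod_{i=1}^r v_i^{d_i}$ and $m_2 = \prod_{i=1}^r v_i^{e_i}$, where $v_i \in \mathcal{P}_n$ and $v_i < v_j$ for $i < j$. We define $m_1 > m_2$ if there exists an $l$ such that $d_l > e_l$ and $d_i = e_i$ for $i = l + 1, \dots, r$.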
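The two orderings translate directly into sort keys. The following sketch assumes the same hypothetical encoding as above (a derivative is a multi-index tuple, a monomial is a tuple of multi-indices) and checks the univariate monomials of degree 3 and total order 3; the helper names are not from the paper.

```python
from collections import Counter


def derivative_key(h):
    """Sort key for a derivative with multi-index h = (h_1, ..., h_n).

    A larger order wins; for equal orders the indices are compared
    from x_n down to x_1, and the first difference decides.
    """
    return (sum(h), tuple(reversed(h)))


def monomial_key(mono, variables):
    """Sort key for a monomial given as a tuple of multi-indices.

    `variables` lists the derivatives in increasing order; exponents are
    compared starting from the largest variable.
    """
    exponents = Counter(mono)
    return tuple(exponents[v] for v in reversed(variables))


# Univariate example (n = 1): p, p', p'', p''' as multi-indices.
variables = sorted([(0,), (1,), (2,), (3,)], key=derivative_key)
m_a = ((1,), (1,), (1,))  # (p')^3
m_b = ((0,), (1,), (2,))  # p p' p''
m_c = ((0,), (0,), (3,))  # p^2 p'''
ordered = sorted([m_a, m_b, m_c], key=lambda m: monomial_key(m, variables), reverse=True)
print(ordered)  # highest monomial first
```

Sorting from high to low places $p_t^2\,p_t'''$ first and $(p_t')^3$ last, because the exponent of the highest derivative is compared first.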
From (1), p t : R n+1 → R is a function in x t and t. Therefore, each polynomial f ∈ R[P n ] is also a function in x t and t, f (t) = R n f dx t is a function in t, and the expectation of f with respect to for all x t ∈ R n and t > 0.

Three Parts of the Proof
In this section, we give the procedure to prove $C_s(m,n)$, which consists of three parts.

Part I
In step 1, we reduce the proof of C s (m, n) into the proof of an integral inequality, as shown by the following lemma, whose proof will be given in Section 2.3: where E s,m,n = ∑ n a 1 =1 · · · ∑ n a m =1 E s,m,n,a m , a m = (a 1 , . . . , a m ), E s,m,n,a m is a 2mth-order differential form in R[P m,n ], and

Part II
In step 2, we compute the constraints, which are relations satisfied by the probability density $p_t$ of $X_t$. In this paper, we consider two types of constraints: integral constraints and log-concave constraints, which will be given in Lemmas 2 and 3, respectively. Since $E_{s,m,n}$ in (8) is a $2m$th-order differential form, we need only the constraints which are $2m$th-order differential forms.
A function f : R n → R is called log-concave if log f is a concave function. In this paper, by the log-concave condition, we mean that the density function p t is log-concave.
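For a positive, twice-differentiable density, the log-concave condition can be restated in terms of $p_t$ and its derivatives. The following is a standard calculus identity, given as a sketch of how log-concavity turns into polynomial inequalities; $\operatorname{Hess}$ denotes the Hessian matrix (a symbol introduced here), and the specific constraints used in the paper are those of Lemma 3.

```latex
\begin{align*}
\operatorname{Hess}(\log p_t)
  &= \frac{p_t \operatorname{Hess}(p_t) - \nabla p_t\,(\nabla p_t)^{\mathsf T}}{p_t^{2}} \preceq 0
  \;\iff\;
  \nabla p_t\,(\nabla p_t)^{\mathsf T} - p_t \operatorname{Hess}(p_t) \succeq 0,\\
n = 1:\quad
  & \Bigl(\frac{\partial p_t}{\partial x_t}\Bigr)^{2}
    - p_t\,\frac{\partial^{2} p_t}{\partial x_t^{2}} \ge 0 .
\end{align*}
```

Each monomial on the left of the last inequality has degree 2 and total order 2, so the inequality is expressed by a differential form in the sense defined above.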

Definition 2.
An $m$th-order log-concave constraint is a $2m$th-order differential form $R$ in $\mathbb{R}[\mathcal{P}_n]$ such that $R \ge 0$ under the log-concave condition.
Note that T k 1 ,...,k l in (11) are not known. For convenience, denote where

Part III
In step 3, we give a procedure to write E s,m,n as an SOS under the constraints, the details of which will be given in Section 2.4. and where S is an SOS. If the log-concave condition is not needed, we may set Q j = 0 for all j.
To summarize the proof procedure, we have the following:
Theorem 1. If Procedure 1 satisfies (13) and (14) for certain $s$, $m$, and $n$, then $C_s(m,n)$ is explicitly and strictly proved.
Proof. With Lemma 1, we have the following proof for $C_s(m,n)$: Equality S1 is true, because $R_i$ is an integral constraint by Lemma 2. By Lemma 3 and (14), $P_j Q_j \ge 0$ is true under the log-concave condition, so inequality S2 is true under the log-concave condition. Finally, inequality S3 is true, because $S$ is an SOS and hence $S \ge 0$.
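The last step of the proof relies on an explicit SOS representation. As a minimal sketch of what such a certificate looks like for a quadratic form, the code below takes a symmetric Gram matrix with rational entries and returns an exact decomposition into weighted squares, or `None` if the matrix is not positive semidefinite. This is not the Matlab SDP software referred to in the paper; it only illustrates the final verification of a fixed matrix, and the function name is hypothetical.

```python
from fractions import Fraction


def sos_from_gram(gram):
    """Exact sum-of-squares certificate for the quadratic form x^T G x.

    `gram` is assumed to be a symmetric matrix with integer or rational
    entries. Returns a list of (d, l) pairs with d > 0 such that
    x^T G x = sum of d * (l . x)^2, or None if G is not positive
    semidefinite.
    """
    n = len(gram)
    a = [[Fraction(x) for x in row] for row in gram]
    terms = []
    for k in range(n):
        d = a[k][k]
        if d < 0:
            return None
        if d == 0:
            # A zero pivot with a nonzero entry in its row rules out PSD.
            if any(a[k][j] != 0 for j in range(n)):
                return None
            continue
        l = [a[k][j] / d for j in range(n)]
        terms.append((d, l))
        # Subtract the rank-one part d * l l^T (symmetric elimination step).
        for i in range(n):
            for j in range(n):
                a[i][j] -= d * l[i] * l[j]
    return terms


certificate = sos_from_gram([[2, 1], [1, 1]])   # 2x^2 + 2xy + y^2
no_certificate = sos_from_gram([[1, 2], [2, 1]])  # x^2 + 4xy + y^2, indefinite
```

For the first matrix the certificate reads $2x^2 + 2xy + y^2 = 2(x + \tfrac{1}{2}y)^2 + \tfrac{1}{2}y^2$. Because the arithmetic is exact, a certificate of this kind involves no numerical error.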

Proof of Lemma 1
Costa [12] proved the following basic properties for p t and H(X t ), where and J(X t ) where is a 2mth-order differential form in R[P m,n ].
To prove Lemma 1 for s = 2, 3, we need to compute (d m /dt m )H(X Gt ). Let X G ∼ N n (µ, σ 2 I) be an n-dimensional Gaussian random vector and X Gt X G + Z t , where Z t ∼ N n (0, tI) is introduced in Section 1. Then, X Gt ∼ N n (µ, (σ 2 + t)I) and the probability density of X Gt is
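Since $X_{Gt} \sim N_n(\mu, (\sigma^2 + t)I)$, the derivatives of its entropy have a closed form. The worked computation below is a sketch based on the standard formula for the differential entropy of a Gaussian vector, not a transcription of the paper's lemma.

```latex
\begin{align*}
H(X_{Gt}) &= \frac{n}{2}\log\bigl(2\pi e(\sigma^2 + t)\bigr),\\
\frac{d^m}{dt^m}H(X_{Gt}) &= \frac{n}{2}\cdot\frac{(-1)^{m-1}(m-1)!}{(\sigma^2 + t)^m},\\
(-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt}) &= \frac{n\,(m-1)!}{2(\sigma^2 + t)^m} > 0 .
\end{align*}
```

Dividing the last expression by $n$ gives $(m-1)!/\bigl(2(\sigma^2 + t)^m\bigr)$, the right-hand side of Conjecture 3, which does not depend on $n$.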
We can now prove Lemma 1 for s = 2, 3. Let where E 1,m,n and E 0,m,n are from Lemmas 4 and 6, respectively. By Lemma 5, Together with Lemma 4, Lemma 1 is proved.

Main Result (Procedure 1)
In this section, we present a detailed version of Procedure 1, called Procedure 2, which is based on symbolic computation and SOS theory.
Procedure 2.
Input: $E_{s,m,n}$ and $R_i$, $i = 1, \dots, N_1$, which are $2m$th-order differential forms in $\mathbb{R}[\mathcal{P}_n]$; $P_j$, $j = 1, \dots, N_2$, which are $2k_j$th-order differential forms in $\mathbb{R}[\mathcal{P}_n]$.
Output: $e_i \in \mathbb{R}$ and $Q_j \in \mathrm{Span}_{\mathbb{R}}(M_{2(m-k_j),n})$ such that (13) and (14) are true, or FAIL, meaning that such $e_i$ and $Q_j$ are not found.
S1. Treat the monomials in $M_{m,n}$ as new variables $m_l$, $l = 1, \dots, N_{m,n}$, which are all the monomials in $\mathbb{R}[\mathcal{P}_n]$ with degree $m$ and total order $m$. We call $m_l m_s$ a quadratic monomial.
S2. Write the monomials in $C_{m,n} = \{R_i, i = 1, \dots, N_1\}$ as quadratic monomials if possible. By performing Gaussian elimination on $C_{m,n}$, treating the monomials as variables and using a monomial order in which a quadratic monomial is less than a non-quadratic monomial, we obtain $\widetilde{C}_{m,n} = C_{m,n,1} \cup C_{m,n,2}$, where $C_{m,n,1}$ is the set of quadratic forms in $m_i$, $C_{m,n,2}$ is the set of non-quadratic forms, and $\mathrm{Span}_{\mathbb{R}}(C_{m,n}) = \mathrm{Span}_{\mathbb{R}}(\widetilde{C}_{m,n})$.
S3. There may exist relationships among the variables $m_i$, which are called intrinsic constraints. For instance, for suitable monomials $m_1$ and $m_2$ together with $m_3 = \left(\frac{\partial p_t}{\partial x_{1,t}}\right)^4$ in $M_{4,n}$, an intrinsic constraint is $m_1 m_3 - m_2^2 = 0$. By adding the intrinsic constraints which are quadratic forms in $m_i$ to $C_{m,n,1}$, we obtain $\widetilde{C}_{m,n,1} = \{\widetilde{R}_i, i = 1, \dots, N_3\}$.
S4. Write each $Q_j$ as a linear combination of the monomials in $M_{2(m-k_j),n}$ with coefficients $q_{j,k}$, where the $q_{j,k}$ are variables to be found later. Let $\widehat{R}_j$ be obtained from $P_j Q_j$ by writing the monomials in $P_j Q_j$ as quadratic monomials in $m_i$ and eliminating the non-quadratic monomials with $C_{m,n,2}$, with the result expressed in terms of forms $h_{j,l}$. If an $h_{j,l}$ is not a quadratic form in $m_i$, then delete $\widehat{R}_j$; hence, only the $\widehat{R}_j$ in quadratic form are selected. Then, denote these constraints as $\widehat{R}_j$, $j = 1, \dots, N_2$, which form the reduced set $\widehat{C}_{m,n}$.
S5. Let $\widetilde{E}_{s,m,n}$ be obtained from $E_{s,m,n}$ by eliminating the non-quadratic monomials using $C_{m,n,2}$ such that $E_{s,m,n} - \widetilde{E}_{s,m,n} \in \mathrm{Span}_{\mathbb{R}}(C_{m,n,2}) \subset \mathrm{Span}_{\mathbb{R}}(C_{m,n})$.
S6. Since $\widetilde{E}_{s,m,n}$, $\widetilde{R}_i$, $i = 1, \dots, N_3$, and $\widehat{R}_j$, $j = 1, \dots, N_2$, are quadratic forms in $m_i$, we can use the Matlab code given in Appendix A of [21] to compute $p_i, q_{j,s} \in \mathbb{R}$ satisfying (21) and (22), in which the resulting expression is an SOS, $c_i, e_{ij} \in \mathbb{R}$, and $c_i \ge 0$. If (21) and (22) cannot be found, return FAIL. (13) and (14) can be obtained from (21) and (22), respectively.
Remark 2. Procedure 2 can be implemented automatically by Maple and Matlab on a computer. In Procedure 2, steps S2, S4, and S5 are based on the symbolic computation theory for reduction, which makes our method more efficient than the pure SDP-based method [18] or a direct theoretical proof [16]. The use of symbolic computation also ensures that our calculation is strict and free of numerical errors.
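The elimination in step S2 can be sketched with exact rational arithmetic. In the toy example below, each constraint is a linear form stored as a dictionary from monomial labels to coefficients, and the labels `u1` (a non-quadratic monomial) and `q1`, `q2` (quadratic monomials) are hypothetical placeholders rather than monomials from the paper. Listing the non-quadratic label first in `order` makes it the preferred pivot, so it is eliminated first.

```python
from fractions import Fraction


def row_reduce(rows, order):
    """Gaussian elimination on linear forms given as {variable: coefficient}.

    `order` lists the variables from highest to lowest elimination priority.
    Returns a fully reduced basis as a list of (pivot, row) pairs.
    """
    rank = {v: i for i, v in enumerate(order)}
    basis = []
    for raw in rows:
        row = {v: Fraction(c) for v, c in raw.items() if c != 0}
        # Remove the pivots of the rows found so far.
        for pivot, b in basis:
            c = row.get(pivot, 0)
            if c:
                for v, bc in b.items():
                    row[v] = row.get(v, 0) - c * bc
                    if row[v] == 0:
                        del row[v]
        if not row:
            continue  # linearly dependent on earlier rows
        pivot = min(row, key=lambda v: rank[v])
        lead = row[pivot]
        row = {v: c / lead for v, c in row.items()}
        # Back-substitute so that each pivot occurs in exactly one row.
        for _, b in basis:
            c = b.get(pivot, 0)
            if c:
                for v, rc in row.items():
                    b[v] = b.get(v, 0) - c * rc
                    if b[v] == 0:
                        del b[v]
        basis.append((pivot, row))
    return basis


constraints = [
    {"u1": 1, "q1": 1, "q2": -1},
    {"u1": 1, "q1": 2, "q2": 1},
]
basis = row_reduce(constraints, order=["u1", "q1", "q2"])
quadratic_part = [row for pivot, row in basis if pivot != "u1"]
```

In this toy run, the row with pivot `u1` plays the role of an element of $C_{m,n,2}$ (it can be used to substitute for the non-quadratic monomial), while `quadratic_part` contains the purely quadratic relation `q1 + 2*q2`, the analogue of an element of $C_{m,n,1}$.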

Remark 3.
Let $R$ be an intrinsic constraint. Then, $R$ becomes zero when replacing $m_i$ by its corresponding monomial in $M_{m,n}$. Therefore, $\mathrm{Span}_{\mathbb{R}}(\widetilde{C}_{m,n,1}) = \mathrm{Span}_{\mathbb{R}}(C_{m,n,1}) \subset \mathrm{Span}_{\mathbb{R}}(C_{m,n})$ in $\mathbb{R}[\mathcal{P}_n]$; that is, we do not need to include the intrinsic constraints in (21). However, these intrinsic constraints are needed when using the Matlab software in Appendix A of [21].
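Intrinsic constraints can be found mechanically: two products $m_i m_j$ and $m_r m_s$ give a constraint $m_i m_j - m_r m_s = 0$ exactly when they coincide as monomials in the derivatives of $p_t$. The sketch below assumes a hypothetical encoding (a derivative as a multi-index tuple, a monomial as a tuple of multi-indices) and 0-based indices in enumeration order, which need not match the ordering used in the paper.

```python
from collections import defaultdict
from itertools import combinations_with_replacement, product


def differential_monomials(k, n):
    """Monomials of degree k and total order k in the derivatives of p_t."""
    derivs = [h for h in product(range(k + 1), repeat=n) if sum(h) <= k]
    return [
        mono
        for mono in combinations_with_replacement(derivs, k)
        if sum(sum(h) for h in mono) == k
    ]


def intrinsic_constraints(k, n):
    """Pairs ((i, j), (r, s)) of index pairs with m_i * m_j == m_r * m_s."""
    mons = differential_monomials(k, n)
    by_product = defaultdict(list)
    for i, j in combinations_with_replacement(range(len(mons)), 2):
        # The product of two monomials is the sorted union of their factors.
        by_product[tuple(sorted(mons[i] + mons[j]))].append((i, j))
    relations = []
    for pairs in by_product.values():
        relations.extend((pairs[0], other) for other in pairs[1:])
    return relations


if __name__ == "__main__":
    for k, n in [(3, 1), (4, 1), (3, 2)]:
        print((k, n), len(intrinsic_constraints(k, n)))
```

With this enumeration, $M_{3,1}$ yields no intrinsic constraints and $M_{3,2}$ yields 15 independent ones, in agreement with the counts reported in the illustrative example and in the proof of $C_1(3,2)$.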

An Illustrative Example
As an illustrative example, we prove C 2 (3, 1) under the log-concave condition using the proof procedure given in Section 2.2. Since n = 1, denote In step 1, by Lemma 1 and (8), we have is a sixth-order differential form.
In step 2, we compute the constraints with Lemmas 2 and 3. With Lemma 2, we find six third-order integral constraints: C 3,1 = {R i , i = 1, . . . , 6}: With Lemma 3, we obtain one third-order log-concave constraint: In step 3, we use Procedure 2 to compute the SOS representation (13) and (14) with the input E 2,3,1 , 3 1 }, which are listed from high to low in the lexicographical monomial order.
S2. By writing monomials in C 3,1 as quadratic monomials in m i if possible and performing Gaussian elimination on C 3,1 , we have S3. There exist no intrinsic constraints and thus, C 3,1, By writing monomials in P 1 Q 1 as quadratic monomials if possible and using C 3,1,2 to eliminate non-quadratic monomials, we obtain

S5.
By writing E 2,3,1 as a quadratic form in m i , we have

S6. We obtain
From Theorem 1 and (23), we have Thus, an explicit and strict proof is given for C 2 (3, 1). Note that this example is also considered in [18] by the pure SDP-based method, which is a semi-automatic algorithm. See Table 1 for the time used to provide analytical proof of this example by our automatic method on a computer.

Compute the Third-Order Constraints
In step 2, we obtain the third-order constraints. We introduce the notation where a, b, c are variables taking values in [n]. Then, The third-order integral constraints are: i,a,b,c , : i = 1, . . . , 955; a, b, c ∈ [n]}, where R i,a,b,c in the form of lengthy formulas can be found in [23]. Note that we do not use all the third-order constraints in [23].

Proof of C 1 (3,2)
The proof follows Procedure 2 with E 1,3,2 given in (26) as the input. To make the proof explicit, we will give the key expressions. In Step S1, the new variables are M 3,2 and are listed in the lexicographical monomial order: In Step S2, the constraints are Removing the repeated ones, we have N 1 = 135. We obtain C 3,2,1 and C 3,2,2 , which contain 48 and 52 constraints, respectively. In Step S3, there exist 15 intrinsic constraints: Thus, C 3,2,1 contains 63 constraints and N 3 = 63.
Step S4 is not needed in the proof of this case. In Step S5, by eliminating the non-quadratic monomials in E 1,3,2 using C 3,2,2 to obtain a quadratic form in m i and then simplifying the quadratic form using C 3,2,1 , we have In Step S6, using the Matlab program in [23] with E 1,3,2 and C 3,2,1 as the input, we find an SOS representation for E 1,3,2 . Thus, by Theorem 1, C 1 (3, 2) is strictly proved.

Proof of C 1 (3,3)
The proof follows Procedure 2 with E 1,3,3 given in (29) as the input. The detailed lengthy formulas can be seen in [23]. In Step S1, the new variables are M 3,3 = {m i , i = 1, . . . , 38} which is the set of all monomials in R[P 3,3 ] with a degree of 3 and a total order of 3, and which are listed in the lexicographical monomial order.
Step S4 is not needed in the proof of this case. In Step S5, by eliminating the non-quadratic monomials in E 1,3,3 using C 3,3,2 and then simplifying the expression using C 3,3,1 , we obtain E 1,3,3 written as a quadratic form in m i . In Step S6, using the Matlab program in [23] with E 1,3,3 and C 3,3,1 as the input, we find an SOS representation for F 3,3 . Thus, using Theorem 1, C 1 (3, 3) is strictly proved.

Proof of C 1 (3,4)
The proof follows Procedure 2 with E 1,3,4 given in (29) as the input. The detailed lengthy formulas can be seen in [23]. In Step S1, the new variables are M 3,4 = {m i , i = 1, . . . , 80} which is the set of all monomials in R[P 3,4 ] with a degree of 3 and a total order of 3, and which are listed in the lexicographical monomial order.
Step S4 is not needed in the proof of this case. In Step S5, by eliminating the non-quadratic monomials in E 1,3,4 using C 3,4,2 to obtain a quadratic form in m i and then simplifying the quadratic form with C 3,4,1 , we obtain E 1,3,4 which is written as a quadratic form in m i . In Step S6, using the Matlab program in [23] with E 1,3,4 and C 3,4,1 as the input, we find an SOS representation for E 1,3,4 . Thus, using Theorem 1, C 1 (3, 4) is strictly proved.

Proof of C 3 (3,n) for n = 2, 3, 4 under the Log-Concave Condition
In this section, we use the procedure in Section 2.2 to prove C 3 (3, n) for n = 2, 3, 4 under the log-concave condition. The detailed lengthy formulas can be seen in [21].
Steps S1-S3 are the same as in the proof of $C_1(3,2)$. In Step S4, we obtain $\widehat{C}_{3,2}$, which contains three quadratic-form constraints. In Step S5, by eliminating the non-quadratic monomials in $E_{3,3,2}$ using $C_{3,2,2}$ and then simplifying the resulting expression using $\widetilde{C}_{3,2,1}$, we obtain $\widetilde{E}_{3,3,2}$ written as a quadratic form in $m_i$. In Step S6, using the Matlab software in Appendix A of [21] with $\widetilde{E}_{3,3,2}$, $\widetilde{C}_{3,2,1}$, and $\widehat{C}_{3,2}$ as the input, we find an SOS representation for $\widetilde{E}_{3,3,2}$. Thus, $C_3(3,2)$ is proved under the log-concave condition. The Maple program for proving $C_3(3,2)$ can be found at https://github.com/cmyuanmmrc/codeforepi/ (accessed on 15 July 2020).

Remark 4.
We fail to prove C 2 (3, 2) even under the log-concave condition using the above procedure. Specifically, we cannot find an SOS representation for E 2,3,2 in Step S6. Since the SDP algorithm is not complete for problem (21), we cannot say that an SOS representation does not exist for E 2,3,2 . The Maple program for C 2 (3,2) can be found at https://github.com/cmyuanmmrc/codeforepi/ (accessed on 15 July 2020).

Proof of C 3 (3,3) and C 3 (3,4)
In this subsection, we prove C 3 (3, 3), C 3 (3,4). Motivated by symmetric functions, for any function f (a, b, c), we have f (a, a, a) From (29) and (32), we obtain From (33), if we prove J 3,3,n ≥ 0, then E 3,3,n ≥ 0. It is clear that J 3,3,n has many fewer terms than E 3,3,n . In J 3,3,n given in (33) and the constraints in (28) and (31), we may consider ∂ ∂x a,t , ∂ and ∂ ∂x c,t as the differential operators without giving concrete values to a, b, and c. First, we prove C 3 (3, 3) using Procedure 2 with J 3,3,3 given in (33) and the constraints in (28) and (31) as the input. In Step S1, the new variables are M 3 = {m i , i = 1, . . . , 38}, which is the set of all the monomials in R[V a,b,c ] with a degree of 3 and a total order of 3.

Proof of C 3 (4,2)
In this section, we use the procedure in Section 2.2 to prove $C_3(4,2)$ under the log-concave condition.
In step 3, we use Procedure 2 to compute the SOS representations (13) and (14) with E 3,4,n , C 4,2 , and C 4,2 as the input. In Step S1, the new variables are M 4,2 = {m i , i = 1, . . . , 33}, which is the set of all monomials in R[P 4,2 ] with a degree of 4 and a total order of 4, and which is listed in the lexicographical monomial order.
To use the SDP approach to prove more difficult problems, two kinds of improvements are needed. First, the size of $E_{s,m,n}$ and the number of constraints increase exponentially as $m$ and $n$ become larger. Thus, we need to find rules that simplify the computation in order to solve problems such as $C_1(3,n)$ for $n > 4$ and $C_3(3,n)$ for $n > 4$ under the log-concave condition. Second, in many cases, such as $C_1(5,1)$ and $C_2(3,2)$ under the log-concave constraint, the SDP software terminates and gives a negative answer. Since the SDP method is not complete for our problem, we do not know whether an SOS representation exists. We thus need a complete method to solve problem (13). Another problem is to find more constraints besides those used in this paper in order to increase the power of the approach.