Square Root Convexity of Fisher Information along Heat Flow in Dimension Two

Recently, Ledoux, Nair, and Wang proved that the Fisher information along the heat flow is log-convex in dimension one, that is, $\frac{d^2}{dt^2}\log I(X_t) \ge 0$ for $n = 1$, where $X_t$ is a random variable whose density function satisfies the heat equation. In this paper, we consider the higher-dimensional case and prove that the Fisher information is square root convex in dimension two, that is, $\frac{d^2}{dt^2}\sqrt{I(X_t)} \ge 0$ for $n = 2$. The proof is based on the semidefinite programming approach.


Introduction
Let $X$ be a random variable defined on $\mathbb{R}^n$ with density function $f(x)$, which is assumed to be differentiable. The differential entropy $H(X)$ and the Fisher information $I(X)$ of $X$ are, respectively, defined to be
$$H(X) := -\int_{\mathbb{R}^n} f(x)\log f(x)\,dx, \qquad I(X) := \int_{\mathbb{R}^n}\frac{|\nabla f(x)|^2}{f(x)}\,dx.$$
In 1948, Shannon [1] proposed the entropy power inequality (EPI) $N(X+Y) \ge N(X) + N(Y)$, where $X$ and $Y$ are independent random variables defined on $\mathbb{R}^n$ and $N(X) := \exp\left(\frac{2}{n}H(X)\right)/(2\pi e)$. As one of the most important inequalities in information theory, Shannon's EPI has many proofs and applications [2][3][4][5][6].
In 1985, Costa [7] proved a generalization of Shannon's EPI, namely that the entropy power $N(X_t)$ of $X_t = X + \sqrt{t}Z$ is concave in $t$, where $X$ is a random variable and $Z \sim N(0, I_n)$ is an $n$-dimensional standard normal random variable independent of $X$. This inequality also has many proofs and applications [8][9][10][11].
Costa also proved that $\frac{d}{dt}H(X_t) \ge 0$ and $\frac{d^2}{dt^2}H(X_t) \le 0$ [7] (Corollary 1). Along this line, Cheng and Geng [12] proposed the completely monotone conjecture (CMC) $(-1)^{m+1}\frac{d^m}{dt^m}H(X_t) \ge 0$, $m \in \mathbb{N}_+$, and proved the conjecture for $m = 3, 4$ and $n = 1$. Guo, Yuan, and Gao [13] proved the conjecture in the cases $m = 3$, $n = 2, 3, 4$ and the case $m = 4$, $n = 2$, using semidefinite programming (SDP) software. Other related results were also obtained based on the SDP approach [14,15]. The CMC was implicitly considered by McKean [16] in studying the entropy for solutions of the heat equation $u_t = \Delta u$. The density function of $X_t$ is a solution of the heat equation $u_t = \frac{1}{2}\Delta u$ [2]. Interestingly, the converse is also true; that is, if the density function of a random variable $Y_t$ is a solution of the heat equation, then $Y_t$ has the form of $X_t$ [11]. Thus, studying properties of $H(X_t)$ and $I(X_t)$ is equivalent to studying those of a probability measure satisfying the heat equation.
Cheng and Geng [12] also proposed the log-convexity conjecture: the Fisher information along the heat flow is log-convex, which can be deduced from CMC. In 2021, Ledoux, Nair, and Wang [17] proved the log-convexity conjecture for n = 1.
In this paper, we consider the two-dimensional case as suggested in [17]. We prove the square root convexity (abbr. sqrt-convexity) of Fisher information along heat flow in dimension two. Precisely, we prove the following result.
Theorem 1. Let $X$ be a random variable defined on $\mathbb{R}^2$, $Z \sim N(0, I_2)$ a Gaussian variable independent of $X$, and $X_t = X + \sqrt{t}Z$. Then we have
$$\frac{d^2}{dt^2}\sqrt{I(X_t)} \ge 0. \qquad (1)$$
The main idea of the proof is that inequality (1) can be reduced to deciding whether a quadratic polynomial is a sum of squares (SOS) [18] of linear forms, which can be solved with SDP [19]. The SOS is given explicitly, which provides a rigorous proof of the theorem. The SDP problem associated with Theorem 1 has 71 variables, which makes it difficult to solve by manual calculation.
We also show that the log-convexity of the Fisher information along heat flow in dimension two cannot be proven with the SDP approach. More precisely, the SDP software program terminates but fails to give a solution that proves the log-convexity. This does not imply that the log-convexity in dimension two is false, because the SOS problem solved by the SDP program is only a sufficient condition, not a necessary one, for the log-convexity. Theorem 1 is proven as a weaker form of the log-convexity conjecture for $n = 2$. We also show that Theorem 1 implies the CMC for the third-order derivative in dimension two without assuming the log-concavity of $p(x)$. Refer to Corollary 1 for details.
In Theorem 1, we do not assume that $X$ is a log-concave variable. If the log-concave condition is added, then from Toscani [20], $1/I(X_t)$ is concave, which implies inequality (1); the proof can be found in Lemma 2.
A drawback of the approach based on SDP is that the proof is difficult for people to check. Although the SOS gives an explicit proof of the theorem, it is too large to be verified manually. To alleviate this problem, we provide the programs and data on GitHub, so that interested readers may check the proof using software systems. Refer to Remark 2 for details on how to do this. We also illustrate the method by proving Theorem 1 for the case $n = 1$ in Section 3.1. On the other hand, in the proof of information inequalities, it often happens that the computation is too large to be performed manually, and using computer programs has become one of the major approaches in proving information inequalities [14,[21][22][23][24][25].
To show our result more intuitively, we give the figures of $I(X_t)$ and $\log I(X_t)$ in Figure 1 for a specific choice of the density $p(y_1, y_2)$ in Equation (2). In this case, both $I(X_t)$ and $\log I(X_t)$ are convex in $t$.

Notations and Preliminary Results
Let $X$ be a random variable defined on $\mathbb{R}^n$ with density function $p(x)$, which is assumed to be differentiable, and let $Z \sim N(0, I_n)$ be an $n$-dimensional standard normal random variable independent of $X$. Then $X_t = X + \sqrt{t}Z$ is also a random variable defined on $\mathbb{R}^n$, with density function
$$f(x,t) = \frac{1}{(2\pi t)^{n/2}}\int_{\mathbb{R}^n} p(y)\exp\left(-\frac{|x-y|^2}{2t}\right)dy, \qquad (2)$$
which is differentiable since $p(x)$ is. It is known that $f(x,t)$ satisfies the heat equation [2]
$$\frac{\partial f}{\partial t} = \frac{1}{2}\sum_{i=1}^{n}\frac{\partial^2 f}{\partial x_i^2}.$$
The differential entropy $H(X_t)$ and Fisher information $I(X_t)$ of $X_t$ are, respectively, defined as
$$H(X_t) := -\int_{\mathbb{R}^n} f\log f\,dx, \qquad I(X_t) := \int_{\mathbb{R}^n}\frac{|\nabla f|^2}{f}\,dx.$$
For convenience, we use $H(t)$ and $I(t)$ to denote $H(X_t)$ and $I(X_t)$ in the rest of the paper. We can easily obtain the following relation between $H(t)$ and $I(t)$ by de Bruijn's identity [2]:
$$\frac{d}{dt}H(t) = \frac{1}{2}I(t). \qquad (3)$$
By the definition of $I(t)$, the Fisher information is always positive, so we can take the square root of it. By Equation (3) and the fact $\frac{d^2}{dt^2}H(t) \le 0$ [7], the first derivative of the Fisher information is always non-positive:
$$\frac{d}{dt}I(t) = 2\frac{d^2}{dt^2}H(t) \le 0. \qquad (4)$$
A function $f(t)$ is called sqrt-convex in $t$ if the square root of $f(t)$ is convex in $t$. The following lemma gives an equivalent form of sqrt-convexity, which will be used in the proof of Lemma 10.

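Lemma 1. Theorem 1 is valid, that is, $I(t)$ is sqrt-convex in $t$, if and only if
$$2I(t)\frac{d^2}{dt^2}I(t) - \left(\frac{d}{dt}I(t)\right)^2 \ge 0. \qquad (5)$$
Proof. The convexity of $\sqrt{I(t)}$ is equivalent to the second-order derivative of $\sqrt{I(t)}$ being non-negative. From Equation (4), we have
$$\frac{d^2}{dt^2}\sqrt{I(t)} = \frac{2I(t)\frac{d^2}{dt^2}I(t) - \left(\frac{d}{dt}I(t)\right)^2}{4I(t)^{3/2}}.$$
Since $I(t) > 0$, the lemma is proven.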
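As a numerical sanity check (not part of the proof), the quantity $2I(t)I''(t) - I'(t)^2$, whose non-negativity is equivalent to the convexity of $\sqrt{I(t)}$, can be estimated on a grid. The sketch below assumes a hypothetical equal-weight Gaussian mixture for $X$ in dimension two, so that $X_t$ is again a mixture with each component variance increased by $t$. The function names, grid size, and step size $h$ are illustrative choices, not taken from the paper.

```python
import math

def fisher_information(t, means, sigma2=1.0, half_width=12.0, n=161):
    """Grid approximation of I(X_t) when X is an equal-weight mixture of
    N(mean, sigma2 * I_2); X_t is then a mixture of N(mean, (sigma2 + t) * I_2)."""
    s = sigma2 + t                        # component variance of X_t
    dx = 2.0 * half_width / (n - 1)       # grid spacing in both coordinates
    weight = 1.0 / (len(means) * 2.0 * math.pi * s)
    total = 0.0
    for i in range(n):
        x1 = -half_width + i * dx
        for j in range(n):
            x2 = -half_width + j * dx
            f = g1 = g2 = 0.0             # density and its two partial derivatives
            for m1, m2 in means:
                phi = weight * math.exp(-((x1 - m1) ** 2 + (x2 - m2) ** 2) / (2.0 * s))
                f += phi
                g1 -= phi * (x1 - m1) / s
                g2 -= phi * (x2 - m2) / s
            if f > 0.0:
                total += (g1 * g1 + g2 * g2) / f
    return total * dx * dx

def lemma1_quantity(t, means, h=0.05):
    """Central finite-difference estimate of 2 I(t) I''(t) - I'(t)^2."""
    i_minus = fisher_information(t - h, means)
    i_zero = fisher_information(t, means)
    i_plus = fisher_information(t + h, means)
    first = (i_plus - i_minus) / (2.0 * h)
    second = (i_plus - 2.0 * i_zero + i_minus) / (h * h)
    return 2.0 * i_zero * second - first * first

if __name__ == "__main__":
    mixture = [(-1.5, 0.0), (1.5, 0.0)]   # hypothetical example density
    for t in (0.5, 1.0, 2.0):
        print(t, fisher_information(t, mixture), lemma1_quantity(t, mixture))
```

For a single Gaussian component the closed form $I(t) = 2/(\sigma^2 + t)$ gives $2II'' - I'^2 = 12/(\sigma^2+t)^4$, which is a convenient reference value for the finite-difference estimate.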

Corollary 1. If $I(t)$ is sqrt-convex in $t$ for $n = 2$, then the CMC holds for the third-order derivative in dimension two.
Lemma 2 gives the relationship among sqrt-convexity, log-convexity, and concavity of $1/I(t)$.
Lemma 2. If $1/I(t)$ is concave in $t$, then $\log(I(t))$ is convex in $t$. If $\log(I(t))$ is convex in $t$, then $I(t)$ is sqrt-convex in $t$.

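Proof. Write $I' = \frac{d}{dt}I(t)$ and $I'' = \frac{d^2}{dt^2}I(t)$. Since
$$\frac{d^2}{dt^2}\frac{1}{I(t)} = \frac{2(I')^2 - I\,I''}{I^3}, \qquad \frac{d^2}{dt^2}\log I(t) = \frac{I\,I'' - (I')^2}{I^2},$$
concavity of $1/I(t)$ gives $I\,I'' \ge 2(I')^2 \ge (I')^2$, which means that $\log(I(t))$ is convex. Similarly, convexity of $\log(I(t))$ means that $I\,I'' \ge (I')^2$, hence $2I\,I'' - (I')^2 \ge (I')^2 \ge 0$, and $I(t)$ is sqrt-convex by Lemma 1.
We consider the two-dimensional case and suppose that the two variables are $x_1$ and $x_2$. We write $f_{a,b} := \frac{\partial^{a+b} f}{\partial x_1^a\,\partial x_2^b}$, with $f_{0,0} = f$. Then we can rewrite the Fisher information as
$$I(t) = \int_{\mathbb{R}^2}\frac{f_{1,0}^2 + f_{0,1}^2}{f}\,dx_1dx_2 \qquad (6)$$
and the heat equation as
$$\frac{\partial f}{\partial t} = \frac{1}{2}\left(f_{2,0} + f_{0,2}\right). \qquad (7)$$
By Equation (7), it is easy to see that for each $a, b \ge 0$,
$$\frac{\partial f_{a,b}}{\partial t} = \frac{1}{2}\left(f_{a+2,b} + f_{a,b+2}\right). \qquad (8)$$
In the following, we formally define the concept of differential forms, which are used to reduce the size of the SDP problems to be solved. Refer to Remark 1 for details.
A differential monomial is a product of factors of the form $f_{a,b}$; its total degree is the number of factors and its total order is the sum of the orders $a + b$ of its factors. A differential polynomial is a finite linear combination of differential monomials over $\mathbb{Q}$. A differential polynomial $P$ is called a $k$-th order differentially homogeneous polynomial, or simply a $k$-th order differential form, if each of its differential monomials is of total degree $k$ and total order $k$.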
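To make the bookkeeping concrete, the following sketch enumerates differential monomials of a given total degree and total order in two variables. It assumes a monomial is represented as a tuple of index pairs $(a, b)$, one per factor $f_{a,b}$, with $(0, 0)$ standing for $f$ itself; this representation and the function names are illustrative and not taken from the paper's code.

```python
from itertools import combinations_with_replacement

def derivative_symbols(max_order):
    """Index pairs (a, b) with a + b <= max_order, i.e. the symbols f_{a,b}."""
    return [(a, k - a) for k in range(max_order + 1) for a in range(k, -1, -1)]

def differential_monomials(degree, order):
    """All differential monomials with the given total degree and total order.

    Each monomial is a tuple of (a, b) pairs in a fixed canonical order,
    so every multiset of factors appears exactly once.
    """
    symbols = derivative_symbols(order)
    return [m for m in combinations_with_replacement(symbols, degree)
            if sum(a + b for a, b in m) == order]

def show(monomial):
    """Human-readable form, e.g. 'f f_{0,1} f_{2,0}'."""
    return " ".join("f" if ab == (0, 0) else "f_{%d,%d}" % ab for ab in monomial)

if __name__ == "__main__":
    basis = differential_monomials(3, 3)
    print(len(basis))          # 14 monomials of total degree 3 and total order 3
    for m in basis:
        print(show(m))
```

The count of 14 agrees with the size of the variable set used later to build the quadratic forms.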
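In Lemma 3, we compute the expressions of $I(t)$, $\frac{d}{dt}I(t)$, and $\frac{d^2}{dt^2}I(t)$.

Lemma 3. We have
$$I(t) = \int_{\mathbb{R}^2}\frac{I_1}{f}\,dx_1dx_2, \qquad \frac{d}{dt}I(t) = \int_{\mathbb{R}^2}\frac{I_2}{f^3}\,dx_1dx_2, \qquad \frac{d^2}{dt^2}I(t) = \int_{\mathbb{R}^2}\frac{I_3}{f^5}\,dx_1dx_2, \qquad (9)$$
where each $I_i$ is a $2i$-th order differential form for $i = 1, 2, 3$.
Proof. By Equation (6), $I_1 = f_{1,0}^2 + f_{0,1}^2$ is a second-order differential form, so the lemma is correct for $I_1$. For $I_2$, differentiating Equation (6) with respect to $t$ and using Equation (8) gives the expression of $I_2$ in Equation (11), in which every monomial has total degree 4 and total order 4. Thus, $I_2$ is a fourth-order differential form. Similarly, we can show that $I_3$, given in Equation (13), is a sixth-order differential form. The lemma is proven.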
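The first differentiation step can be written out as follows. This is a worked sketch under stated assumptions: $f_{a,b}$ denotes $\partial^{a+b}f/\partial x_1^a\partial x_2^b$, the Fisher information is $I(t) = \int (f_{1,0}^2 + f_{0,1}^2)/f$, and the heat equation reads $\partial_t f = \frac12(f_{2,0} + f_{0,2})$. It shows the numerator before any integration by parts, so it may differ from the form of $I_2$ used in the paper by a constraint.

```latex
\begin{aligned}
\frac{d}{dt}I(t)
 &= \int_{\mathbb{R}^2}\left[\frac{2f_{1,0}\,\partial_t f_{1,0} + 2f_{0,1}\,\partial_t f_{0,1}}{f}
   - \frac{(f_{1,0}^2+f_{0,1}^2)\,\partial_t f}{f^2}\right]dx_1dx_2 \\
 &= \int_{\mathbb{R}^2}\frac{f^2\left[f_{1,0}(f_{3,0}+f_{1,2}) + f_{0,1}(f_{2,1}+f_{0,3})\right]
   - \tfrac12 f\,(f_{1,0}^2+f_{0,1}^2)(f_{2,0}+f_{0,2})}{f^3}\,dx_1dx_2 .
\end{aligned}
```

Every monomial in the numerator has four factors and total order four, which is exactly the fourth-order differential form property claimed for $I_2$.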
Inspired by the Cauchy-Schwarz inequality, we obtain the following inequality, which is used in the proof of Lemma 9.

Lemma 4. For functions f
Proof. Using the Cauchy-Schwarz inequality, we obtain a pointwise bound on the integrand. Using the Cauchy-Schwarz inequality in integral form, we obtain a bound on the integral. Combining these two inequalities, we prove the lemma.

Constraints
The density function $f$ and its derivatives satisfy certain integral equations, from which the constraints of the SDP problems to be solved are obtained. For this reason, these integral equations are called constraints. Precisely, a $2m$-th order differential form $R$ is called a $2m$-th order constraint if
$$\int_{\mathbb{R}^2}\frac{R}{f^{2m-1}}\,dx_1dx_2 = 0.$$
It is easy to see that the equations in (9) are still valid if $I_k$ is replaced by $I_k + C_k$, where $C_k$ is a $2k$-th order constraint. Guo, Yuan, and Gao [13] proposed a method to compute the constraints, which will be used here to compute the constraints in dimension two. In the following, we show how to compute the $2m$-th order constraints.
Lemma 5 guarantees that, when using integration by parts, the integral term of lower dimension vanishes. The following lemma shows how to generate constraints. We repeat the proof here, because the proof procedure will be used in the proof of Lemma 7.
Lemma 6. Let $M$ be a differential monomial with total order $2m - 1$. Then we can use integration by parts to obtain a $2m$-th order constraint from $M$.
Proof. Let $x_a$ be one of the variables $x_1, x_2$, and $x_b$ be the other variable. Integrating by parts with respect to $x_a$, the lower-dimensional integral term vanishes by Lemma 5, and expanding the remaining derivative yields a $2m$-th order differential form $R$ whose integral against $f^{-(2m-1)}$ is zero. Thus $R$ is a $2m$-th order constraint and the lemma is proven.
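One way to mechanize this step is sketched below. It assumes that $M$ has total degree $d = 2m - 1$ and that the constraint is the numerator $R$ in $\partial_{x_a}(M/f^{d-1}) = R/f^{d}$, that is, $R = f\,\partial_{x_a}M - (d-1)\,(\partial_{x_a}f)\,M$. This normalization and the helper names are assumptions made for illustration; the constraints in the paper may be scaled or combined differently.

```python
from collections import defaultdict
from fractions import Fraction

F = (0, 0)  # index pair of the symbol f itself

def mono(*factors):
    """Canonical form of a differential monomial: a sorted tuple of (a, b) pairs."""
    return tuple(sorted(factors))

def shift(index, var):
    """Differentiate the symbol f_{a,b} once in x_1 (var=0) or x_2 (var=1)."""
    a, b = index
    return (a + 1, b) if var == 0 else (a, b + 1)

def derivative(monomial, var):
    """Product rule for d/dx_var of a monomial, as {monomial: coefficient}."""
    result = defaultdict(Fraction)
    for k, factor in enumerate(monomial):
        rest = monomial[:k] + monomial[k + 1:]
        result[mono(shift(factor, var), *rest)] += 1
    return dict(result)

def constraint_from(monomial, var):
    """Numerator R of d/dx_var (M / f^(d-1)) = R / f^d, with d = deg M.

    R = f * dM/dx_var - (d - 1) * f_var * M. Its integral against f^(-d)
    vanishes whenever the boundary term of the integration by parts vanishes.
    """
    d = len(monomial)
    result = defaultdict(Fraction)
    for m, c in derivative(monomial, var).items():
        result[mono(F, *m)] += c
    result[mono(shift(F, var), *monomial)] -= d - 1
    return {m: c for m, c in result.items() if c != 0}

if __name__ == "__main__":
    M = mono(F, (0, 1), (2, 0))           # the monomial f f_{0,1} f_{2,0}
    for var in (0, 1):
        print(constraint_from(M, var))
```

For $M = f f_{0,1} f_{2,0}$ and $x_a = x_1$ the sketch returns $f^2 f_{1,1} f_{2,0} + f^2 f_{0,1} f_{3,0} - f f_{1,0} f_{0,1} f_{2,0}$, a form of total degree 4 and total order 4.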

Proof of Theorem 1
The proof of Theorem 1 mainly consists of two steps. The first step, summarized in Lemma 10, is used to reduce the proof of Theorem 1 to the proof of the non-negativeness for a quadratic form with undetermined coefficients. This step is given in Sections 3.2-3.4. The reduction has three main ingredients: (1) Constraints given in Lemma 8 are used to form the SOS and Lemmas 5 and 6 show how to compute the constraints. (2) Lemma 7 is used to reduce all involved quantities into quadratic forms in certain variables. (3) By introducing J 3 in Lemma 9 and using the Cauchy-Schwarz inequality in Lemma 4, the quantity R 2 is relaxed to a simple form.
The second step, given in Section 3.5, is to compute the undetermined coefficients of the quadratic form using SDP, which is summarized as Problem 1. This step has two sub-steps: (1) In Problem 2, the undetermined coefficients $\alpha_i$ and $\beta_j$ are computed by omitting the second-degree terms. (2) In Problem 5, the undetermined coefficients $\lambda_k$ are computed using the values of $\alpha_i$ and $\beta_j$ obtained in the first sub-step. In these two sub-steps, the quadratic forms are linear in the undetermined coefficients, which can therefore be computed with SDP; the computation procedure is given in Problems 3 and 4.

An Illustrative Example
In this subsection, we prove Theorem 1 for $n = 1$ and use this as an illustration of our proof method.
By Lemma 1, it suffices to prove (5). For convenience, we write $f(x,t)$ as $f$ and $\frac{\partial^a}{\partial x^a}f(x,t)$ as $f_a$. Using Lemma 6, we can obtain the constraints $\hat{E}_i$, $i = 1, \dots, 6$. By Lemma 3, inequality (5) is a consequence of the SOS in Equation (17); by (16) and (17), Theorem 1 for the case $n = 1$ is proven. Equation (17) can be obtained in two steps. In the first step, we compute $\alpha$ by requiring an expression that is linear in $\alpha$ to be non-negative under the constraints, which can be solved by SDP. Suppose that the solution for $\alpha$ is $\alpha_0$.
In the second step, we check whether $2E_2 - E_1(\alpha_0)^2 \ge 0$ is valid under the constraints using SDP, and the SOS in (17) can be found. Details of the proof procedure are given in the rest of this section.

Compute Constraints
In this section, we compute the fourth-order and sixth-order constraints using Lemma 6. For instance, from the differential monomial $M = f f_{0,1} f_{2,0}$ with total order 3, we obtain two fourth-order constraints, one for each of the variables $x_1$ and $x_2$. By considering all differential monomials with total order 3 and total degree 3, we obtain 20 constraints. Some of the constraints are not divisible by $f_{0,1}$ or $f_{1,0}$; these are not needed in the proof due to the form of $I_2$ in Equation (11). Finally, we obtain eight fourth-order constraints $f_{1,0}P_i$ $(1 \le i \le 4)$ and $f_{0,1}Q_i$ $(1 \le i \le 4)$. Similarly, we obtain 136 sixth-order constraints $R_j$ $(1 \le j \le 136)$. In summary, we obtain the constraints $f_{1,0}P_i$, $f_{0,1}Q_i$ $(1 \le i \le 4)$, and $R_j$ $(1 \le j \le 136)$.

Reduce to Quadratic Form
In order to obtain an SDP problem with a smaller size, we reduce all differential polynomials in the proof into quadratic forms in a set of new variables $\mathcal{M} = \{M_i : 1 \le i \le 14\}$, which are all the differential monomials with total order 3 and total degree 3 (Equation (20)). We rewrite $F_{1,0}$, $F_{0,1}$ in Equations (12) and (18) as linear forms in $\mathcal{M}$ (Equation (21)). The following lemma shows that any sixth-order constraint can be reduced to another sixth-order constraint which can be written as a quadratic form in $\mathcal{M}$.
Lemma 7. For any differential monomial $M$ with total order 6 and total degree 6, we can compute a sixth-order differential form $P$ such that
$$\int_{\mathbb{R}^2}\frac{M}{f^5}\,dx_1dx_2 = \int_{\mathbb{R}^2}\frac{P}{f^5}\,dx_1dx_2$$
and $P$ is a quadratic form in the variables $\mathcal{M}$ of Equation (20).
Proof. Since $M$ is a differential monomial with total degree 6 and total order 6, let $M = \prod_{i=1}^{6} f_{a_i,b_i}$, $c_i = a_i + b_i$, and $c_s \ge c_k$ for $s \le k$. We call $(c_1, \dots, c_6)$ the order type and $c_1$ the leading order of $M$.
If $c_1 \ge 4$, similar to the proof of Lemma 6, we can use integration by parts to obtain a new polynomial $P_1$ with leading order $c_1 - 1$ such that
$$\int_{\mathbb{R}^2}\frac{M}{f^5}\,dx_1dx_2 = \int_{\mathbb{R}^2}\frac{P_1}{f^5}\,dx_1dx_2, \qquad (22)$$
where we assume $a_1 \ge 1$, without loss of generality.
It is easy to see that $P_1$ is a sixth-order differential form. Since $c_1 \ge 4$, we have $c_i \le 2$ for $i = 2, \dots, 6$, and hence the leading orders of all monomials of $P_1$ are at most $c_1 - 1$. If the leading order of a monomial of $P_1$ is still at least 4, we can repeat procedure (22) for that monomial until the leading orders of all monomials of $P_1$ are at most 3.
All monomials with the above order types can be written as $M_iM_j$ for certain $M_i$, $M_j$ in Equation (20). For instance, the monomial $f^4 f_{3,0} f_{2,1}$ has order type $(3, 3, 0, 0, 0, 0)$ and can be written as $M_1M_2$. Thus, $P$ is a quadratic form in the variables $\mathcal{M}$. The lemma is proven.
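The role of order types can be explored with a short enumeration. The sketch below assumes that a monomial of total degree 6 and total order 6 factors as a product $M_iM_j$ exactly when its six orders can be split into two triples that each sum to 3. The function names are hypothetical and the enumeration is an illustration, not a reproduction of the paper's case analysis.

```python
from itertools import combinations

def order_types(total=6, parts=6, max_part=3):
    """Non-increasing tuples (c_1, ..., c_parts) of non-negative integers
    with sum `total` and every entry at most `max_part`."""
    def rec(remaining, slots, cap):
        if slots == 0:
            if remaining == 0:
                yield ()
            return
        for c in range(min(cap, remaining), -1, -1):
            for tail in rec(remaining - c, slots - 1, c):
                yield (c,) + tail
    return list(rec(total, parts, max_part))

def splits_into_two_cubic_factors(order_type):
    """True if the six orders can be split into two triples summing to 3 each.

    If one triple sums to 3, the complementary triple does too, since the
    total is 6.
    """
    positions = range(len(order_type))
    return any(sum(order_type[i] for i in triple) == 3
               for triple in combinations(positions, 3))

if __name__ == "__main__":
    for t in order_types():
        print(t, splits_into_two_cubic_factors(t))
```

Under this assumption, the only order type with leading order at most 3 that does not split directly is $(2, 2, 2, 0, 0, 0)$, so monomials of that type would need one more integration by parts before they can be expressed in the variables $M_i$.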
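Applying Lemma 7 to each monomial of $I_3$ in Equation (13), we obtain a quadratic form $\widehat{I}_3$ in $\mathcal{M}$ which satisfies
$$\int_{\mathbb{R}^2}\frac{I_3}{f^5}\,dx_1dx_2 = \int_{\mathbb{R}^2}\frac{\widehat{I}_3}{f^5}\,dx_1dx_2. \qquad (23)$$
Applying Lemma 7 to all monomials of $R_j$ $(1 \le j \le 136)$, we obtain $\widetilde{R}_j$, which are quadratic forms in $\mathcal{M}$. Applying Gaussian elimination to $\widetilde{R}_j$ $(1 \le j \le 136)$ to eliminate the linearly dependent ones, we obtain 48 constraints $\widehat{R}_j$ $(1 \le j \le 48)$, which are given in Appendix B.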
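The elimination of linearly dependent constraints can be done exactly over $\mathbb{Q}$. The sketch below assumes each constraint is stored as a coefficient vector with respect to a fixed list of products $M_iM_j$; the function name and the toy data are hypothetical.

```python
from fractions import Fraction

def independent_rows(rows):
    """Indices of a maximal linearly independent subset of `rows`,
    computed by exact Gaussian elimination over the rationals."""
    basis = []   # list of (pivot_column, reduced_row)
    kept = []
    for idx, row in enumerate(rows):
        r = [Fraction(x) for x in row]
        # Eliminate the pivot columns of the rows already kept.
        for pivot, b in basis:
            if r[pivot] != 0:
                factor = r[pivot] / b[pivot]
                r = [x - factor * y for x, y in zip(r, b)]
        pivot = next((k for k, x in enumerate(r) if x != 0), None)
        if pivot is not None:          # r is not in the span of the kept rows
            basis.append((pivot, r))
            kept.append(idx)
    return kept

if __name__ == "__main__":
    toy_constraints = [
        [1, 2, 3],
        [2, 4, 6],   # twice the first row
        [0, 1, 1],
        [1, 3, 4],   # sum of the first and third rows
    ]
    print(independent_rows(toy_constraints))
```

Exact rational arithmetic avoids the rank misjudgments that floating-point elimination can cause when many constraints are nearly dependent.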
The variables in $\mathcal{M}$ satisfy certain relations, such as $M_5M_8 = f^2 f_{2,0} f_{1,1} f_{1,0} f_{0,1} = M_6M_7$, which are called intrinsic constraints. We have 15 intrinsic constraints $\widehat{R}_i$ $(49 \le i \le 63)$. In total, we have 63 sixth-order constraints which are quadratic forms in $\mathcal{M}$ (Equation (25)), where $\widehat{R}_i$ $(i = 1, \dots, 48)$ are given in Appendix B.
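The count of intrinsic constraints can be checked by grouping all products of two basis monomials by the resulting monomial of degree 6. The sketch below rebuilds the 14 monomials of total degree 3 and total order 3 as tuples of index pairs $(a, b)$; this representation, the function names, and the ordering of the basis are assumptions and need not match the paper's numbering $M_1, \dots, M_{14}$.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

def cubic_basis():
    """Differential monomials of total degree 3 and total order 3 in two variables."""
    symbols = [(a, k - a) for k in range(4) for a in range(k, -1, -1)]
    return [m for m in combinations_with_replacement(symbols, 3)
            if sum(a + b for a, b in m) == 3]

def intrinsic_relations(basis):
    """Map each product monomial with several factorizations M_i * M_j
    to the list of index pairs (i, j), i <= j, that produce it."""
    groups = defaultdict(list)
    for i, j in combinations_with_replacement(range(len(basis)), 2):
        product = tuple(sorted(basis[i] + basis[j]))
        groups[product].append((i, j))
    return {p: pairs for p, pairs in groups.items() if len(pairs) > 1}

if __name__ == "__main__":
    basis = cubic_basis()
    relations = intrinsic_relations(basis)
    # A product with k factorizations yields k - 1 independent relations.
    print(sum(len(pairs) - 1 for pairs in relations.values()))
```

There are 105 unordered pairs of basis monomials and 90 distinct products, which leaves 15 independent relations, in agreement with the number of intrinsic constraints stated above.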
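The following lemma summarizes all the constraints needed in the proof.

Lemma 8. From Equations (19), (21) and (25), we obtain the following fourth-order constraints and sixth-order constraints:
$$\int_{\mathbb{R}^2}\frac{f_{1,0}P_i}{f^3}\,dx_1dx_2 = 0, \quad \int_{\mathbb{R}^2}\frac{f_{0,1}Q_i}{f^3}\,dx_1dx_2 = 0 \ (1 \le i \le 4), \qquad \int_{\mathbb{R}^2}\frac{\widehat{R}_j}{f^5}\,dx_1dx_2 = 0 \ (1 \le j \le 63),$$
where $\widehat{R}_j$ are quadratic forms in $\mathcal{M}$ and $P_i$, $Q_i$ are linear forms in $\mathcal{M}$.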

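Proof. We need only to consider the equalities for $\widehat{R}_j$ $(1 \le j \le 48)$. $\widetilde{R}_i$ is obtained from $R_i$ by applying Lemma 7 to each monomial of $R_i$. Then by Equation (19) and Lemma 7, we have $\int_{\mathbb{R}^2}\widetilde{R}_j f^{-5}\,dx_1dx_2 = \int_{\mathbb{R}^2}R_j f^{-5}\,dx_1dx_2 = 0$, $j = 1, \dots, 136$. The $\widehat{R}_j$ are obtained from $\widetilde{R}_j$ $(1 \le j \le 136)$ by Gaussian elimination, so the $\widehat{R}_j$ are linear combinations of the $\widetilde{R}_j$ over $\mathbb{Q}$. Thus $\int_{\mathbb{R}^2}\widehat{R}_j f^{-5}\,dx_1dx_2 = 0$, $j = 1, \dots, 48$. The lemma is proven.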

Reduction to Semidefinite Positiveness of a Quadratic Form
In this section, we give a $\Theta$, which is a quadratic form in $\mathcal{M}$, such that Theorem 1 is true if $\Theta \ge 0$, that is, $\Theta$ is a positive semidefinite polynomial when the $f_{a,b}$ are treated as independent variables.
In the following key lemma, we introduce $J_3$ in order to generate a common factor

Proof. J 3 is clearly a quadratic form in M. From Equations
The lemma is proven.
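In Lemma 10, the proof of Theorem 1 is finally reduced to the proof of an inequality for a quadratic form with undetermined coefficients.

Lemma 10. Let $\widehat{I}_3$ be defined in Equation (23) and $J_3$ be defined in Equation (27). Then Theorem 1 is true if there exist $\alpha_i, \beta_i, \gamma_j \in \mathbb{R}$ such that the quadratic form $\Theta$ in Equation (29) satisfies $\Theta \ge 0$,
where $\Theta$ is a quadratic form in $\mathcal{M}$.
Proof. $\Theta$ is clearly a quadratic form in $\mathcal{M}$, since $\widehat{I}_3$ and $J_3$ are. By Lemma 3, we have $2I(t)\frac{d^2}{dt^2}I(t) - \left(\frac{d}{dt}I(t)\right)^2 \ge 0$, and Theorem 1 follows from Lemma 1.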

Prove Theorem 1 by Solving an SDP Problem
In this section, we give a $\Theta$ in Equation (29) satisfying $\Theta \ge 0$, hence proving Theorem 1. By Lemma 10, in order to prove Theorem 1, it suffices to solve the following problem.
where $\widehat{I}_3$ is defined in Equation (23); $\widehat{R}_j$ are defined in Equation (25); and $F_{1,0}$, $F_{0,1}$, $P_i$, $Q_i$ are defined in Equation (21).
It is impossible to compute $\alpha_i$, $\beta_i$, $\gamma_j$ in Problem 1 with SDP directly, since $\Theta$ is not linear in $\alpha_i$, $\beta_i$. We use the following strategy to solve Problem 1:

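S1
Expanding the squares in $\Theta$ and omitting the terms of second degree in $\alpha_i$, $\beta_i$, we obtain Problem 2, which is weaker than Problem 1.

S2
Since $\Theta$ in Problem 2 is linear in $\alpha_i$, $\beta_i$, $\gamma_j$, we can use SDP to solve Problem 2; let $\hat{\alpha}_i$, $\hat{\beta}_i$ be the solution obtained for $\alpha_i$, $\beta_i$.

S3
Let $\Theta_1$ be obtained from $\Theta$ by substituting $\alpha_i$, $\beta_i$ with $\hat{\alpha}_i$, $\hat{\beta}_i$. Then $\Theta_1$ is linear in $\gamma_j$ and we can use SDP to compute $\gamma_j$ such that $\Theta_1 \ge 0$. Under this condition, Problem 1 becomes Problem 5, and it suffices to solve Problem 5 in order to prove Theorem 1.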
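The reason the relaxed problem is weaker can be seen from a schematic expansion. The sketch below assumes, for illustration only, that $\Theta$ contains a single squared term and has the form $A - (F + \sum_i \alpha_i P_i)^2 + \sum_j \gamma_j R_j$, where $A$, $F$, $P_i$, $R_j$ are fixed quadratic or linear forms; the actual $\Theta$ has two such squares, which are treated in the same way.

```latex
\begin{aligned}
\Theta(\alpha,\gamma)
 &= A - \Big(F + \sum_i \alpha_i P_i\Big)^2 + \sum_j \gamma_j R_j \\
 &= \underbrace{A - F^2 - 2F\sum_i \alpha_i P_i + \sum_j \gamma_j R_j}_{\Theta_{\mathrm{lin}}(\alpha,\gamma)}
   \;-\; \Big(\sum_i \alpha_i P_i\Big)^2 .
\end{aligned}
```

Since $\Theta_{\mathrm{lin}} = \Theta + (\sum_i \alpha_i P_i)^2 \ge \Theta$, any coefficients with $\Theta \ge 0$ also give $\Theta_{\mathrm{lin}} \ge 0$, but not conversely. The relaxed problem is linear in all unknowns and hence an SDP, and its solution only supplies candidate values for $\alpha_i$, which is why non-negativity of the full form has to be re-checked with those values fixed.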
where $\widehat{I}_3$ is defined in Equation (23); $\widehat{R}_j$ are defined in Equation (25); and $F_{1,0}$, $F_{0,1}$, $P_i$, $Q_i$ are defined in Equation (21).
Since $\Theta$ is a quadratic form in $\mathcal{M}$, it is well known that $\Theta \ge 0$ is equivalent to the symmetric matrix $\mathbf{\Theta} \in \mathbb{R}^{14\times 14}$ of $\Theta$ being positive semidefinite, that is, $\mathbf{\Theta} \succeq 0$ [19]. In other words, Problem 2 is equivalent to the following SDP problem [19].
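The correspondence between a quadratic form and its symmetric matrix can be sketched as follows. The code assumes the form is given as a dictionary of coefficients of the products $x_ix_j$ with $i \le j$; the names `gram_matrix` and `evaluate` are hypothetical, and the two-variable example is a toy case rather than the $14 \times 14$ matrix of the proof.

```python
from fractions import Fraction

def gram_matrix(coeffs, n):
    """Symmetric matrix Q with x^T Q x = sum of coeffs[(i, j)] * x_i * x_j.

    Off-diagonal coefficients are split evenly between Q[i][j] and Q[j][i].
    """
    q = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        c = Fraction(c)
        if i == j:
            q[i][i] += c
        else:
            q[i][j] += c / 2
            q[j][i] += c / 2
    return q

def evaluate(q, x):
    """Value of the quadratic form x^T Q x."""
    n = len(q)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

if __name__ == "__main__":
    # Toy form 2*x0^2 - 2*x0*x1 + 2*x1^2.
    form = {(0, 0): 2, (0, 1): -2, (1, 1): 2}
    q = gram_matrix(form, 2)
    print(q)
    print(evaluate(q, [3, -1]))
```

Because the matrix entries depend linearly on the coefficients of the form, a form that is linear in the unknown multipliers leads to a linear matrix inequality, which is the shape of problem an SDP solver accepts.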
where $\mathbf{Q} \in \mathbb{R}^{n\times n}$ is the corresponding symmetric matrix for any quadratic form $Q$ in $\mathcal{M}$ and $n = |\mathcal{M}| = 14$.
We set the objective function to be 1, which means that it suffices to satisfy the constraints. We actually solve the following dual problem [19] of Problem 3, where $\mathcal{I} := 2\widehat{I}_3 - F_{1,0}^2 - F_{0,1}^2$, $X \in \mathbb{R}^{n\times n}$, and $n = |\mathcal{M}| = 14$.

Remark 1.
If differential forms are not used to reduce the polynomials into quadratic forms in $\mathcal{M}$, then we need to consider all differential monomials with total degree 3 and total order $\le 6$ as the basis for the SDP Problem 4. In such a case, $n = 100$ instead of $n = 14$, and we need to solve a much larger SDP problem for $X \in \mathbb{R}^{n\times n}$.
We use the CVX package in Matlab [26] to solve Problem 4. The program is given in Appendix A. Our complete code and data are available (accessed on 30 November 2022) at https://github.com/liujunliang19/sqrt-convex.
With CVX, we obtain a set of solutions for $\gamma_j$, $\alpha_i$, $\beta_i$, which are given in Appendix C. From the above discussions, we see that these values are also solutions to Problem 2.
Finally, according to step S3 just above Problem 2, we put the solutions for $\alpha_i$, $\beta_i$ back into $\Theta$ in Problem 1 and obtain the following problem.
where $\widehat{I}_3$ is defined in Equation (23); $\widehat{R}_j$ are defined in Equation (25); and $F_{1,0}$, $F_{0,1}$, $P_i$, $Q_i$ are defined in Equation (21).
Similar to Problems 3 and 4, we obtain a set of solutions for $\lambda_j$, which are given in Appendix D. Now $\Theta_1$ is a positive semidefinite quadratic form, and it is well known that such a form can be written as an SOS. The value of $\Theta_1$ as well as its SOS representation are given in Appendix E. Hence, we solve Problem 1 and therefore prove Theorem 1.
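How a positive semidefinite matrix with rational entries yields an explicit SOS can be sketched with an exact $LDL^T$-style elimination. This is a generic illustration under the assumption that the symmetric matrix of the form is available with rational entries; the function name `ldl_sos` is hypothetical and the sketch is not the procedure used to produce Appendix E.

```python
from fractions import Fraction

def ldl_sos(q):
    """Exact decomposition x^T q x = sum_k d_k * (l_k . x)^2 for symmetric q.

    Returns a list of (d_k, l_k) with d_k > 0, or None if a negative pivot
    or an inconsistent zero pivot shows that q is not positive semidefinite.
    """
    n = len(q)
    a = [[Fraction(x) for x in row] for row in q]
    terms = []
    for k in range(n):
        d = a[k][k]
        if d < 0:
            return None
        if d == 0:
            # A PSD matrix with a zero diagonal entry has a zero row there.
            if any(a[k][j] != 0 for j in range(k + 1, n)):
                return None
            continue
        l = [Fraction(0)] * k + [a[k][j] / d for j in range(k, n)]
        terms.append((d, l))
        # Replace the trailing block by its Schur complement.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] -= d * l[i] * l[j]
    return terms

if __name__ == "__main__":
    q = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
    for d, l in ldl_sos(q):
        print(d, l)
```

Each returned pair contributes one square $d_k(l_k \cdot x)^2$ with a positive rational weight, so the certificate can be verified by expanding the squares in exact arithmetic without rerunning any solver.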

Remark 2.
Note that the SOS given in Appendix E provides an explicit and direct proof for Theorem 1 and the solution procedure for the SDP is not needed, similar to Equation (17) for the case of n = 1. Of course, the SOS in Appendix E is quite large and difficult to check manually. In order for interested readers to check the proof with a mathematical software system, we also give the complete code and data in https://github.com/liujunliang19/sqrt-convex (accessed on 30 November 2022). The SOS expression for H 1 is in the bottom of our Maple code named sqrt-convex2.mw, which can be run directly.

Remark 3.
We also tried to use the above approach to prove the log-convexity of the Fisher information along heat flow for $n = 2$. The CVX program reports failure. Thus, we cannot prove the log-convexity with the above approach. We also cannot say that the log-convexity is false, since the log-convexity is not equivalent to Problem 3.

Remark 4.
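Theorem 1 is stronger than the CMC for the third-order derivative in dimension two. In other words, given Theorem 1, we can obtain $\frac{d^3}{dt^3}H(X_t) \ge 0$ $(n = 2)$. Using Lemma 1, $I(t)\frac{d^2}{dt^2}I(t) \ge \frac{1}{2}\left(\frac{d}{dt}I(t)\right)^2 \ge 0$; since $I(t) > 0$, this gives $\frac{d^2}{dt^2}I(t) \ge 0$, and by de Bruijn's identity (3), $\frac{d^3}{dt^3}H(X_t) = \frac{1}{2}\frac{d^2}{dt^2}I(t) \ge 0$.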

Conclusions
In this paper, we prove the sqrt-convexity of Fisher information along heat flow in dimension two. It is easy to see that this conclusion is weaker than the log-convexity conjecture. However, it is stronger than the CMC for the third-order derivative in dimension two.
The proof is based on the SDP method. In order to reduce the size of the SDP problem, we prove that any sixth-order differential form can be reduced to an "equivalent" differential polynomial which is a quadratic form in certain new variables. Based on this fact, we reduce the sixth-order differential forms into quadratic forms in a set of new variables, which reduces the size of the SDP problem significantly.
As possible future research directions, it is interesting to prove the sqrt-convexity for higher dimensions ($n \ge 3$) using the method given in this paper. In this case, the main difficulty is to establish inequality (27) in higher dimensions. Another question is to prove the log-convexity by introducing more constraints, or by finding new methods to solve Problem 1 without the relaxation used in Problem 2. The methods introduced in this paper may be used to prove other EPI-type inequalities related to the heat equation.