Entropy Dissipation for Degenerate Stochastic Differential Equations via Sub-Riemannian Density Manifold

Feng, Qi; Li, Wuchen

doi:10.3390/e25050786


by Qi Feng ^{1,*} and Wuchen Li ^{2}

^{1} Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA

^{2} Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA

^{*} Author to whom correspondence should be addressed.

Entropy 2023, 25(5), 786; https://doi.org/10.3390/e25050786

Submission received: 29 March 2023 / Revised: 24 April 2023 / Accepted: 9 May 2023 / Published: 11 May 2023

(This article belongs to the Special Issue Information Geometry and Its Applications)


Abstract

We studied the dynamical behaviors of degenerate stochastic differential equations (SDEs). We selected an auxiliary Fisher information functional as the Lyapunov functional. Using generalized Fisher information, we conducted the Lyapunov exponential convergence analysis of degenerate SDEs. We derived the convergence rate condition by generalized Gamma calculus. Examples of the generalized Bochner’s formula are provided in the Heisenberg group, displacement group, and Martinet sub-Riemannian structure. We show that the generalized Bochner’s formula follows a generalized second-order calculus of Kullback–Leibler divergence in density space embedded with a sub-Riemannian-type optimal transport metric.

Keywords:

degenerate drift–diffusion process; Lyapunov methods; auxiliary Fisher information; sub-Riemannian density manifold; generalized Bochner’s formula

MSC:

53C17; 60D05; 58B20

1. Introduction

Consider the following Stratonovich stochastic differential equation:

d X_{t} = b (X_{t}) d t + \sqrt{2} a (X_{t}) \circ d B_{t},

(1)

where

(B_{t}^{1}, B_{t}^{2}, \dots, B_{t}^{n})

is an n-dimensional Brownian motion in

R^{n}

,

a : R^{n + m} \to R^{(n + m) \times n}

is a matrix-valued function, and

b : R^{n + m} \to R^{n + m}

is a drift vector field. The convergence analysis of SDE (1) to its invariant distribution lies at the intersection of differential geometry, analysis, Lie groups (including subgroups arising in quantum mechanics), and probability. It also has broad applications in designing fast algorithms in artificial intelligence (AI) and in Bayesian sampling and optimization problems. One key question arises: how fast does the probability density function of SDE (1) converge to its invariant distribution?

The Gamma calculus, also named Bakry–Émery iterative calculus [1], provides an analytical approach to deriving the convergence rate for SDE (1). The rate is given by a lower bound known as the Ricci curvature lower bound. However, classical studies are limited to a non-degenerate diffusion coefficient matrix a. The classical Gamma calculus is no longer valid when a is a degenerate matrix function; see the generalization of Bakry–Émery calculus in [2].

This paper presents a Lyapunov convergence analysis for the degenerate diffusion process. We selected a class of z-Fisher information as the Lyapunov functional, where z is a matrix function different from matrix a. We derived a generalized Gamma calculus by the dissipation of the Lyapunov functional along the diffusion process. We then derived the generalized Bochner’s formula and obtained the exponential convergence condition. Several concrete examples are presented: gradient-drift–diffusions on the Heisenberg group, the displacement group, and the Martinet sub-Riemannian structure. Our approach extends the classical optimal transport geometry, in particular the second-order calculus of the relative entropy in the density manifold studied in [3,4,5,6].

The generalized Gamma calculus was first introduced by Baudoin–Garofalo [2] for sub-Riemannian manifolds. Related results were studied later in [7,8,9,10,11,12,13,14,15]. The commutative property of the iteration of

Γ_{1}

and

Γ_{1}^{z}

(Hypothesis 1.2 in [2]) was crucial in the previous works. Our algebraic Condition 1 does not have this requirement: we can remove this commutative condition in the weak sense. Thus, our results go beyond the step-two bracket-generating condition. We present algebraic conditions for the existence of the generalized Bochner’s formula.

On the other hand, optimal transport on sub-Riemannian manifolds was studied in [16,17,18,19]. An optimal transport metric on a sub-Riemannian manifold was proposed in [18,19]. In this case, the density manifold still forms an infinite-dimensional Riemannian manifold. The Monge–Ampère equation in sub-Riemannian settings was studied in [17]. Our approach is different. We introduce the sub-Riemannian density manifold (SDM) and study the second-order geometric calculations of relative entropies in the SDM. Using these calculations, we propose a new Gamma z calculus for degenerate stochastic differential equations and establish a generalized curvature-dimension-type bound. In addition, Refs. [20,21] used the analytical properties of optimal transport to formulate the Ricci curvature lower bound in general metric spaces. Different from [20,21,22], we focus on the geometric calculations in the density manifold introduced by the z direction. Following the second-order geometric calculations in the density manifold, we formulate the new Gamma calculus and the corresponding Ricci curvature tensor for the sub-Riemannian manifold. Our derivation also relates to the entropy methods [23,24]. Using entropy methods, Refs. [25,26] derived the convergence rate for degenerate drift–diffusion processes with constant diffusion coefficients a. Compared to previous works, we apply the entropy method together with Gamma calculus and geometric calculations in the density manifold. This yields a generalized Gamma calculus from the dissipation of the auxiliary Fisher information. Several concrete examples of convergence conditions are derived for Lie-group-induced drift–diffusion processes.

We organize the paper as follows. We introduce the main result in Section 2. It is an explicit convergence rate condition for the density of degenerate SDEs in the

L^{1}

distance. In Section 3, we provide three examples of the proposed convergence analysis, including gradient-drift–diffusions on the Heisenberg group, the displacement group, and the Martinet sub-Riemannian structure. In Section 4, we present the Lyapunov analysis in the sub-Riemannian density manifold. The generalized Gamma calculus and the proof of the generalized Bochner’s formula are presented in Section 5. Further discussions of other functional inequalities are presented in Section 6.

2. Main Results

In this section, we present this paper’s setting and main results.

2.1. Setting

Consider a Stratonovich SDE:

\begin{matrix} d X_{t} = b (X_{t}) d t + \sqrt{2} a (X_{t}) \circ d B_{t}, \end{matrix}

(2)

where

(B_{t}^{1}, B_{t}^{2}, \dots, B_{t}^{n})

is an n-dimensional Brownian motion in

R^{n}

,

a : R^{n + m} \to R^{(n + m) \times n}

is a matrix-valued function, and

b : R^{n + m} \to R^{n + m}

is a vector field. We refer to [27] (Section 3.13) for the definition of the Stratonovich SDE. According to [28] (Appendix A.7), the SDE (2) can also be written as the following Itô SDE:

d X_{l, t} = {\tilde{b}}_{l} (X_{t}) d t + \sum_{i = 1}^{n} \sqrt{2} a_{l i} (X_{t}) d B_{t}^{i}, for l = 1, \dots, n + m,

(3)

where

{\tilde{b}}_{l} = b_{l} + (\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i})_{l}, for l = 1, \dots, n + m .

(4)

We denote

{a_{1}, \dots, a_{n}}

as the column vectors of matrix a, and

\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i}

\in R^{n + m}

represents

(\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i})_{l} = \sum_{i = 1}^{n} \sum_{k = 1}^{n + m} a_{k i} \frac{\partial a_{l i}}{\partial x_{k}}, for l = 1, \dots, n + m .

(5)
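The correction term in Equation (5) can be checked numerically. The following is a minimal sketch, not the authors' code: it approximates the partial derivatives with central finite differences, and the function names (`heisenberg_a`, `ito_correction`) and the step size `h` are illustrative assumptions. The matrix used is the Heisenberg example from Section 3.1, for which the correction turns out to vanish.

```python
# Sketch of the Stratonovich-to-Ito correction in Equation (5):
#   (sum_i nabla_{a_i} a_i)_l = sum_i sum_k a_{ki} * d a_{li} / d x_k,
# with derivatives approximated by central finite differences.
# All names here are hypothetical helpers, not part of the paper.

def heisenberg_a(x):
    """(n + m) x n matrix a(x) with n = 2, m = 1 (Heisenberg example)."""
    return [[1.0, 0.0],
            [0.0, 1.0],
            [-x[1] / 2.0, x[0] / 2.0]]

def ito_correction(a, x, h=1e-5):
    """Return the vector sum_i nabla_{a_i} a_i evaluated at the point x."""
    dim = len(x)          # n + m
    a0 = a(x)
    n = len(a0[0])        # number of Brownian components
    corr = [0.0] * dim
    for k in range(dim):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        ap, am = a(xp), a(xm)
        for i in range(n):
            for l in range(dim):
                d_ali_dxk = (ap[l][i] - am[l][i]) / (2.0 * h)
                corr[l] += a0[k][i] * d_ali_dxk
    return corr
```

For the Heisenberg matrix, each column is constant along its own direction, so `ito_correction(heisenberg_a, x)` is zero at every point and the Stratonovich and Itô drifts coincide. For the scalar example a(x) = x, the same routine returns approximately x, the familiar extra Itô drift of dX = \sqrt{2} X \circ dB.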

We denote

a^{T}

as the transpose of matrix a and denote

{a_{1}^{T}, \dots, a_{n}^{T}}

as the row vectors of matrix

a^{T}

. In particular, we have

a_{\hat{i} i} = a_{i \hat{i}}^{T}

, for

i = 1, \dots, n

and

\hat{i} = 1, \dots, n + m

. With some abuse of notation, we also denote

a_{i}^{T}

as the vector fields corresponding to the row vectors

a_{i}^{T}

, for

i = 1, \dots, n

. We assumed that

{a_{1}^{T} (x), a_{2}^{T} (x), \dots, a_{n}^{T} (x)}

satisfies the strong Hörmander condition (or bracket-generating condition):

Span \{a_{1}^{T} (x), \dots, a_{n}^{T} (x), [a_{i_{1}}^{T}, \dots, [a_{i_{k - 1}}^{T}, a_{i_{k}}^{T}] \dots] (x), 1 \leq i_{1}, \dots, i_{k} \leq n, k \geq 2\} = R^{n + m},

where

[\cdot, \cdot]

represents the Lie bracket between two vector fields. The strong Hörmander condition means that the Lie algebra generated by the vector fields

{a_{1}^{T} (x), \dots, a_{n}^{T} (x)}

is of full rank at every point

x \in R^{n + m}

(see, e.g., [29] (Section 7.4)). This condition ensures the existence of a smooth probability density function of SDE (2); see the original proofs in [30,31]. For the simplicity of presentation, we assumed the probability density function is strictly positive. Indeed, the positivity of the density follows from the Hörmander condition [32]; for the more technical conditions to show the positivity by using Malliavin calculus, we refer to [33,34] (Theorem 1.4 with H = 1/2). Denote
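The bracket-generating condition can be illustrated on the Heisenberg vector fields used later in Section 3.1. The sketch below is an assumption-laden illustration: it uses the coordinate formula [X, Y] = (DY) X - (DX) Y with finite-difference Jacobians, and the helper names (`jacobian`, `lie_bracket`, `det3`, `spans_full_space`) are hypothetical.

```python
# Sketch: check that a_1^T, a_2^T and their Lie bracket span R^3
# for the Heisenberg fields. Helper names are illustrative only.

def field_X(x):
    return [1.0, 0.0, -x[1] / 2.0]

def field_Y(x):
    return [0.0, 1.0, x[0] / 2.0]

def jacobian(field, x, h=1e-5):
    """Finite-difference Jacobian J[l][k] = d field_l / d x_k."""
    dim = len(x)
    J = [[0.0] * dim for _ in range(dim)]
    for k in range(dim):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        fp, fm = field(xp), field(xm)
        for l in range(dim):
            J[l][k] = (fp[l] - fm[l]) / (2.0 * h)
    return J

def lie_bracket(X, Y, x):
    """[X, Y](x) = DY(x) X(x) - DX(x) Y(x)."""
    JX, JY = jacobian(X, x), jacobian(Y, x)
    vx, vy = X(x), Y(x)
    dim = len(x)
    return [sum(JY[l][k] * vx[k] - JX[l][k] * vy[k] for k in range(dim))
            for l in range(dim)]

def det3(r):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
            - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
            + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0]))

def spans_full_space(x, tol=1e-8):
    """True if X, Y, [X, Y] are linearly independent at x."""
    rows = [field_X(x), field_Y(x), lie_bracket(field_X, field_Y, x)]
    return abs(det3(rows)) > tol
```

Here the bracket equals \partial / \partial z at every point, so a single bracket (k = 2) already gives full rank, even though the two fields alone span only a two-dimensional subspace.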

X_{t} \sim ρ (t, x)

, where

ρ = ρ (t, x)

is the probability density function of SDE (2). The density function

ρ

satisfies the Fokker–Planck equation of SDE (2):

\partial_{t} ρ (t, x) = - \nabla_{x} \cdot (ρ (t, x) \tilde{b} (x)) + \sum_{i = 1}^{n + m} \sum_{j = 1}^{n + m} \frac{\partial^{2}}{\partial x_{i} \partial x_{j}} ({(a (x) a {(x)}^{T})}_{i j} ρ (t, x)),

(6)

with a smooth initial condition:

ρ_{0} (x) = ρ (0, x), \int_{R^{n + m}} ρ_{0} (x) d x = 1, ρ_{0} (x) > 0 .

In this paper, we assumed that SDE (2) has a unique invariant symmetric measure

μ

, where

d μ = π (x) d x

with

π \in C^{\infty} (R^{n + m})

. Here,

π

solves the equilibrium of Fokker–Planck Equation (6):

- \nabla_{x} \cdot (π (x) \tilde{b} (x)) + \sum_{i = 1}^{n + m} \sum_{j = 1}^{n + m} \frac{\partial^{2}}{\partial x_{i} \partial x_{j}} ({(a (x) a {(x)}^{T})}_{i j} π (x)) = 0 .

We studied a particular class of the vector field b for a given invariant distribution

π

.

Assumption (Gradient flow formulation): Suppose that b, a, and

π

satisfy the relation:

b = a \otimes \nabla a + a a^{T} \nabla log π,

(7)

where

a \otimes \nabla a \in R^{n + m}

represents, for

\hat{k} = 1, \dots, n + m

,

\begin{matrix} {(a \otimes \nabla a)}_{\hat{k}} & = & \sum_{k = 1}^{n} \sum_{k^{'} = 1}^{n + m} a_{\hat{k} k} \frac{\partial}{\partial x_{k^{'}}} a_{k^{'} k} . \end{matrix}

(8)

In the Itô formulation,

\tilde{b}

, a, and

π

satisfy

{\tilde{b}}_{l} = {(a a^{T} \nabla log π)}_{l} + {(a \otimes \nabla a)}_{l} + {(\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i})}_{l},

for

l = 1, \dots, n + m

. In this case, we can reformulate Equation (6) as

\partial_{t} ρ (t, x) = \nabla \cdot (ρ (t, x) a (x) a {(x)}^{T} \nabla log \frac{ρ (t, x)}{π (x)}) .

(9)

We leave the derivation of Formula (9) to Appendix A. If

ρ (t, x) = π (x)

, then

log \frac{ρ (t, x)}{π (x)} = 0

, and

π

is an invariant density function for SDE (2). In Section 4, we demonstrate that Fokker–Planck Equation (6), or its equivalent Formulation (9), forms a “horizontal” gradient flow in the sub-Riemannian density manifold. We designed a Lyapunov functional to study the convergence behavior of this “horizontal” gradient flow (9).

Remark 1.

Formula (9) can be written as

ρ (t, x) \partial_{t} (log ρ (t, x) - log π (x)) = \nabla \cdot (ρ (t, x) a (x) a {(x)}^{T} \nabla log \frac{ρ (t, x)}{π (x)}) .

It has a weak formulation that

\int_{R^{n + m}} (\partial_{t} log \frac{ρ (t, x)}{π (x)}, ϕ (x)) ρ (t, x) d x = - \int_{R^{n + m}} (\nabla ϕ (x), a (x) a {(x)}^{T} \nabla log \frac{ρ (t, x)}{π (x)}) ρ (t, x) d x,

where

ϕ \in C^{\infty} (R^{n + m})

is a smooth test function.
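As a worked consequence of Formulation (9), one can differentiate the Kullback–Leibler divergence \int ρ log (ρ / π) d x along the flow. The computation below is a sketch under the assumption that ρ and its derivatives decay fast enough at infinity for the boundary terms in the integration by parts to vanish; the second integral in the first line is zero because the total mass of ρ is conserved.

```latex
\begin{aligned}
\frac{d}{dt} \int_{R^{n+m}} \rho \log\frac{\rho}{\pi}\, dx
&= \int_{R^{n+m}} \partial_t \rho \, \log\frac{\rho}{\pi}\, dx
   + \int_{R^{n+m}} \partial_t \rho \, dx \\
&= \int_{R^{n+m}} \nabla \cdot \Big( \rho\, a a^{T} \nabla \log\frac{\rho}{\pi} \Big)
   \log\frac{\rho}{\pi}\, dx \\
&= - \int_{R^{n+m}} \Big\langle \nabla \log\frac{\rho}{\pi},\,
   a a^{T} \nabla \log\frac{\rho}{\pi} \Big\rangle \rho \, dx \;\le\; 0 .
\end{aligned}
```

The dissipation only involves the a a^{T} directions. Since a a^{T} is degenerate, it does not by itself control the full gradient of log (ρ / π), which is one way to read the motivation for adding an auxiliary z direction in the Lyapunov functional.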

Remark 2

(Non-gradient flow drift). In fact, the proposed method is not limited to the gradient flow assumption of the drift vector field b in (7). See the details in [35].

2.2. Main Result

We now briefly sketch the main results. Denote a sub-elliptic operator

L : C^{\infty} (R^{n + m}) \to C^{\infty} (R^{n + m})

as follows:

L f = \nabla \cdot (a a^{T} \nabla f) - {〈 a \otimes \nabla a, \nabla f 〉}_{R^{n + m}} + {〈 b, \nabla f 〉}_{R^{n + m}},

where

f \in C^{\infty} (R^{n + m})

.

Definition 1

(Generalized Gamma z calculus). Consider a smooth matrix function

z : R^{n + m} \to R^{(n + m) \times m}

. Denote Gamma one bilinear forms

Γ_{1}, Γ_{1}^{z} : C^{\infty} (R^{n + m}) \times C^{\infty} (R^{n + m}) \to C^{\infty} (R^{n + m})

as

Γ_{1} (f, g) = {〈 a^{T} \nabla f, a^{T} \nabla g 〉}_{R^{n}}, Γ_{1}^{z} (f, g) = {〈 z^{T} \nabla f, z^{T} \nabla g 〉}_{R^{m}} .

Define Gamma two bilinear forms

Γ_{2}, Γ_{2}^{z, π} : C^{\infty} (R^{n + m}) \times C^{\infty} (R^{n + m}) \to C^{\infty} (R^{n + m})

as

Γ_{2} (f, g) = \frac{1}{2} [L Γ_{1} (f, g) - Γ_{1} (L f, g) - Γ_{1} (f, L g)],

and

\begin{matrix} Γ_{2}^{z, π} (f, g) & = & \frac{1}{2} [L Γ_{1}^{z} (f, g) - Γ_{1}^{z} (L f, g) - Γ_{1}^{z} (f, L g)] \end{matrix}

(10)

\begin{matrix} + {div}_{z}^{π} (Γ_{1, \nabla (a a^{T})} (f, g)) - {div}_{a}^{π} (Γ_{1, \nabla (z z^{T})} (f, g)) . \end{matrix}

(11)

Here,

{div}_{a}^{π}

,

{div}_{z}^{π}

are divergence operators defined by

{div}_{a}^{π} (F) = \frac{1}{π} \nabla \cdot (π a a^{T} F), {div}_{z}^{π} (F) = \frac{1}{π} \nabla \cdot (π z z^{T} F),

for any smooth vector field

F \in R^{n + m}

, and

Γ_{1, \nabla (a a^{T})}

,

Γ_{1, \nabla (z z^{T})}

are vector Gamma one bilinear forms defined as

\begin{matrix} Γ_{1, \nabla (a a^{T})} (f, g) & = & 〈 \nabla f, \nabla (a a^{T}) \nabla g 〉 = (〈 \nabla f, \frac{\partial}{\partial x_{\hat{k}}} (a a^{T}) \nabla g 〉)_{\hat{k} = 1}^{n + m}, \\ Γ_{1, \nabla (z z^{T})} (f, g) & = & 〈 \nabla f, \nabla (z z^{T}) \nabla g 〉 = (〈 \nabla f, \frac{\partial}{\partial x_{\hat{k}}} (z z^{T}) \nabla g 〉)_{\hat{k} = 1}^{n + m}, \end{matrix}

with

\begin{matrix} {div}_{z}^{π} (Γ_{1, \nabla (a a^{T})} (f, g)) & = & \frac{\nabla \cdot (z z^{T} π 〈 \nabla f, \nabla (a a^{T}) \nabla g 〉)}{π}, \\ {div}_{a}^{π} (Γ_{1, \nabla (z z^{T})} (f, g)) & = & \frac{\nabla \cdot (a a^{T} π 〈 \nabla f, \nabla (z z^{T}) \nabla g 〉)}{π} . \end{matrix}

We next demonstrate that the summation of

Γ_{2}

and

Γ_{2}^{z, π}

can induce the following decomposition and bilinear forms. They are natural extensions of the classical Bakry–Émery calculus in the Riemannian manifold, i.e., non-degenerate matrix function a.

Notation 1.

For matrix function

a : R^{n + m} \to R^{(n + m) \times n}

, we define matrix Q as

\begin{matrix} Q = (\begin{matrix} a_{11}^{T} a_{11}^{T} & \dots & a_{1 (n + m)}^{T} a_{1 (n + m)}^{T} \\ \dots & a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} & \dots \\ a_{n 1}^{T} a_{n 1}^{T} & \dots & a_{n (n + m)}^{T} a_{n (n + m)}^{T} \end{matrix}) \in R^{n^{2} \times {(n + m)}^{2}}, \end{matrix}

(12)

with

Q_{i k \hat{i} \hat{k}} = a_{i \hat{i}}^{T} a_{k \hat{k}}^{T}

. More precisely, for each row (respectively, column) of Q, the row (respectively column) indices of

Q_{i k \hat{i} \hat{k}}

follow

\sum_{i = 1}^{n} \sum_{k = 1}^{n}

(respectively,

\sum_{\hat{i} = 1}^{n + m} \sum_{\hat{k} = 1}^{n + m}

). For matrix function

z : R^{n + m} \to R^{(n + m) \times m}

, we define matrix P as

\begin{matrix} P = (\begin{matrix} z_{11}^{T} a_{11}^{T} & \dots & z_{1 (n + m)}^{T} a_{1 (n + m)}^{T} \\ \dots & z_{i \hat{i}}^{T} a_{k \hat{k}}^{T} & \dots \\ z_{m \hat{1}}^{T} a_{n \hat{1}}^{T} & \dots & z_{m (n + m)}^{T} a_{n (n + m)}^{T} \end{matrix}) \in R^{(n m) \times {(n + m)}^{2}}, \end{matrix}

(13)

with

P_{i k \hat{i} \hat{k}} = z_{k \hat{k}}^{T} a_{i \hat{i}}^{T}

. For smooth function

f \in C^{\infty} (R^{n + m})

, for any

\hat{i}, \hat{k}, \hat{j} = 1, \dots, n + m

and

i, k = 1, \dots, n

(or

1, \dots, m)

, we define vector

C \in R^{{(n + m)}^{2} \times 1}

with components

\begin{matrix} C_{\hat{i} \hat{k}} = [\sum_{i, k = 1}^{n} \sum_{i^{'} = 1}^{n + m} ({〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}), {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} - {〈 a_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}, {(a^{T} \nabla)}_{k} f 〉}_{R^{n}})], \end{matrix}

(14)

where we denote

{(a^{T} \nabla)}_{k} f = \sum_{k^{'} = 1}^{n + m} a_{k k^{'}}^{T} \frac{\partial f}{\partial x_{k^{'}}}

. We define vector

D \in R^{n^{2} \times 1}

with components

\begin{matrix} D_{i k} = \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, and D^{T} D = \sum_{i, k} D_{i k} D_{i k} . \end{matrix}

(15)

We define vector

F \in R^{{(n + m)}^{2} \times 1}

with components

\begin{matrix} F_{\hat{i} \hat{k}} = [\sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'} = 1}^{n + m} ({〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{i^{'}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} - {〈 z_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}})] . \end{matrix}

(16)

We define vector

E \in R^{(n \times m) \times 1}

with components

\begin{matrix} E_{i k} = \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, and E^{T} E = \sum_{i, k} E_{i k} E_{i k} . \end{matrix}

(17)

We define vector

G \in R^{{(n + m)}^{2} \times 1}

with components

\begin{matrix} G_{\hat{i} \hat{j}} & = & \sum_{i = 1}^{n} \sum_{j = 1}^{m} \sum_{j^{'}, \hat{j}, i^{'}, \hat{i} = 1}^{n + m} [(z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} + z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} a_{i \hat{i}}^{T}) \\ - (a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} + a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} z_{j \hat{j}}^{T})] . \end{matrix}

(18)

We define X as the vectorization of the Hessian matrix of function f:

\begin{matrix} X^{T} = (\begin{matrix} \frac{\partial^{2} f}{\partial x_{1} \partial x_{1}} & \dots & \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} & \dots & \frac{\partial^{2} f}{\partial x_{n + m} \partial x_{n + m}} \end{matrix}) \in R^{1 \times {(n + m)}^{2}} . \end{matrix}

(19)
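To make the index conventions in Notation 1 concrete, the sketch below assembles Q and P for the Heisenberg matrices of Section 3.1 and checks that Q X and P X are the flattened matrices a^{T} (Hess f) a and a^{T} (Hess f) z. This is an illustration under stated assumptions: rows are ordered by (i, k) and columns by (\hat{i}, \hat{k}) in row-major order, the Hessian entries are arbitrary test numbers, and all function names are hypothetical.

```python
# Sketch: build Q (Eq. 12) and P (Eq. 13) from a^T and z^T, and compare
# Q X, P X with the flattened products a^T H a and a^T H z.
# Row-major index ordering is an assumption of this sketch.

def build_Q(aT):
    """Q[(i,k)][(ih,kh)] = aT[i][ih] * aT[k][kh]."""
    n, d = len(aT), len(aT[0])
    return [[aT[i][ih] * aT[k][kh] for ih in range(d) for kh in range(d)]
            for i in range(n) for k in range(n)]

def build_P(aT, zT):
    """P[(i,k)][(ih,kh)] = zT[k][kh] * aT[i][ih]."""
    n, m, d = len(aT), len(zT), len(aT[0])
    return [[zT[k][kh] * aT[i][ih] for ih in range(d) for kh in range(d)]
            for i in range(n) for k in range(m)]

def vectorize(H):
    """Row-major vectorization of a square matrix (the vector X)."""
    return [H[ih][kh] for ih in range(len(H)) for kh in range(len(H))]

def matvec(M, v):
    return [sum(r * c for r, c in zip(row, v)) for row in M]

def sandwich(lT, H, rT):
    """Flattened matrix lT * H * rT^T, computed in two steps."""
    d = len(H)
    LH = [[sum(row[ih] * H[ih][kh] for ih in range(d)) for kh in range(d)]
          for row in lT]
    return [sum(LH[i][kh] * rT[k][kh] for kh in range(d))
            for i in range(len(lT)) for k in range(len(rT))]

# Heisenberg matrices at an arbitrary point (x, y); H is a made-up Hessian.
x, y = 0.4, -1.0
aT = [[1.0, 0.0, -y / 2.0], [0.0, 1.0, x / 2.0]]
zT = [[0.0, 0.0, 1.0]]
H = [[2.0, 0.5, -1.0], [0.5, 3.0, 0.25], [-1.0, 0.25, 1.5]]
Q, P, X = build_Q(aT), build_P(aT, zT), vectorize(H)
QX, PX = matvec(Q, X), matvec(P, X)
```

With n = 2 and m = 1, Q has shape 4 x 9 and P has shape 2 x 9, matching the dimensions stated in (12) and (13). In this reading, X^{T} Q^{T} Q X is the squared Frobenius norm of a^{T} (Hess f) a.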

Assumption 1.

Assume that there exist vectors

Λ_{1}, Λ_{2} \in R^{{(n + m)}^{2} \times 1}

such that

\begin{matrix} {(Q^{T} Q Λ_{1} + P^{T} P Λ_{2})}^{T} X & = & {(F + C + G + Q^{T} D + P^{T} E)}^{T} X . \end{matrix}

Definition 2

(Hessian matrix). For smooth function

f \in C^{\infty} (R^{n + m})

, define a matrix function

R : R^{n + m} \to R^{(n + m) \times (n + m)}

as

\begin{matrix} R (\nabla f, \nabla f) = - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E + (R_{a b} + R_{z b} + R_{π}) (\nabla f, \nabla f), \end{matrix}

where we define the following bilinear forms:

\begin{matrix} R_{a b} (\nabla f, \nabla f) & = & R_{a} (\nabla f, \nabla f) - \sum_{i = 1}^{n} \sum_{\hat{i}, \hat{k} = 1}^{n + m} {〈 (a_{i \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla f)}_{i} 〉}_{R^{n}}, \\ R_{a} (\nabla f, \nabla f) & = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \\ - \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{k \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \\ - \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉}_{R^{n}}, \\ R_{z b} (\nabla f, \nabla f) & = & R_{z} (\nabla f, \nabla f) - \sum_{i = 1}^{m} \sum_{\hat{i}, \hat{k} = 1}^{n + m} {〈 (z_{i \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial z_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla f)}_{i} 〉}_{R^{m}}, \\ R_{z} (\nabla f, \nabla f) & = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ - \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ - \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}}, \end{matrix}

and

\begin{matrix} R_{π} (\nabla f, \nabla f) & = & 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [\frac{\partial}{\partial x_{k^{'}}} z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \\ + 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2}}{\partial x_{k^{'}} \partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} \frac{\partial}{\partial x_{k^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \\ + 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{\hat{k}, \hat{i}, i^{'} = 1}^{n + m} {(z^{T} \nabla log π)}_{k} [z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [\frac{\partial}{\partial x_{l^{'}}} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2}}{\partial x_{l^{'}} \partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} \frac{\partial}{\partial x_{l^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{\hat{l}, \hat{j}, j^{'} = 1}^{n + m} {(a^{T} \nabla log π)}_{l} [a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] . \end{matrix}

Here, we also denote

R = R (x) \in R^{(n + m) \times (n + m)}

, such that

{(\nabla f)}^{T} R (x) \nabla f = R (\nabla f, \nabla f)

.

The main theorem is presented below, and its proof is postponed to Theorem 3 in Section 5.

Theorem 1

(Generalized z Bochner’s formula). If Assumption 1 is satisfied, then the following decomposition holds:

\begin{matrix} Γ_{2} (f, f) + Γ_{2}^{z, π} (f, f) & = & {∥ {Hess}_{a, z} f ∥}^{2} + R (\nabla f, \nabla f), \end{matrix}

where we define

\begin{matrix} {∥ {Hess}_{a, z} f ∥}^{2} & = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}], \\ R (\nabla f, \nabla f) & = & - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ + R_{a b} (\nabla f, \nabla f) + R_{z b} (\nabla f, \nabla f) + R_{π} (\nabla f, \nabla f) . \end{matrix}
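To see where Assumption 1 enters, it may help to expand the two squares. The following is a short algebraic check that uses only the symmetry of Q^{T} Q and P^{T} P; the last equality is exactly Assumption 1.

```latex
\begin{aligned}
& [X + \Lambda_1]^{T} Q^{T} Q [X + \Lambda_1]
  + [X + \Lambda_2]^{T} P^{T} P [X + \Lambda_2]
  - \Lambda_1^{T} Q^{T} Q \Lambda_1 - \Lambda_2^{T} P^{T} P \Lambda_2 \\
&\quad = X^{T} (Q^{T} Q + P^{T} P) X
  + 2\, (Q^{T} Q \Lambda_1 + P^{T} P \Lambda_2)^{T} X \\
&\quad = X^{T} (Q^{T} Q + P^{T} P) X
  + 2\, (F + C + G + Q^{T} D + P^{T} E)^{T} X .
\end{aligned}
```

Read this way, the assumption lets the terms that are linear in the Hessian vector X be absorbed into the squared norm, so that the remainder R (\nabla f, \nabla f) in Definition 2 is expressed through \nabla f.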

We are now ready to present the convergence property of the degenerate drift–diffusion process (1) and related functional inequalities. Denote the Kullback–Leibler divergence as

D_{KL} (ρ ∥ π) : = \int_{R^{n + m}} ρ (x) log \frac{ρ (x)}{π (x)} d x .

Denote the

a, z

-relative Fisher information functional as

\begin{matrix} I_{a, z} (ρ) : = \int_{R^{n + m}} (\nabla log \frac{ρ}{π}, a a^{T} \nabla log \frac{ρ}{π}) ρ d x + \int_{R^{n + m}} (\nabla log \frac{ρ}{π}, z z^{T} \nabla log \frac{ρ}{π}) ρ d x . \end{matrix}

Theorem 2

(Exponential convergence in the

L^{1}

distance). Suppose there exists a constant

κ > 0

such that

R ⪰ κ (a a^{T} + z z^{T}) .

Let

ρ_{0}

be a smooth initial distribution and

ρ = ρ (t, x)

be the probability density function of (1). Then, ρ converges to the invariant measure π in the sense of

I_{a, z} (ρ) \leq e^{- 2 κ t} I_{a, z} (ρ_{0}) .

In addition,

\int_{R^{n + m}} | ρ (t, x) - π (x) | d x \leq \sqrt{2 D_{KL} (ρ_{0} ∥ π)} e^{- κ t} .

The proof of Theorem 2 is postponed to Proposition (14).
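The L^{1} bound in Theorem 2 translates directly into a waiting time for a target accuracy. The sketch below only does this arithmetic; the function names and the sample values of κ and D_{KL} (ρ_{0} ∥ π) are illustrative assumptions, not quantities from the paper.

```python
import math

# Sketch: evaluate the right-hand side of the L^1 bound in Theorem 2,
#   sqrt(2 * KL0) * exp(-kappa * t),
# and solve for the time at which it drops below a tolerance eps.
# Names and sample values are hypothetical.

def l1_bound(t, kappa, kl0):
    """Upper bound on the L^1 distance at time t."""
    return math.sqrt(2.0 * kl0) * math.exp(-kappa * t)

def time_to_tolerance(eps, kappa, kl0):
    """Smallest t >= 0 for which l1_bound(t, kappa, kl0) <= eps."""
    c = math.sqrt(2.0 * kl0)
    if eps >= c:
        return 0.0
    return math.log(c / eps) / kappa
```

For instance, with kappa = 0.5 and an initial divergence of 0.5, the bound starts at 1 and `time_to_tolerance(1e-3, 0.5, 0.5)` equals log(1000) / 0.5. Halving κ doubles the waiting time, which is why a sharp curvature lower bound matters in practice.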

Remark 3

(Functional inequalities). Suppose

R ⪰ κ (a a^{T} + z z^{T})

with

κ > 0

, then the z-log-Sobolev inequality holds:

\int_{R^{n + m}} ρ log \frac{ρ}{π} d x \leq \frac{1}{2 κ} I_{a, z} (ρ),

for any smooth density function ρ.

Remark 4.

In the literature [2], the

Γ_{2}^{z}

operator is defined by (10), i.e.,

Γ_{2}^{z} (f, f) = \frac{1}{2} L Γ_{1}^{z} (f, f) - Γ_{1}^{z} (L f, f)

. In fact, this definition is under the assumption of

Γ_{1} (Γ_{1}^{z} (f, f), f) = Γ_{1}^{z} (Γ_{1} (f, f), f)

. This assumption holds true only for special choices of a and z. In the generalized Gamma z calculus, we introduce a new term (11), which removes the assumption

Γ_{1} (Γ_{1}^{z} (f, f), f) = Γ_{1}^{z} (Γ_{1} (f, f), f)

. In fact, in the paper, we show that (11) is exactly the new bilinear form behind the assumption in [2] by considering the weak form.

Remark 5.

Following [35] (Assumption 1), we know that, for any

i \in {1, \dots, n}

and

k \in {1, \dots, m}

, if

\begin{matrix} z_{k}^{T} \nabla a_{i}^{T} \in Span {a_{1}^{T}, \dots, a_{n}^{T}}, \end{matrix}

(20)

there exist vectors

{\hat{Λ}}_{1}

and

{\hat{Λ}}_{2}

, such that the Hessian operator associated with the generator of the SDE and the metric

{(a a^{T})}^{†}

could be represented as

{∥ Hess f ∥}^{2} = {[Q X + {\hat{Λ}}_{1}]}^{T} [Q X + {\hat{Λ}}_{1}] + {[P X + {\hat{Λ}}_{2}]}^{T} [P X + {\hat{Λ}}_{2}] .

Furthermore, we have the following relation:

\begin{matrix} {[Q X + {\hat{Λ}}_{1}]}^{T} [Q X + {\hat{Λ}}_{1}] + {[P X + {\hat{Λ}}_{2}]}^{T} [P X + {\hat{Λ}}_{2}] - {\hat{Λ}}_{1}^{T} {\hat{Λ}}_{1} - {\hat{Λ}}_{2}^{T} {\hat{Λ}}_{2} \\ = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}] - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2}, \end{matrix}

if there exist

Λ_{1}

and

Λ_{2}

as in Assumption 1 such that

\begin{matrix} {\hat{Λ}}_{1}^{T} = Λ_{1}^{T} Q^{T} and {\hat{Λ}}_{2}^{T} = Λ_{2}^{T} P^{T} . \end{matrix}

(21)

Assumption 1 is true if Conditions (20) and (21) hold. See the detailed connections in [35] (Remark 11).

3. Examples

In this section, we consider the following degenerate drift–diffusion process:

d X_{t} = - a (X_{t}) a {(X_{t})}^{T} \nabla V (X_{t}) d t + \sqrt{2} a (X_{t}) \circ d B_{t},

(22)

where

a : R^{n + m} \to R^{(n + m) \times n}

is a matrix-valued function, for

n, m \in Z_{+}

, and

V \in C^{\infty} (R^{n + m})

is a smooth potential function. We denote the invariant measure of SDE (22) as

π

. We further assumed that

\begin{matrix} - a a^{T} \nabla V = a \otimes \nabla a + a a^{T} \nabla log π . \end{matrix}

The above assumption holds for the three examples below.
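For intuition, SDE (22) can be simulated for the Heisenberg matrix a of Section 3.1. The sketch below uses a plain Euler–Maruyama step, which discretizes the Itô form; for this particular a the Stratonovich-to-Itô correction in (4) vanishes, so the drift is unchanged. The quadratic potential V (x) = |x|^{2} / 2, the step size, and all function names are assumptions made for illustration only.

```python
import math
import random

# Sketch: Euler-Maruyama for dX = -a a^T grad V dt + sqrt(2) a dB
# with the Heisenberg matrix a (n = 2, m = 1).
# The potential V(x) = |x|^2 / 2 is a hypothetical choice.

def a_T(x):
    """Rows a_1^T, a_2^T of the 2 x 3 matrix a^T at the point x."""
    return [[1.0, 0.0, -x[1] / 2.0],
            [0.0, 1.0, x[0] / 2.0]]

def grad_V(x):
    """Gradient of the assumed potential V(x) = |x|^2 / 2."""
    return list(x)

def euler_maruyama(x0, dt, steps, rng):
    """Return the state after `steps` Euler-Maruyama steps of size dt."""
    x = list(x0)
    for _ in range(steps):
        rows = a_T(x)
        g = grad_V(x)
        # Horizontal gradient a^T grad V, a vector in R^2.
        hg = [sum(r[j] * g[j] for j in range(3)) for r in rows]
        # Independent Brownian increments, one per column of a.
        dB = [rng.gauss(0.0, math.sqrt(dt)) for _ in rows]
        x = [x[j] + sum(rows[i][j] * (-hg[i] * dt + math.sqrt(2.0) * dB[i])
                        for i in range(2))
             for j in range(3)]
    return x
```

A call such as `euler_maruyama([1.0, 0.0, 0.0], 0.01, 1000, random.Random(0))` returns one sample path endpoint. Only two Brownian components drive three coordinates, so motion in the z coordinate arises solely through the x- and y-dependent entries of a.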

Remark 6.

For

V = 0

, the invariant measure π in the above assumption exists if

{a_{1}, \dots, a_{n}}

forms left-invariant structures on unimodular Lie groups. In this case, the sub-Laplacian is the sum of squares of horizontal vector fields and the invariant measure is also symmetric. Stratonovich SDE (22) defines the horizontal Brownian motion on sub-Riemannian structure

(R^{n + m}, τ, {(a a^{T})}^{†} |_{τ})

, and π is the volume form associated with the horizontal Laplacian. In general, if the Lie group structure is not unimodular, the drift

b \neq 0

. See the related studies about the diffusion process on general manifolds in [36,37,38,39,40,41,42,43]. See the related studies on log-Sobolev inequality in [44,45].

Remark 7.

It is also worth mentioning that many sub-Riemannian manifolds are non-compact. Hence, there may not exist a positive constant κ for both classical

Γ_{1}

and

Γ_{1}^{z}

directions in the non-compact domain. The non-compactness of the domain brings additional difficulties. To prove the associated inequalities in this case, we need to extend the result derived in [46,47]. This is a direction for future work.

Remark 8.

It is known that the Heisenberg group is an example of Lie groups in quantum mechanics [48]. In future work, we shall investigate the general convergence analysis of SDEs in Lie groups and their connections with quantum SDEs.

3.1. Heisenberg Group

In this subsection, we apply our general theory to a well-known example in sub-Riemannian geometry, the Heisenberg group. A related log-Sobolev inequality (LSI) for the horizontal Wiener measure was studied in [46]. Recall briefly that the Heisenberg group

H^{1}

admits left-invariant vector fields:

X = \frac{\partial}{\partial x} - \frac{1}{2} y \frac{\partial}{\partial z}, Y = \frac{\partial}{\partial y} + \frac{1}{2} x \frac{\partial}{\partial z}, Z = \frac{\partial}{\partial z}

. Here,

{X, Y, Z}

forms an orthonormal basis for the tangent bundle of

H^{1}

. In this case,

π = e^{- V}

. In particular, X and Y generate the horizontal distribution

τ

. To fit into our general theory from the previous section, we take matrices a and z as below:

\begin{matrix} a^{T} = (\begin{matrix} 1 & 0 & - y / 2 \\ 0 & 1 & x / 2 \end{matrix}), z^{T} = (0, 0, 1) . \end{matrix}

(23)

In particular, we have

\begin{matrix} a^{T} \nabla f = {({(a^{T} \nabla)}_{1} f, {(a^{T} \nabla)}_{2} f)}^{T}, {(a^{T} \nabla)}_{1} f = (\frac{\partial f}{\partial x} - \frac{y}{2} \frac{\partial f}{\partial z}), {(a^{T} \nabla)}_{2} f = (\frac{\partial f}{\partial y} + \frac{x}{2} \frac{\partial f}{\partial z}) . \end{matrix}
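The bracket relation [X, Y] = Z behind this choice of a and z can be checked exactly on polynomial test functions. The sketch below is only an illustration, not part of the original argument: the dictionary representation of polynomials and all helper names (`add`, `times_monomial`, `diff`, `commutator`) are assumptions introduced here.

```python
from fractions import Fraction

# A polynomial in (x, y, z) is a dict {(i, j, k): coeff}, meaning coeff * x^i y^j z^k.
# This representation and every helper below are illustrative assumptions.

def add(p, q, scale=1):
    """Return p + scale * q, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}

def times_monomial(p, shift, coeff):
    """Multiply p by coeff * x^shift[0] y^shift[1] z^shift[2]."""
    return {tuple(e + s for e, s in zip(m, shift)): coeff * c for m, c in p.items()}

def diff(p, var):
    """Partial derivative with respect to variable index var (0 = x, 1 = y, 2 = z)."""
    out = {}
    for m, c in p.items():
        if m[var] > 0:
            lowered = tuple(e - (1 if i == var else 0) for i, e in enumerate(m))
            out[lowered] = out.get(lowered, 0) + c * m[var]
    return out

HALF = Fraction(1, 2)

def X(f):
    """(a^T grad)_1 f = f_x - (y/2) f_z."""
    return add(diff(f, 0), times_monomial(diff(f, 2), (0, 1, 0), HALF), scale=-1)

def Y(f):
    """(a^T grad)_2 f = f_y + (x/2) f_z."""
    return add(diff(f, 1), times_monomial(diff(f, 2), (1, 0, 0), HALF))

def Z(f):
    """(z^T grad)_1 f = f_z."""
    return diff(f, 2)

def commutator(f):
    """[X, Y] f = X(Y f) - Y(X f)."""
    return add(X(Y(f)), Y(X(f)), scale=-1)

# Sample polynomial f = x^2 y z^3 - 4 y^2 z + 5 x y z.
f = {(2, 1, 3): Fraction(1), (0, 2, 1): Fraction(-4), (1, 1, 1): Fraction(5)}
print(commutator(f) == Z(f))
```

Exact rational arithmetic is used so that the comparison is an identity of polynomials rather than a floating-point approximation.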

Following Theorem 1, we have the following proposition for the Heisenberg group.

Proposition 1.

For any smooth function

f \in C^{\infty} (H^{1})

, one has

\begin{matrix} Γ_{2} (f, f) + Γ_{2}^{z, π} (f, f) & = & ∥ {Hess}_{a, z} {f ∥}^{2} + R (\nabla f, \nabla f), \end{matrix}

where

\begin{matrix} Λ_{1}^{T} & = & (0, 0, 0, 0, 0, 0, 0, 0, 0); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, {(a^{T} \nabla)}_{2} f, - {(a^{T} \nabla)}_{1} f, 0); \\ R_{a b} (\nabla f, \nabla f) - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & - Γ_{1} (f, f) + \frac{1}{2} Γ_{1}^{z} (f, f) - {(a^{T} \nabla)}_{1} V \partial_{z} f {(a^{T} \nabla)}_{2} f \\ + {(a^{T} \nabla)}_{2} V \partial_{z} f {(a^{T} \nabla)}_{1} f \\ + [\frac{\partial^{2} V}{\partial x \partial x} + \frac{y^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} - y \frac{\partial^{2} V}{\partial x \partial z}] {| {(a^{T} \nabla)}_{1} f |}^{2} \\ + [\frac{\partial^{2} V}{\partial y \partial y} + \frac{x^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} + x \frac{\partial^{2} V}{\partial y \partial z}] {| {(a^{T} \nabla)}_{2} f |}^{2} \\ + 2 [\frac{\partial^{2} V}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} V}{\partial z \partial z}] {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \\ R_{z b} (\nabla f, \nabla f) & = & (\frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f \\ + (\frac{\partial^{2} V}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(z^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \\ R_{π} (\nabla f, \nabla f) & = & 0 . \end{matrix}

The proof of Proposition 1 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 1–3. The following convergence result follows directly from Theorem 2.

Proposition 2.

If there exists

κ > 0

as shown in Theorem 2, the exponential dissipation result in the

L^{1}

distance holds:

\int | ρ (t, x) - π (x) | d x = O (e^{- κ t}) .
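To make the process behind the density ρ(t, x) concrete, the following is a minimal simulation sketch, assuming the drift b = -a a^T ∇V used in the curvature computations of this subsection and the matrix a from (23). The Euler–Maruyama discretization, the function names, and the sample potential are assumptions for illustration only. For this particular a, each column is constant along its own direction, so the Stratonovich-to-Itô correction vanishes and the Itô-type step below is consistent with the Stratonovich form.

```python
import math
import random

def a_transpose(p):
    """Rows of a^T from (23) at the point p = (x, y, z)."""
    x, y, _ = p
    return ((1.0, 0.0, -0.5 * y), (0.0, 1.0, 0.5 * x))

def drift(p, grad_V):
    """b = -a a^T grad V (assumed drift; grad_V is a user-supplied callable)."""
    rows = a_transpose(p)
    g = grad_V(p)
    horiz = [sum(r[j] * g[j] for j in range(3)) for r in rows]  # a^T grad V
    return tuple(-sum(rows[k][j] * horiz[k] for k in range(2)) for j in range(3))

def euler_maruyama_step(p, grad_V, dt, dB):
    """One step p + b dt + sqrt(2) a dB, with dB a pair of Brownian increments."""
    rows = a_transpose(p)
    b = drift(p, grad_V)
    return tuple(
        p[j] + b[j] * dt + math.sqrt(2.0) * sum(rows[k][j] * dB[k] for k in range(2))
        for j in range(3)
    )

def simulate(p0, grad_V, dt, n_steps, rng):
    """Run n_steps Euler-Maruyama steps from p0 using the generator rng."""
    p = tuple(p0)
    for _ in range(n_steps):
        dB = (rng.gauss(0.0, math.sqrt(dt)), rng.gauss(0.0, math.sqrt(dt)))
        p = euler_maruyama_step(p, grad_V, dt, dB)
    return p

# Hypothetical example: V(p) = |p|^2 / 2, so grad V(p) = p.
endpoint = simulate((1.0, 0.0, 0.0), lambda p: p, 1e-3, 1000, random.Random(0))
```

A histogram of many such endpoints could be compared with π; the sketch makes no claim about the discretization error of that comparison.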

We next formulate the curvature tensor into a matrix format. Denote

\begin{matrix} U = {({(a^{T} \nabla)}_{1} f, {(a^{T} \nabla)}_{2} f, {(z^{T} \nabla)}_{1} f)}_{3 \times 1}, \end{matrix}

(24)

and denote

I_{3 \times 3}

as the identity matrix. With a slight abuse of notation, there exists a symmetric matrix

R

such that we can represent the tensor as below.

\begin{matrix} R (\nabla f, \nabla f) = {(U)}^{T} \cdot R \cdot U, \end{matrix}

(25)

which implies that

\begin{matrix} R ⪰ κ (a a^{T} + z z^{T}) \Rightarrow R (\nabla f, \nabla f) ⪰ κ (Γ_{1} (f, f) + Γ_{1}^{z} (f, f)) . \end{matrix}

In other words, we need to estimate the smallest eigenvalue of matrix

R

. We next present the formulation of matrix

R

for the Heisenberg group as follows.

Corollary 1.

The matrix

R

associated with the Heisenberg group has the following form:

\begin{matrix} R_{11} & = & [\frac{\partial^{2} V}{\partial x \partial x} + \frac{y^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} - y \frac{\partial^{2} V}{\partial x \partial z}] - 1; \\ R_{22} & = & [\frac{\partial^{2} V}{\partial y \partial y} + \frac{x^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} + x \frac{\partial^{2} V}{\partial y \partial z}] - 1; R_{33} = \frac{1}{2}; \\ R_{12} & = & R_{21} = [\frac{\partial^{2} V}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} V}{\partial z \partial z}]; \\ R_{13} & = & R_{31} = \frac{1}{2} {(a^{T} \nabla)}_{2} V + \frac{1}{2} (\frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial z \partial z}); \\ R_{23} & = & R_{32} = - \frac{1}{2} {(a^{T} \nabla)}_{1} V + \frac{1}{2} (\frac{\partial^{2} V}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} V}{\partial z \partial z}) . \end{matrix}

Proof.

The explicit form of matrix

R

follows from the definition in Theorem 1 and the notation in (24) and (25). We have

\begin{matrix} R (\nabla f, \nabla f) & = & - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ + R_{a b} (\nabla f, \nabla f) + R_{z b} (\nabla f, \nabla f) + R_{π} (\nabla f, \nabla f) \\ = & {(U)}^{T} \cdot R \cdot U . \end{matrix}

Plugging the explicit representation from Proposition 1 into the above formula and applying matrix symmetrization for the off-diagonal terms, we obtain the desired matrix

R

. □
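As a numerical illustration of Corollary 1, the entries of the matrix R can be assembled from the first and second partial derivatives of V at a point, and the smallest eigenvalue can then be read off. This is a sketch under stated assumptions: the function names and the sample potential V = c(x² + y² + z²)/2 are hypothetical, and the eigenvalue routine is the standard trigonometric closed form for symmetric 3 × 3 matrices, not a method taken from the source.

```python
import math

def heisenberg_R(x, y, dV, d2V):
    """Matrix R of Corollary 1 at a point with coordinates x, y.

    dV = (V_x, V_y, V_z); d2V[i][j] holds the second partials of V
    with indices 0 = x, 1 = y, 2 = z, all evaluated at the same point.
    """
    Vx, Vy, Vz = dV
    Vxx, Vxy, Vxz = d2V[0]
    Vyy, Vyz = d2V[1][1], d2V[1][2]
    Vzz = d2V[2][2]
    aV1 = Vx - 0.5 * y * Vz  # (a^T grad)_1 V
    aV2 = Vy + 0.5 * x * Vz  # (a^T grad)_2 V
    R11 = Vxx + 0.25 * y * y * Vzz - y * Vxz - 1.0
    R22 = Vyy + 0.25 * x * x * Vzz + x * Vyz - 1.0
    R33 = 0.5
    R12 = Vxy + 0.5 * x * Vxz - 0.5 * y * Vyz - 0.25 * x * y * Vzz
    R13 = 0.5 * aV2 + 0.5 * (Vxz - 0.5 * y * Vzz)
    R23 = -0.5 * aV1 + 0.5 * (Vyz + 0.5 * x * Vzz)
    return [[R11, R12, R13], [R12, R22, R23], [R13, R23, R33]]

def smallest_eigenvalue(A):
    """Smallest eigenvalue of a symmetric 3x3 matrix (trigonometric closed form)."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if p1 == 0.0:
        return min(A[0][0], A[1][1], A[2][2])  # A is already diagonal
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(A[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_B = (B[0][0] * (B[1][1] * B[2][2] - B[1][2] * B[2][1])
             - B[0][1] * (B[1][0] * B[2][2] - B[1][2] * B[2][0])
             + B[0][2] * (B[1][0] * B[2][1] - B[1][1] * B[2][0]))
    r = max(-1.0, min(1.0, det_B / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    return q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)

# Hypothetical potential V = c (x^2 + y^2 + z^2) / 2 with c = 3, evaluated at (1, 0, 0).
c = 3.0
hessian = [[c, 0.0, 0.0], [0.0, c, 0.0], [0.0, 0.0, c]]
R_sample = heisenberg_R(1.0, 0.0, (c, 0.0, 0.0), hessian)
kappa_at_point = smallest_eigenvalue(R_sample)
```

A pointwise value like `kappa_at_point` is only a local quantity; the constant κ in Theorem 2 requires a bound that holds uniformly over the whole domain.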

Next, we present the three key lemmas.

Lemma 1.

For the Heisenberg group, we have

\begin{matrix} Q & = & (\begin{matrix} 1 & 0 & - \frac{y}{2} & 0 & 0 & 0 & - \frac{y}{2} & 0 & \frac{y^{2}}{4} \\ 0 & 1 & \frac{x}{2} & 0 & 0 & 0 & 0 & - \frac{y}{2} & - \frac{x y}{4} \\ 0 & 0 & 0 & 1 & 0 & - \frac{y}{2} & \frac{x}{2} & 0 & - \frac{x y}{4} \\ 0 & 0 & 0 & 0 & 1 & \frac{x}{2} & 0 & \frac{x}{2} & \frac{x^{2}}{4} \end{matrix}); \\ P & = & (\begin{matrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & - \frac{y}{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \frac{x}{2} \end{matrix}); \\ D^{T} & = & (0, \frac{1}{2} \partial_{z} f, - \frac{1}{2} \partial_{z} f, 0); E^{T} = (0, 0); \\ F^{T} & = & G^{T} = (0, 0, 0, 0, 0, 0, 0, 0, 0); \\ C^{T} & = & (0, 0, \frac{x}{4} \partial_{z} f + \frac{1}{2} \partial_{y} f, 0, 0, \frac{y}{4} \partial_{z} f - \frac{1}{2} \partial_{x} f \\ , \frac{x}{4} \partial_{z} f + \frac{1}{2} \partial_{y} f, \frac{y}{4} \partial_{z} f - \frac{1}{2} \partial_{x} f, - \frac{y}{2} \partial_{y} f - \frac{x}{2} \partial_{x} f) . \end{matrix}

Proof.

The proof of this lemma follows from routine computations. Plugging matrices a and z from (23) into Notation 1, we obtain the desired vectors and matrices. We skip the detailed computation here. □

Lemma 2.

On

H^{1}

, vectors F and G are zero vectors, and we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X \\ = & ∥ {Hess}_{a, z} {f ∥}^{2} - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E . \end{matrix}

In particular, we have

\begin{matrix} ∥ {Hess}_{a, z} {f ∥}^{2} & = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}]; \\ Λ_{1}^{T} & = & (0, 0, 0, 0, 0, 0, 0, 0, 0); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, {(a^{T} \nabla)}_{2} f, - {(a^{T} \nabla)}_{1} f, 0); \\ - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & - Γ_{1} (f, f) + \frac{1}{2} Γ_{1}^{z} (f, f) . \end{matrix}
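The last identity of Lemma 2 can be sanity-checked numerically from the matrices in Lemma 1. The sketch below assumes Γ_1(f, f) = |a^T ∇f|² and Γ_1^z(f, f) = |z^T ∇f|², consistent with how these quantities are used in this section; the function names and the sample values are illustrative only.

```python
def lemma2_remainder(x, y, fx, fy, fz):
    """Evaluate -Lambda_2^T P^T P Lambda_2 + D^T D for the Heisenberg group.

    Lambda_1 and E are zero vectors here, so their terms do not contribute.
    fx, fy, fz are the partial derivatives of f at the point (x, y, z).
    """
    af1 = fx - 0.5 * y * fz  # (a^T grad)_1 f
    af2 = fy + 0.5 * x * fz  # (a^T grad)_2 f
    P = [[0, 0, 0, 0, 0, 0, 1, 0, -0.5 * y],
         [0, 0, 0, 0, 0, 0, 0, 1, 0.5 * x]]
    lam2 = [0, 0, 0, 0, 0, 0, af2, -af1, 0]
    D = [0, 0.5 * fz, -0.5 * fz, 0]
    P_lam2 = [sum(row[j] * lam2[j] for j in range(9)) for row in P]
    return -sum(v * v for v in P_lam2) + sum(d * d for d in D)

def gamma_combination(x, y, fx, fy, fz):
    """-Gamma_1(f, f) + (1/2) Gamma_1^z(f, f) under the assumed definitions."""
    af1 = fx - 0.5 * y * fz
    af2 = fy + 0.5 * x * fz
    return -(af1 ** 2 + af2 ** 2) + 0.5 * fz ** 2

# Arbitrary sample point and gradient values.
sample = (0.3, -1.2, 0.7, -0.4, 1.5)
gap = abs(lemma2_remainder(*sample) - gamma_combination(*sample))
```

The two expressions agree up to rounding because P Λ_2 = ((a^T ∇)_2 f, -(a^T ∇)_1 f) and D^T D = (∂_z f)²/2.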

Lemma 3.

By routine computations, we obtain

\begin{matrix} R_{a b} (\nabla f, \nabla f) & = & - {(a^{T} \nabla)}_{1} V \partial_{z} f {(a^{T} \nabla)}_{2} f + {(a^{T} \nabla)}_{2} V \partial_{z} f {(a^{T} \nabla)}_{1} f \\ + [\frac{\partial^{2} V}{\partial x \partial x} + \frac{y^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} - y \frac{\partial^{2} V}{\partial x \partial z}] {| {(a^{T} \nabla)}_{1} f |}^{2} \\ + [\frac{\partial^{2} V}{\partial y \partial y} + \frac{x^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} + x \frac{\partial^{2} V}{\partial y \partial z}] {| {(a^{T} \nabla)}_{2} f |}^{2} \\ + 2 [\frac{\partial^{2} V}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} V}{\partial z \partial z}] {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \\ R_{z b} (\nabla f, \nabla f) & = & (\frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f \\ + (\frac{\partial^{2} V}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(z^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \\ R_{π} (\nabla f, \nabla f) & = & 0 . \end{matrix}

Proof of Lemma 2.

We first have

\begin{matrix} 2 C^{T} X & = & \sum_{\hat{i}, \hat{k} = 1}^{3} 2 C_{\hat{i} \hat{k}}^{T} X_{\hat{i} \hat{k}} \\ = & 2 [\frac{\partial^{2} f}{\partial x \partial z} \frac{{(a^{T} \nabla)}_{2} f}{2} - \frac{\partial^{2} f}{\partial y \partial z} \frac{{(a^{T} \nabla)}_{1} f}{2} + \frac{\partial^{2} f}{\partial z \partial x} \frac{{(a^{T} \nabla)}_{2} f}{2}] \\ - 2 [\frac{\partial^{2} f}{\partial z \partial y} \frac{{(a^{T} \nabla)}_{1} f}{2} + \frac{\partial^{2} f}{\partial z \partial z} (\frac{y}{2} \partial_{y} f + \frac{x}{2} \partial_{x} f)] \\ = & 2 \frac{\partial^{2} f}{\partial x \partial z} {(a^{T} \nabla)}_{2} f - 2 \frac{\partial^{2} f}{\partial y \partial z} {(a^{T} \nabla)}_{1} f - 2 \frac{\partial^{2} f}{\partial z \partial z} (\frac{y}{2} \partial_{y} f + \frac{x}{2} \partial_{x} f) \\ = & 2 {(a^{T} \nabla)}_{2} f [\frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial z \partial z}] - 2 {(a^{T} \nabla)}_{1} f [\frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial z \partial z}] . \end{matrix}

By direct computations, we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X \\ = & {[\frac{\partial^{2} f}{\partial x \partial x} - y \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z} + \frac{1}{2} \partial_{z} f]}^{2} \\ + {[\frac{\partial^{2} f}{\partial x \partial y} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z} - \frac{1}{2} \partial_{z} f]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial y} + x \frac{\partial^{2} f}{\partial y \partial z} + \frac{x^{2}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} \\ + {[\frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} \\ + 2 {(a^{T} \nabla)}_{2} f [\frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial z \partial z}] - 2 {(a^{T} \nabla)}_{1} f [\frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial z \partial z}] . \end{matrix}

Completing the squares for the cross terms involving the type of

“ \nabla f \nabla^{2} f

” and following the reformulation as below:

\begin{matrix} {[\frac{\partial^{2} f}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z} + \frac{1}{2} \partial_{z} f]}^{2} \\ + {[\frac{\partial^{2} f}{\partial x \partial y} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z} - \frac{1}{2} \partial_{z} f]}^{2} \\ = & 2 {[\frac{\partial^{2} f}{\partial x \partial y} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + \frac{1}{2} {| \partial_{z} f |}^{2}, \end{matrix}

we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X \\ = & {[\frac{\partial^{2} f}{\partial x \partial x} - y \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + 2 {[\frac{\partial^{2} f}{\partial x \partial y} - \frac{y}{2} \frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial x \partial z} - \frac{x y}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} \\ + {[\frac{\partial^{2} f}{\partial y \partial y} + x \frac{\partial^{2} f}{\partial y \partial z} + \frac{x^{2}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} f}{\partial z \partial z} + {(a^{T} \nabla)}_{2} f]}^{2} \\ + {[\frac{\partial^{2} f}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} f}{\partial z \partial z} - {(a^{T} \nabla)}_{1} f]}^{2} - | {(a^{T} \nabla)}_{2} {f |}^{2} - | {(a^{T} \nabla)}_{1} {f |}^{2} + \frac{1}{2} {| {(z^{T} \nabla)}_{1} f |}^{2} . \end{matrix}

The sum-of-squares terms give

∥ {Hess}_{a, z} f ∥_{F}^{2}

, which determines

Λ_{1}

and

Λ_{2}

. The remainders generate

- Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E

, which equals

- Γ_{1} (f, f) + \frac{1}{2} Γ_{1}^{z} (f, f)

. □

We are now left to compute the tensors.

Proof of Lemma 3.

By direct computation, we have

\begin{matrix} R_{a} (\nabla f, \nabla f) & = & \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ + \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ = & I_{1} + I_{2} + I_{3} + I_{4} . \end{matrix}

For the four terms above, we have

\begin{matrix} I_{1} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f \\ + \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f = 0 \end{matrix}

\begin{matrix} I_{2} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f \\ + \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f = 0; \\ I_{3} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{1 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f \\ - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{2 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f = 0; \\ I_{4} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{1 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f \\ - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} a_{2 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f = 0 . \end{matrix}

Similar computation applies to the tensor terms

R_{π}

and

R_{z b}

. Since z is a constant matrix, we obtain

\begin{matrix} R_{z b} (\nabla f, \nabla f) & = & - \sum_{\hat{i}, \hat{k} = 1}^{3} 〈 (z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla f)}_{1} 〉_{R}, \\ R_{π} & = & 0 . \end{matrix}

We now compute the tensor terms involving the drift b. For the drift term in tensor

R_{a b}

, taking

b = - a a^{T} \nabla V

, which means

b = - {(a_{\hat{k} k} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}})}_{\hat{k} = 1, 2, 3}

in local coordinates,

\begin{matrix} R_{b}^{a} & = & \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ + \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ + \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ - \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{i} f] \\ = & J_{1} + J_{2} + J_{3} + J_{4} . \end{matrix}

We now derive the explicit formulas for the above four terms.

\begin{matrix} J_{1} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & - \frac{1}{2} {(a^{T} \nabla)}_{1} V \partial_{z} f {(a^{T} \nabla)}_{2} f + \frac{1}{2} {(a^{T} \nabla)}_{2} V \partial_{z} f {(a^{T} \nabla)}_{1} f; \end{matrix}

\begin{matrix} J_{2} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & - \frac{1}{2} \frac{\partial V}{\partial z} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f + \frac{1}{2} \frac{\partial V}{\partial z} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f = 0; \\ J_{3} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + a_{2 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2}] \\ = & [\frac{\partial^{2} V}{\partial x \partial x} + \frac{y^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} - y \frac{\partial^{2} V}{\partial x \partial z}] {| {(a^{T} \nabla)}_{1} f |}^{2} \\ + [\frac{\partial^{2} V}{\partial y \partial y} + \frac{x^{2}}{4} \frac{\partial^{2} V}{\partial z \partial z} + x \frac{\partial^{2} V}{\partial y \partial z}] {| {(a^{T} \nabla)}_{2} f |}^{2} \\ + 2 [\frac{\partial^{2} V}{\partial x \partial y} + \frac{x}{2} \frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial y \partial z} - \frac{x y}{4} \frac{\partial^{2} V}{\partial z \partial z}] {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \\ J_{4} & = & - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \\ - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \\ = & - \frac{1}{2} {(a^{T} \nabla)}_{1} V \partial_{z} f {(a^{T} \nabla)}_{2} f + \frac{1}{2} {(a^{T} \nabla)}_{2} V \partial_{z} f {(a^{T} \nabla)}_{1} f . \end{matrix}

Summing up the above formulas, we obtain

R_{a b}

. We now compute the drift tensor term of

R_{z b}

. By taking

b = - a a^{T} \nabla V

, we have

\begin{matrix} R_{b}^{z} (\nabla f, \nabla f) & = & - \sum_{\hat{i}, \hat{k} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla f)}_{1} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla f)}_{1}] \\ = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] \\ = & J_{1}^{z} + J_{2}^{z} + J_{3}^{z} + J_{4}^{z} . \end{matrix}

We further compute as below by taking advantage of the constant matrix z:

\begin{matrix} J_{1}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{2}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{4}^{z} & = & - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] = 0 \\ J_{3}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ = & (\frac{\partial^{2} V}{\partial x \partial z} - \frac{y}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + (\frac{\partial^{2} V}{\partial y \partial z} + \frac{x}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(z^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f . \end{matrix}

The proof is thus completed. □

3.2. Displacement Group

In this subsection, we derive the generalized curvature dimension bound for the displacement group, an example of a three-dimensional solvable Lie group. We adapt the general setting from [49] below. Denote

g

as the three-dimensional solvable Lie algebra, and denote

H \subset g

as the horizontal subspace satisfying Hörmander’s condition. Then, for a given inner product

〈 \cdot, \cdot 〉

on H, there exists a canonical basis

{X, Y, Z}

for

(g, H, 〈 \cdot, \cdot 〉)

, such that

{X, Y}

forms an orthonormal basis for H and satisfies the following Lie-bracket-generating condition for parameters

α

and

β \geq 0

:

[X, Y] = Z, [X, Z] = α Y + β Z, [Y, Z] = 0 .

When the parameters

α = 0

and

β \neq 0

, the Lie algebra

g

has a faithful representation. In particular, it was shown in [49] that the elements of

g

, in local coordinates

(θ, x, y)

, correspond to the following left-invariant differential operators:

\begin{matrix} X = \frac{\partial}{\partial θ}, Y = e^{β θ} \frac{\partial}{\partial x} + \frac{\partial}{\partial y}, R = - β \frac{\partial}{\partial y}, \end{matrix}

with the following relation:

\begin{matrix} [X, Y] = β Y + R, [X, R] = 0, [Y, R] = 0 . \end{matrix}

In terms of local coordinates

(θ, x, y)

, we have

X = (\begin{matrix} 1 \\ 0 \\ 0 \end{matrix}), Y = (\begin{matrix} 0 \\ e^{β θ} \\ 1 \end{matrix}), R = (\begin{matrix} 0 \\ 0 \\ - β \end{matrix}) .
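The relations [X, Y] = βY + R, [X, R] = 0, and [Y, R] = 0 can be checked numerically from these coordinate expressions using the standard formula [U, W]^j = U^i ∂_i W^j - W^i ∂_i U^j. The sketch below is an illustration under assumptions: the value of β, the finite-difference step, and all function names are choices made here, not part of the source.

```python
import math

BETA = 0.7  # hypothetical sample value of the parameter beta (nonzero)

def X(p):
    """Coefficients of X in coordinates (theta, x, y)."""
    return (1.0, 0.0, 0.0)

def Y(p):
    """Coefficients of Y in coordinates (theta, x, y)."""
    theta = p[0]
    return (0.0, math.exp(BETA * theta), 1.0)

def R(p):
    """Coefficients of R in coordinates (theta, x, y)."""
    return (0.0, 0.0, -BETA)

def directional_derivative(field, p, v, h=1e-5):
    """Central-difference derivative of the coefficients of `field` along v at p."""
    plus = field(tuple(pi + h * vi for pi, vi in zip(p, v)))
    minus = field(tuple(pi - h * vi for pi, vi in zip(p, v)))
    return tuple((a - b) / (2.0 * h) for a, b in zip(plus, minus))

def lie_bracket(U, W, p):
    """[U, W]^j = U^i d_i W^j - W^i d_i U^j, evaluated at the point p."""
    dW = directional_derivative(W, p, U(p))
    dU = directional_derivative(U, p, W(p))
    return tuple(a - b for a, b in zip(dW, dU))

point = (0.4, -1.0, 2.0)
bracket_XY = lie_bracket(X, Y, point)
beta_Y_plus_R = tuple(BETA * y + r for y, r in zip(Y(point), R(point)))
```

Here `bracket_XY` and `beta_Y_plus_R` should agree up to finite-difference error; both reduce to (0, β e^{βθ}, 0) in exact arithmetic.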

The corresponding Lie group of this special Lie algebra

g

is called the displacement group, denoted as

G

. We choose

{X, Y}

as the horizontal orthonormal basis for subalgebra H. To fit into the general framework from the previous section, we take

a = (X, Y) = (\begin{matrix} 1 & 0 \\ 0 & e^{β θ} \\ 0 & 1 \end{matrix}), a^{T} = (\begin{matrix} 1 & 0 & 0 \\ 0 & e^{β θ} & 1 \end{matrix}), z^{T} = (\begin{matrix} 0 & 0 & - g (θ, x, y) \end{matrix}),

(26)

with

g (θ, x, y) \neq 0

. Our focus here is to derive the curvature tensor in terms of

π = \frac{1}{Z} e^{- V}

. We then use

{(a a^{T})}_{| H}^{†}

as the horizontal metric on H. Thus, the sub-Riemannian structure is given by

(G, H, {(a a^{T})}_{| H}^{†})

. Direct computation shows that, for a general smooth function f,

Γ_{1} (f, Γ_{1}^{z} (f, f)) \neq Γ_{1}^{z} (f, Γ_{1} (f, f))

. Hence, the classical Gamma z calculus proposed in [2] cannot be extended to this case to derive the zLSI. Thus, we need to compute the vector G and the tensor term

R_{π}

. Following Theorem 1, we have the following z-Bochner’s formula for

G

.

Proposition 3.

For any smooth function

f \in C^{\infty} (G)

, one has

\begin{matrix} Γ_{2} (f, f) + Γ_{2}^{z, π} (f, f) & = & ∥ {Hess}_{a, z} {f ∥}^{2} + R (\nabla f, \nabla f), \end{matrix}

where

\begin{matrix} Λ_{1}^{T} & = & (0, β \partial_{x} f, \frac{β \partial_{y} f}{2}, β \partial_{x} f, 0, 0, \frac{β \partial_{y} f}{2}, 0, - β \partial_{θ} f); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, λ_{6}, 0, λ_{9}); \\ λ_{6} & = & \frac{\partial_{θ} g \partial_{y} f}{g} - \frac{β {(a^{T} \nabla)}_{2} f}{g^{2}} - \frac{\partial_{θ} g \partial_{y} f}{g}; \\ λ_{9} & = & \frac{{(a^{T} \nabla)}_{2} g \partial_{y} f}{g} + \frac{β \partial_{θ} f}{g^{2}} - \frac{{(a^{T} \nabla)}_{2} g \partial_{y} f}{g}; \end{matrix}

and

\begin{matrix} R_{a b} (\nabla f, \nabla f) - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & Γ_{1} (\log g, \log g) Γ_{1}^{z} (f, f) - β^{2} (1 + \frac{1}{g^{2}}) Γ_{1} (f, f) + \frac{β^{2}}{2 g^{2}} Γ_{1}^{z} (f, f) \\ + β^{2} e^{β θ} \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f + β e^{β θ} {(a^{T} \nabla)}_{2} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{1} f \\ + β e^{β θ} \frac{\partial V}{\partial x} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f \\ + \frac{\partial^{2} V}{\partial θ \partial θ} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (e^{β θ} \frac{\partial^{2} V}{\partial θ \partial x} + \frac{\partial^{2} V}{\partial θ \partial y}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2} - β e^{β θ} {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f; \\ R_{z b} (\nabla f, \nabla f) \\ = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{i^{'}} \partial x_{\hat{i}}} \partial_{y} f {(z^{T} \nabla)}_{1} f - \sum_{k = 1}^{2} {(a^{T} \nabla)}_{k} z_{13}^{T} {(a^{T} \nabla)}_{k} V \partial_{y} f {(z^{T} \nabla)}_{1} f \\ - g \frac{\partial^{2} V}{\partial θ \partial y} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f - g (e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}) {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f; \\ R_{π} (\nabla f, \nabla f) \\ = & - 2 \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l} = 1}^{3} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{l^{'}} \partial x_{\hat{l}}} \partial_{y} f {(z^{T} \nabla)}_{1} f \\ - 2 Γ_{1} (\log π, \log g) {| {(z^{T} \nabla)}_{1} f |}^{2} - 2 Γ_{1} (\log g, \log g) {| {(z^{T} \nabla)}_{1} f |}^{2} . \end{matrix}

In particular, we have

\begin{matrix} \sum_{\hat{i}, k^{'} = 1}^{3} a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2} \\ = & [e^{2 β θ} \frac{\partial^{2} V}{\partial x \partial x} + 2 e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}] {| {(a^{T} \nabla)}_{2} f |}^{2}; \end{matrix}

\begin{matrix} \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{i^{'}} \partial x_{\hat{i}}} \partial_{y} f {(z^{T} \nabla)}_{1} f \\ = & [\frac{\partial^{2} g}{\partial θ \partial θ} + e^{2 β θ} \frac{\partial^{2} g}{\partial x \partial x} + \frac{\partial^{2} g}{\partial y \partial y} + 2 e^{β θ} \frac{\partial^{2} g}{\partial x \partial y}] \frac{| {(z^{T} \nabla)}_{1} {f |}^{2}}{g} . \end{matrix}
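The two contraction identities above are plain index sums, so they can be checked numerically at a single point. The following is a minimal sketch, assuming the rows of a^{T} are (1, 0, 0) and (0, e^{βθ}, 1) in the coordinates (θ, x, y); this is read off from the formulas of this section rather than quoted from (26). The helper names are hypothetical, and a random symmetric array stands in for a Hessian.

```python
import math
import random

def a_rows(beta, theta):
    """Rows of a^T in coordinates (theta, x, y); an assumption inferred from this section."""
    e = math.exp(beta * theta)
    return [(1.0, 0.0, 0.0), (0.0, e, 1.0)]

def contract(hess, row):
    """Index sum: sum over p, q of row[p] * row[q] * hess[p][q]."""
    return sum(row[p] * row[q] * hess[p][q] for p in range(3) for q in range(3))

def closed_form_row2(hess, beta, theta):
    """Bracket of the first identity: e^{2 beta theta} H_xx + 2 e^{beta theta} H_xy + H_yy."""
    e = math.exp(beta * theta)
    return e * e * hess[1][1] + 2.0 * e * hess[1][2] + hess[2][2]

def random_symmetric(rng):
    """Random symmetric 3x3 array standing in for a Hessian evaluated at one point."""
    h = [[0.0] * 3 for _ in range(3)]
    for p in range(3):
        for q in range(p, 3):
            h[p][q] = h[q][p] = rng.uniform(-1.0, 1.0)
    return h

rng = random.Random(0)
beta, theta = 0.7, -0.3
hess = random_symmetric(rng)
a1, a2 = a_rows(beta, theta)
# First identity: only the second row of a^T is contracted.
gap_row2 = contract(hess, a2) - closed_form_row2(hess, beta, theta)
# Second identity: both rows are contracted, which adds the theta-theta entry.
gap_both = (contract(hess, a1) + contract(hess, a2)
            - (hess[0][0] + closed_form_row2(hess, beta, theta)))
print(abs(gap_row2), abs(gap_both))
```

Both printed gaps should be at the level of floating-point rounding; the check only confirms the index bookkeeping, not the derivation of the tensors themselves.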

The proof of Proposition 3 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 4–6 below. The following convergence result follows directly from Theorem 2.

Proposition 4.

If there exists

κ > 0

as shown in Theorem 2, the exponential dissipation result in the

L^{1}

distance holds:

\int | ρ (t, x) - π (x) | d x = O (e^{- κ t}) .

Similarly, we write the curvature tensor in matrix form, denoted by

R

. Using the fact

e^{β θ} \frac{\partial f}{\partial x} = {(a^{T} \nabla)}_{2} f + \frac{1}{g} {(z^{T} \nabla)}_{1} f

, we have the following representation.

Corollary 2.

The matrix

R

associated with

G

has the following representation:

\begin{matrix} R_{11} & = & \frac{\partial^{2} V}{\partial θ \partial θ} - β^{2} (1 + \frac{1}{g^{2}}); \\ R_{22} & = & [e^{2 β θ} \frac{\partial^{2} V}{\partial x \partial x} + 2 e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}] - \frac{β^{2}}{g^{2}} - β {(a^{T} \nabla)}_{1} V; \\ R_{33} & = & \frac{β^{2}}{2 g^{2}} - Γ_{1} (log g, log g) - 2 Γ_{1} (log π, log g) - Γ_{1} (log g, V) \\ - \frac{1}{g} [\frac{\partial^{2} g}{\partial θ \partial θ} + e^{2 β θ} \frac{\partial^{2} g}{\partial x \partial x} + \frac{\partial^{2} g}{\partial y \partial y} + 2 e^{β θ} \frac{\partial^{2} g}{\partial x \partial y}]; \\ R_{12} & = & R_{21} = \frac{1}{2} (β e^{β θ} \frac{\partial V}{\partial x} + 2 (e^{β θ} \frac{\partial^{2} V}{\partial θ \partial x} + \frac{\partial^{2} V}{\partial θ \partial y}) + β {(a^{T} \nabla)}_{2} V); \\ R_{13} & = & R_{31} = \frac{1}{2} (\frac{β}{g} {(a^{T} \nabla)}_{2} V - g \frac{\partial^{2} V}{\partial θ \partial y}); \\ R_{23} & = & R_{32} = - \frac{1}{2} (\frac{β}{g} {(a^{T} \nabla)}_{1} V - \frac{β^{2}}{g}) - \frac{1}{2} g (e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}) . \end{matrix}

Proof.

The derivation for the explicit form of matrix

R

follows from a similar equivalent representation as shown in the proof of Corollary 1 and the explicit bilinear terms derived in Proposition 3. □
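The passage from the bilinear tensors of Proposition 3 to the matrix of Corollary 2 can be sanity-checked numerically. The sketch below is a hedged illustration, not the authors' computation. It treats every derivative value as an independent random number and assumes (a^{T} \nabla)_{1} f = \partial_{θ} f, (a^{T} \nabla)_{2} f = e^{β θ} \partial_{x} f + \partial_{y} f and (z^{T} \nabla)_{1} f = - g \partial_{y} f. It also reads the first bracket of R_{23} as (β/g) (a^{T} \nabla)_{1} V - β^{2}/g, which is an assumption about the intended term. All function names are hypothetical.

```python
import math
import random

def gamma1(u, v, e):
    """Gamma_1(u, v) for gradients u, v ordered (theta, x, y); e = exp(beta * theta)."""
    return u[0] * v[0] + (e * u[1] + u[2]) * (e * v[1] + v[2])

def trace_a(h, e):
    """Contraction of a symmetric 3x3 array h with both rows of a^T."""
    return h[0][0] + e * e * h[1][1] + 2.0 * e * h[1][2] + h[2][2]

def proposition3_sum(beta, theta, g, df, dV, dg, dlogpi, hV, hg):
    """Sum of the three tensor expressions stated in Proposition 3."""
    e = math.exp(beta * theta)
    a1f, a2f, z1f = df[0], e * df[1] + df[2], -g * df[2]
    a1V, a2V = dV[0], e * dV[1] + dV[2]
    dlogg = [c / g for c in dg]
    hV22 = e * e * hV[1][1] + 2.0 * e * hV[1][2] + hV[2][2]
    first = (gamma1(dlogg, dlogg, e) * z1f ** 2
             - beta ** 2 * (1.0 + 1.0 / g ** 2) * (a1f ** 2 + a2f ** 2)
             + beta ** 2 / (2.0 * g ** 2) * z1f ** 2
             + beta ** 2 * e * df[1] * a2f + beta * e * a2V * df[1] * a1f
             + beta * e * dV[1] * a2f * a1f
             + hV[0][0] * a1f ** 2 + 2.0 * (e * hV[0][1] + hV[0][2]) * a1f * a2f
             + hV22 * a2f ** 2 - beta * e * a1V * df[1] * a2f)
    d2z13 = -trace_a(hg, e)  # z_13^T = -g, so its second-order contraction is minus that of g
    r_zb = (d2z13 * df[2] * z1f
            + gamma1(dg, dV, e) * df[2] * z1f  # -sum_k a_k(z_13) a_k(V) equals +Gamma_1(g, V)
            - g * hV[0][2] * a1f * z1f
            - g * (e * hV[1][2] + hV[2][2]) * a2f * z1f)
    r_pi = (-2.0 * d2z13 * df[2] * z1f
            - 2.0 * gamma1(dlogpi, dlogg, e) * z1f ** 2
            - 2.0 * gamma1(dlogg, dlogg, e) * z1f ** 2)
    return first + r_zb + r_pi

def corollary2_quadratic(beta, theta, g, df, dV, dg, dlogpi, hV, hg):
    """Quadratic form v^T R v with v = ((a^T grad)_1 f, (a^T grad)_2 f, (z^T grad)_1 f)."""
    e = math.exp(beta * theta)
    v = (df[0], e * df[1] + df[2], -g * df[2])
    a1V, a2V = dV[0], e * dV[1] + dV[2]
    dlogg = [c / g for c in dg]
    hV22 = e * e * hV[1][1] + 2.0 * e * hV[1][2] + hV[2][2]
    r11 = hV[0][0] - beta ** 2 * (1.0 + 1.0 / g ** 2)
    r22 = hV22 - beta ** 2 / g ** 2 - beta * a1V
    r33 = (beta ** 2 / (2.0 * g ** 2) - gamma1(dlogg, dlogg, e)
           - 2.0 * gamma1(dlogpi, dlogg, e) - gamma1(dlogg, dV, e) - trace_a(hg, e) / g)
    r12 = 0.5 * (beta * e * dV[1] + 2.0 * (e * hV[0][1] + hV[0][2]) + beta * a2V)
    r13 = 0.5 * (beta / g * a2V - g * hV[0][2])
    r23 = -0.5 * (beta / g * a1V - beta ** 2 / g) - 0.5 * g * (e * hV[1][2] + hV[2][2])
    return (r11 * v[0] ** 2 + r22 * v[1] ** 2 + r33 * v[2] ** 2
            + 2.0 * (r12 * v[0] * v[1] + r13 * v[0] * v[2] + r23 * v[1] * v[2]))

def sample(rng):
    """Random derivative values at one point; g is kept away from zero."""
    def vec():
        return [rng.uniform(-1.0, 1.0) for _ in range(3)]
    def sym():
        h = [[0.0] * 3 for _ in range(3)]
        for p in range(3):
            for q in range(p, 3):
                h[p][q] = h[q][p] = rng.uniform(-1.0, 1.0)
        return h
    return dict(beta=rng.uniform(-1.0, 1.0), theta=rng.uniform(-1.0, 1.0),
                g=rng.uniform(0.5, 2.0), df=vec(), dV=vec(), dg=vec(),
                dlogpi=vec(), hV=sym(), hg=sym())

args = sample(random.Random(0))
print(abs(proposition3_sum(**args) - corollary2_quadratic(**args)))
```

The only substitution used to move between the two forms is e^{β θ} \partial_{x} f = (a^{T} \nabla)_{2} f + \frac{1}{g} (z^{T} \nabla)_{1} f, so agreement up to rounding supports the coefficient bookkeeping under the stated assumptions.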

Remark 9.

By taking

g (θ, x, y) = β

as a constant, Proposition 3 reduces to a simpler version; in particular, the tensors reduce to

\begin{matrix} R_{a b} (\nabla f, \nabla f) - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & - (1 + β^{2}) Γ_{1} (f, f) + \frac{1}{2} Γ_{1}^{z} (f, f) \\ + β^{2} e^{β θ} \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f + β e^{β θ} {(a^{T} \nabla)}_{2} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{1} f + β e^{β θ} \frac{\partial V}{\partial x} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f \\ + \frac{\partial^{2} V}{\partial θ \partial θ} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (e^{β θ} \frac{\partial^{2} V}{\partial θ \partial x} + \frac{\partial^{2} V}{\partial θ \partial y}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2} - β e^{β θ} {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f; \end{matrix}

\begin{matrix} R_{z b} (\nabla f, \nabla f) & = & - β \frac{\partial^{2} V}{\partial θ \partial y} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f \\ - β (e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}) {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f; \\ R_{π} (\nabla f, \nabla f) & = & 0 . \end{matrix}

Next, we present the following three key lemmas.

Lemma 4.

For the displacement group

G

, we have

\begin{matrix} Q & = & (\begin{matrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & e^{β θ} & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & e^{β θ} & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & e^{2 β θ} & e^{β θ} & 0 & e^{β θ} & 1 \end{matrix}); \\ P & = & (\begin{matrix} 0 & 0 & 0 & 0 & 0 & 0 & - g (θ, x, y) & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & - g (θ, x, y) e^{β θ} & - g (θ, x, y) \end{matrix}); \\ D^{T} & = & (0, β e^{β θ} \partial_{x} f, 0, 0), E^{T} = (- \partial_{y} f \partial_{θ} g, - \partial_{y} f \partial_{y} g - e^{β θ} \partial_{y} f \partial_{x} g); \\ C^{T} & = & (0, β e^{β θ} \partial_{y} f + β e^{2 β θ} \partial_{x} f, 0, 0, - β e^{2 β θ} \partial_{θ} f, - β e^{β θ} \partial_{θ} f, 0, 0, 0) . \end{matrix}

F = (\begin{matrix} 0 \\ 0 \\ g \partial_{θ} g \partial_{y} f \\ 0 \\ 0 \\ e^{β θ} g \partial_{y} f \partial_{y} g + e^{2 β θ} g \partial_{y} f \partial_{x} g \\ 0 \\ 0 \\ g \partial_{y} f \partial_{y} g + e^{β θ} g \partial_{y} f \partial_{x} g \end{matrix}), G = (\begin{matrix} 0 \\ 0 \\ - 2 g \partial_{y} f \partial_{θ} g \\ 0 \\ 0 \\ - 2 e^{β θ} g \partial_{y} f \partial_{y} g - 2 e^{2 β θ} g \partial_{y} f \partial_{x} g \\ 0 \\ 0 \\ - 2 g \partial_{y} f \partial_{y} g - 2 e^{β θ} g \partial_{y} f \partial_{x} g \end{matrix})

.

Proof.

The proof follows by plugging matrices a and z from (26) into Notation 1. □
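As an observation on the displayed entries (Notation 1 is not restated here, so this is an assumption rather than its definition), the rows of Q and P in Lemma 4 coincide with Kronecker-type products of the rows of a^{T} and z^{T}. The sketch below assumes a^{T} has rows (1, 0, 0) and (0, e^{β θ}, 1), z^{T} = (0, 0, -g), and that the Hessian vector X is ordered row by row over (θ, x, y). The function names are hypothetical.

```python
import math

def kron_row(u, v):
    """Flattened outer product: entry 3*j + k equals u[j] * v[k]."""
    return [uj * vk for uj in u for vk in v]

def lemma4_from_rows(beta, theta, g):
    """Build Q (4 x 9) and P (2 x 9) from the assumed rows of a^T and z^T."""
    e = math.exp(beta * theta)
    a1, a2 = [1.0, 0.0, 0.0], [0.0, e, 1.0]
    z1 = [0.0, 0.0, -g]
    Q = [kron_row(a1, a1), kron_row(a1, a2), kron_row(a2, a1), kron_row(a2, a2)]
    P = [kron_row(z1, a1), kron_row(z1, a2)]
    return Q, P

def lemma4_as_displayed(beta, theta, g):
    """Q and P typed entry by entry from the statement of Lemma 4."""
    e = math.exp(beta * theta)
    Q = [[1, 0, 0, 0, 0, 0, 0, 0, 0],
         [0, e, 1, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, e, 0, 0, 1, 0, 0],
         [0, 0, 0, 0, e * e, e, 0, e, 1]]
    P = [[0, 0, 0, 0, 0, 0, -g, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, -g * e, -g]]
    return Q, P

def max_gap(m1, m2):
    """Largest absolute entrywise difference between two equally shaped nested lists."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

Q1, P1 = lemma4_from_rows(0.5, 0.2, 1.3)
Q2, P2 = lemma4_as_displayed(0.5, 0.2, 1.3)
print(max_gap(Q1, Q2), max_gap(P1, P2))
```

With this ordering, Q X collects the second-order terms (a_{i}^{T} \nabla^{2} f a_{k}) and P X the mixed terms involving z^{T}, which matches the squares expanded in the proof of Lemma 5.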

Lemma 5.

On the displacement group

G

, we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 [C^{T} + F^{T} + G^{T}] X \\ = & ∥ {Hess}_{a, z} {f ∥}^{2} - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E . \end{matrix}

In particular, we have

\begin{matrix} ∥ {Hess}_{a, z} {f ∥}^{2} & = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}]; \\ Λ_{1}^{T} & = & (0, β \partial_{x} f, \frac{β \partial_{y} f}{2}, β \partial_{x} f, 0, 0, \frac{β \partial_{y} f}{2}, 0, - β \partial_{θ} f); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, λ_{6}, 0, λ_{9}); \\ λ_{6} & = & \frac{\partial_{θ} g \partial_{y} f}{g} - \frac{β {(a^{T} \nabla)}_{2} f}{g^{2}} - \frac{\partial_{θ} g \partial_{y} f}{g}; \end{matrix}

\begin{matrix} λ_{9} & = & \frac{{(a^{T} \nabla)}_{2} g \partial_{y} f}{g} + \frac{β \partial_{θ} f}{g^{2}} - \frac{{(a^{T} \nabla)}_{2} g \partial_{y} f}{g}; \\ - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & Γ_{1} (log g, log g) Γ_{1}^{z} (f, f) - β^{2} (1 + \frac{1}{g^{2}}) Γ_{1} (f, f) + \frac{β^{2}}{2 g^{2}} Γ_{1}^{z} (f, f) . \end{matrix}

Lemma 6.

By routine computations, we obtain

\begin{matrix} R_{a b} (\nabla f, \nabla f) \\ = & β^{2} e^{β θ} \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f + β e^{β θ} {(a^{T} \nabla)}_{2} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{1} f + β e^{β θ} \frac{\partial V}{\partial x} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f \\ + \frac{\partial^{2} V}{\partial θ \partial θ} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (e^{β θ} \frac{\partial^{2} V}{\partial θ \partial x} + \frac{\partial^{2} V}{\partial θ \partial y}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2} - β e^{β θ} {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f; \\ R_{z b} (\nabla f, \nabla f) \\ = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{i^{'}} \partial x_{\hat{i}}} \partial_{y} f {(z^{T} \nabla)}_{1} f - \sum_{k = 1}^{2} {(a^{T} \nabla)}_{k} z_{13}^{T} {(a^{T} \nabla)}_{k} V \partial_{y} f {(z^{T} \nabla)}_{1} f \\ - g \frac{\partial^{2} V}{\partial θ \partial y} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f - g (e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}) {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f; \\ R_{π} (\nabla f, \nabla f) \\ = & - 2 \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l} = 1}^{3} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{l^{'}} \partial x_{\hat{l}}} \partial_{y} f {(z^{T} \nabla)}_{1} f - 2 \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l} = 1}^{3} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial z_{13}^{T}}{\partial x_{\hat{l}}} \frac{\partial z_{13}^{T}}{\partial x_{l^{'}}} {| \partial_{y} f |}^{2} \\ - 2 \sum_{l = 1}^{2} \sum_{\hat{l} = 1}^{3} {(a^{T} \nabla)}_{l} log π a_{l \hat{l}}^{T} \frac{\partial z_{13}^{T}}{\partial x_{\hat{l}}} \partial_{y} f {(z^{T} \nabla)}_{1} f . \end{matrix}

Proof of Lemma 5.

By Lemma 4, together with the facts that

G = - 2 F

and

{(a^{T} \nabla)}_{2} f = e^{β θ} \partial_{x} f + \partial_{y} f

, we first have

\begin{matrix} 2 C^{T} X & = & 2 [β e^{β θ} \partial_{y} f + β e^{2 β θ} \partial_{x} f] \frac{\partial^{2} f}{\partial θ \partial x} + 2 [- β e^{2 β θ} \partial_{θ} f] \frac{\partial^{2} f}{\partial x \partial x} + 2 [- β e^{β θ} \partial_{θ} f] \frac{\partial^{2} f}{\partial x \partial y}; \\ 2 [F^{T} + G^{T}] X & = & - 2 (g \partial_{θ} g \partial_{y} f \frac{\partial^{2} f}{\partial θ \partial y} + e^{β θ} g {(a^{T} \nabla)}_{2} g \partial_{y} f \frac{\partial^{2} f}{\partial x \partial y} + g {(a^{T} \nabla)}_{2} g \partial_{y} f \frac{\partial^{2} f}{\partial y \partial y}) . \end{matrix}

By direct computations, we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X + 2 F^{T} X + 2 G^{T} X \\ = & {[\frac{\partial^{2} f}{\partial θ \partial θ}]}^{2} + {[e^{2 β θ} \frac{\partial^{2} f}{\partial x \partial x} + 2 e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + \frac{\partial^{2} f}{\partial y \partial y}]}^{2} + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β e^{β θ} \frac{\partial f}{\partial x}]}^{2} \\ + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y}]}^{2} + {[- g \frac{\partial^{2} f}{\partial θ \partial y} - \partial_{y} f \partial_{θ} g]}^{2} \\ + {[- g e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} - g \frac{\partial^{2} f}{\partial y \partial y} - {(a^{T} \nabla)}_{2} g \partial_{y} f]}^{2} \\ + 2 [β e^{β θ} \partial_{y} f + β e^{2 β θ} \partial_{x} f] \frac{\partial^{2} f}{\partial θ \partial x} + 2 [- β e^{2 β θ} \partial_{θ} f] \frac{\partial^{2} f}{\partial x \partial x} + 2 [- β e^{β θ} \partial_{θ} f] \frac{\partial^{2} f}{\partial x \partial y} \\ - 2 (g \partial_{θ} g \partial_{y} f \frac{\partial^{2} f}{\partial θ \partial y} + e^{β θ} g {(a^{T} \nabla)}_{2} g \partial_{y} f \frac{\partial^{2} f}{\partial x \partial y} + g {(a^{T} \nabla)}_{2} g \partial_{y} f \frac{\partial^{2} f}{\partial y \partial y}) \\ = & {[\frac{\partial^{2} f}{\partial θ \partial θ}]}^{2} + {[e^{2 β θ} \frac{\partial^{2} f}{\partial x \partial x} + 2 e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + \frac{\partial^{2} f}{\partial y \partial y}]}^{2} + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β e^{β θ} \frac{\partial f}{\partial x}]}^{2} \end{matrix}

\begin{matrix} + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y}]}^{2} + {[- g \frac{\partial^{2} f}{\partial θ \partial y} - \partial_{y} f \partial_{θ} g]}^{2} \\ + {[- g e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} - g \frac{\partial^{2} f}{\partial y \partial y} - {(a^{T} \nabla)}_{2} g \partial_{y} f]}^{2} \\ + 2 β {(a^{T} \nabla)}_{2} f [e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y}] - 2 β {(a^{T} \nabla)}_{2} f \frac{\partial^{2} f}{\partial θ \partial y} - 2 g \partial_{θ} g \partial_{y} f \frac{\partial^{2} f}{\partial θ \partial y} \\ - 2 β \partial_{θ} f [2 e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + e^{2 β θ} \frac{\partial^{2} f}{\partial x \partial x} + \frac{\partial^{2} f}{\partial y \partial y}] + 2 β \partial_{θ} f [e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + \frac{\partial^{2} f}{\partial y \partial y}] \\ - 2 g {(a^{T} \nabla)}_{2} g \partial_{y} f [e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + \frac{\partial^{2} f}{\partial y \partial y}] . \end{matrix}

Completing the squares for the above terms, we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X + 2 F^{T} X + 2 G^{T} X \\ = & {[\frac{\partial^{2} f}{\partial θ \partial θ}]}^{2} + {[e^{2 β θ} \frac{\partial^{2} f}{\partial x \partial x} + 2 e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + \frac{\partial^{2} f}{\partial y \partial y} - β \partial_{θ} f]}^{2} - β^{2} {| \partial_{θ} f |}^{2} \\ + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β e^{β θ} \frac{\partial f}{\partial x}]}^{2} + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β {(a^{T} \nabla)}_{2} f]}^{2} - β^{2} {| {(a^{T} \nabla)}_{2} f |}^{2} \\ + {[g \frac{\partial^{2} f}{\partial θ \partial y} + \partial_{θ} g \partial_{y} f - \frac{β {(a^{T} \nabla)}_{2} f}{g} - \partial_{θ} g \partial_{y} f]}^{2} - {[\frac{β {(a^{T} \nabla)}_{2} f}{g} + \partial_{θ} g \partial_{y} f]}^{2} \\ + {[g e^{β θ} \frac{\partial^{2} f}{\partial x \partial y} + g \frac{\partial^{2} f}{\partial y \partial y} + {(a^{T} \nabla)}_{2} g \partial_{y} f + \frac{β \partial_{θ} f}{g} - {(a^{T} \nabla)}_{2} g \partial_{y} f]}^{2} \\ - {[\frac{β \partial_{θ} f}{g} - {(a^{T} \nabla)}_{2} g \partial_{y} f]}^{2} + 2 [\frac{β {(a^{T} \nabla)}_{2} f}{g} + \partial_{θ} g \partial_{y} f] \partial_{θ} g \partial_{y} f \\ - 2 \partial_{y} f {(a^{T} \nabla)}_{2} g \times [\frac{β \partial_{θ} f}{g} - {(a^{T} \nabla)}_{2} g \partial_{y} f] . \end{matrix}

The first-order terms generate

- Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E

, and the sum of squares terms generate vectors

Λ_{1}

and

Λ_{2}

. We further rewrite the two squared terms involving the mixed θ-derivatives as follows:

\begin{matrix} {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β e^{β θ} \frac{\partial f}{\partial x}]}^{2} + {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β {(a^{T} \nabla)}_{2} f]}^{2} \\ = & 2 {[e^{β θ} \frac{\partial^{2} f}{\partial θ \partial x} + \frac{\partial^{2} f}{\partial θ \partial y} + β e^{β θ} \frac{\partial f}{\partial x} + \frac{β}{2} \partial_{y} f]}^{2} + \frac{β^{2}}{2} {| \partial_{y} f |}^{2} . \end{matrix}
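The identity above is purely algebraic once the mixed second derivatives are grouped into a single quantity, so it can be checked with random numbers. This is a minimal sketch, assuming (a^{T} \nabla)_{2} f = e^{β θ} \partial_{x} f + \partial_{y} f as used in the proof; the variable `A` is a hypothetical shorthand for e^{β θ} \partial^{2}_{θ x} f + \partial^{2}_{θ y} f.

```python
import math
import random

def two_squares(A, fx, fy, beta, theta):
    """Left-hand side: [A + beta e f_x]^2 + [A + beta (a^T grad)_2 f]^2."""
    e = math.exp(beta * theta)
    a2f = e * fx + fy
    return (A + beta * e * fx) ** 2 + (A + beta * a2f) ** 2

def merged_square(A, fx, fy, beta, theta):
    """Right-hand side: 2 [A + beta e f_x + (beta/2) f_y]^2 + (beta^2 / 2) f_y^2."""
    e = math.exp(beta * theta)
    return 2.0 * (A + beta * e * fx + 0.5 * beta * fy) ** 2 + 0.5 * beta ** 2 * fy ** 2

rng = random.Random(0)
worst = 0.0
for _ in range(1000):
    A, fx, fy, beta, theta = (rng.uniform(-2.0, 2.0) for _ in range(5))
    worst = max(worst, abs(two_squares(A, fx, fy, beta, theta)
                           - merged_square(A, fx, fy, beta, theta)))
print(worst)
```

Writing u = A + β e^{β θ} \partial_{x} f, the identity is u^{2} + (u + β \partial_{y} f)^{2} = 2 (u + \frac{β}{2} \partial_{y} f)^{2} + \frac{β^{2}}{2} {| \partial_{y} f |}^{2}, which is what the two functions compare.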

Adding

\frac{β^{2}}{2} {| \partial_{y} f |}^{2}

to the term

- Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E

, we expand further as follows:

\begin{matrix} - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & - β^{2} [| \partial_{θ} {f |}^{2} + | {(a^{T} \nabla)}_{2} f |^{2}] - {[\frac{β \partial_{θ} f}{g} - {(a^{T} \nabla)}_{2} g \partial_{y} f]}^{2} \\ - {[\frac{β {(a^{T} \nabla)}_{2} f}{g} + \partial_{θ} g \partial_{y} f]}^{2} + 2 [\frac{β {(a^{T} \nabla)}_{2} f}{g} \\ + \partial_{θ} g \partial_{y} f] \partial_{θ} g \partial_{y} f - 2 \partial_{y} f {(a^{T} \nabla)}_{2} g \times [\frac{β \partial_{θ} f}{g} - {(a^{T} \nabla)}_{2} g \partial_{y} f] + \frac{β^{2}}{2} {| \partial_{y} f |}^{2} \end{matrix}

\begin{matrix} = & - β^{2} Γ_{1} (f, f) - \frac{β^{2}}{g^{2}} | {(a^{T} \nabla)}_{1} {f |}^{2} - | {(a^{T} \nabla)}_{2} (log g) |^{2} {| {(z^{T} \nabla)}_{1} f |}^{2} \\ - 2 \frac{β}{g} {(a^{T} \nabla)}_{2} log g {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f \\ - \frac{β^{2}}{g^{2}} | {(a^{T} \nabla)}_{2} {f |}^{2} - | {(a^{T} \nabla)}_{1} {log g |}^{2} {| {(z^{T} \nabla)}_{1} f |}^{2} \\ + 2 \frac{β}{g} {(a^{T} \nabla)}_{1} log g {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f \\ - 2 \frac{β}{g} {(a^{T} \nabla)}_{1} log g {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f + 2 | {(a^{T} \nabla)}_{1} {log g |}^{2} {| {(z^{T} \nabla)}_{1} f |}^{2} \\ + 2 \frac{β}{g} {(a^{T} \nabla)}_{2} log g {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + 2 | {(a^{T} \nabla)}_{2} {log g |}^{2} {| {(z^{T} \nabla)}_{1} f |}^{2} + \frac{β^{2}}{2 g^{2}} Γ_{1}^{z} (f, f) . \end{matrix}

By grouping the bilinear terms of

\nabla f

, we obtain

\begin{matrix} - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & Γ_{1} (log g, log g) Γ_{1}^{z} (f, f) - β^{2} (1 + \frac{1}{g^{2}}) Γ_{1} (f, f) + \frac{β^{2}}{2 g^{2}} Γ_{1}^{z} (f, f) . \end{matrix}

□

It remains to compute the three tensor terms.

Proof of Lemma 6.

For the displacement group

G

, we have

n = 2

and

m = 1

. Recalling Theorem 1, we write

R_{a b} (\nabla f, \nabla f) = R_{a} (\nabla f, \nabla f) + R_{b} (\nabla f, \nabla f)

, where

R_{b} (\nabla f, \nabla f)

represents the tensor term involving drift b. We thus have

\begin{matrix} R_{a} (\nabla f, \nabla f) & = & \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ + \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ = & I_{1} + I_{2} + I_{3} + I_{4} . \end{matrix}

By direct computations, we have

\begin{matrix} I_{1} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f \\ + a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f] = 0; \end{matrix}

\begin{matrix} I_{2} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f \\ + a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f] \\ = & a_{11}^{T} a_{11}^{T} \frac{\partial^{2}}{\partial θ \partial θ} a_{22}^{T} \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f = β^{2} e^{β θ} \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f; \\ I_{3} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{1 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f \\ + a_{2 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] = 0; \\ I_{4} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{1 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f \\ + a_{2 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] = 0 . \end{matrix}

For the drift term in tensor

R_{a b}

, taking

b = - a a^{T} \nabla V

, we obtain

\begin{matrix} R_{b}^{a} & = & \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ + \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ + \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ - \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{i} f] \\ = & J_{1} + J_{2} + J_{3} + J_{4} . \end{matrix}

Substituting the matrix

a^{T}

, we obtain

\begin{matrix} J_{1} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & β e^{β θ} {(a^{T} \nabla)}_{2} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{1} f; \end{matrix}

\begin{matrix} J_{2} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & β e^{β θ} \frac{\partial V}{\partial x} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f; \end{matrix}

\begin{matrix} J_{3} & = & \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + a_{2 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2}] \\ = & \frac{\partial^{2} V}{\partial θ \partial θ} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (e^{β θ} \frac{\partial^{2} V}{\partial θ \partial x} + \frac{\partial^{2} V}{\partial θ \partial y}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2}; \\ J_{4} & = & - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \\ - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \\ = & - β e^{β θ} {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial x} {(a^{T} \nabla)}_{2} f . \end{matrix}
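The reductions of J_{1}, J_{2}, J_{3}, and J_{4} to closed forms amount to observing that the only non-constant entry of a^{T} is a_{22}^{T} = e^{β θ}, whose sole non-zero first derivative is in θ. The sketch below evaluates the raw index sums and the closed forms at one point and compares them. It assumes a^{T} has rows (1, 0, 0) and (0, e^{β θ}, 1) in coordinates (θ, x, y); indices are zero-based in the code, and all names are hypothetical.

```python
import math
import random

def drift_terms(beta, theta, df, dV, hV):
    """Raw index sums J_1..J_4 and their closed forms, for gradients df, dV and symmetric hV."""
    e = math.exp(beta * theta)
    a = [[1.0, 0.0, 0.0], [0.0, e, 1.0]]
    # da[k][c][d] = derivative of a[k][c] in coordinate d; only a[1][1] depends on theta.
    da = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
    da[1][1][0] = beta * e
    af = [sum(a[i][c] * df[c] for c in range(3)) for i in range(2)]
    aV = [sum(a[i][c] * dV[c] for c in range(3)) for i in range(2)]
    R2, R3 = range(2), range(3)
    J1 = sum(a[i][d] * da[k][c][d] * a[k][q] * dV[q] * df[c] * af[i]
             for i in R2 for k in R2 for d in R3 for c in R3 for q in R3)
    J2 = sum(a[i][d] * da[k][q][d] * a[k][c] * dV[q] * df[c] * af[i]
             for i in R2 for k in R2 for d in R3 for c in R3 for q in R3)
    J3 = sum(a[i][d] * a[k][c] * a[k][q] * hV[d][q] * df[c] * af[i]
             for i in R2 for k in R2 for d in R3 for c in R3 for q in R3)
    J4 = -sum(a[k][c] * a[k][q] * da[i][d][c] * dV[q] * df[d] * af[i]
              for i in R2 for k in R2 for d in R3 for c in R3 for q in R3)
    closed = [
        beta * e * aV[1] * df[1] * af[0],
        beta * e * dV[1] * af[1] * af[0],
        hV[0][0] * af[0] ** 2
        + 2.0 * (e * hV[0][1] + hV[0][2]) * af[0] * af[1]
        + (e * e * hV[1][1] + 2.0 * e * hV[1][2] + hV[2][2]) * af[1] ** 2,
        -beta * e * aV[0] * df[1] * af[1],
    ]
    return [J1, J2, J3, J4], closed

def random_inputs(rng):
    """Random gradients and a random symmetric Hessian stand-in at one point."""
    df = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    dV = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    hV = [[0.0] * 3 for _ in range(3)]
    for p in range(3):
        for q in range(p, 3):
            hV[p][q] = hV[q][p] = rng.uniform(-1.0, 1.0)
    return df, dV, hV

df, dV, hV = random_inputs(random.Random(0))
raw, closed = drift_terms(0.6, -0.4, df, dV, hV)
print([abs(r - c) for r, c in zip(raw, closed)])
```

Summing the four closed forms gives the drift part of R_{a b}; adding I_{2} = β^{2} e^{β θ} \partial_{x} f (a^{T} \nabla)_{2} f recovers the expression stated in Lemma 6.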

Combining the above computations, we obtain the tensor

R_{a b}

. Now, we turn to the second tensor

R_{z b}

, which has the following form:

\begin{matrix} R_{z b} (\nabla f, \nabla f) & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial z_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{1} f 〉_{R} \\ + \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial z_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{1} f 〉_{R} \\ - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 z_{1 \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{1} f 〉_{R} \\ - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 z_{1 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(z^{T} \nabla)}_{1} f 〉_{R} \\ - \sum_{\hat{i}, \hat{k} = 1}^{3} 〈 (z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla f)}_{1} 〉_{R} \\ = & I_{1}^{z} + I_{2}^{z} + I_{3}^{z} + I_{4}^{z} + R_{b}^{z} (\nabla f, \nabla f) , \end{matrix}

where we denote further that

\begin{matrix} R_{b}^{z} (\nabla f, \nabla f) = - \sum_{\hat{i}, \hat{k} = 1}^{3} (z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}) {(z^{T} \nabla f)}_{1} . \end{matrix}

By taking

b = - a a^{T} \nabla V

, we further obtain that

\begin{matrix} R_{b}^{z} (\nabla f, \nabla f) & = & - \sum_{\hat{i}, \hat{k} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla f)}_{1} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla f)}_{1}] \\ = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \end{matrix}

\begin{matrix} + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] \\ = & J_{1}^{z} + J_{2}^{z} + J_{3}^{z} + J_{4}^{z} . \end{matrix}

Direct computation shows that

\begin{matrix} I_{1}^{z} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial z_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{1} f 〉_{R} = 0; \\ I_{2}^{z} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial z_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{1} f 〉_{R} \\ = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i} = 1}^{3} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{i^{'}} \partial x_{\hat{i}}} \partial_{y} f {(z^{T} \nabla)}_{1} f; \\ I_{3}^{z} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 z_{1 \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{1} f 〉_{R} = 0; \\ I_{4}^{z} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 z_{1 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(z^{T} \nabla)}_{1} f 〉_{R} = 0, \end{matrix}

and

\begin{matrix} J_{1}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{2}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{4}^{z} & = & - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] \\ = & - \sum_{k = 1}^{2} {(a^{T} \nabla)}_{k} z_{13}^{T} {(a^{T} \nabla)}_{k} V \partial_{y} f {(z^{T} \nabla)}_{1} f; \\ J_{3}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f + z_{1 \hat{i}}^{T} a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ = & \sum_{\hat{i}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + z_{1 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f] \\ = & - g \frac{\partial^{2} V}{\partial θ \partial y} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f - g (e^{β θ} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y}) {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f . \end{matrix}

Now, we are left to compute the term

R_{π}

. Recall that

\begin{matrix} R_{π} (\nabla f, \nabla f) \\ = & 2 \sum_{k = 1}^{1} \sum_{i = 1}^{2} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{3} [\frac{\partial}{\partial x_{k^{'}}} z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \\ + 2 \sum_{k = 1}^{1} \sum_{i = 1}^{2} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{3} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2}}{\partial x_{k^{'}} \partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} \frac{\partial}{\partial x_{k^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}]
\\ + 2 \sum_{k = 1}^{1} \sum_{i = 1}^{2} \sum_{\hat{k}, \hat{i}, i^{'} = 1}^{3} {(z^{T} \nabla log π)}_{k} [z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \\ - 2 \sum_{j = 1}^{1} \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{3} [\frac{\partial}{\partial x_{l^{'}}} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \\ - 2 \sum_{j = 1}^{1} \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{3} [a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2}}{\partial x_{l^{'}} \partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} \frac{\partial}{\partial x_{l^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \\ - 2 \sum_{j = 1}^{1} \sum_{l = 1}^{2} \sum_{\hat{l}, \hat{j}, j^{'} = 1}^{3} {(a^{T} \nabla log π)}_{l} [a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \\ = & \sum_{i = 1}^{10} K_{i} . \end{matrix}

By direct computation, we obtain

\begin{matrix} K_{1} & = & 0, K_{2} = 0, K_{3} = 0, K_{4} = 0, K_{5} = 0, K_{6} = 0, K_{7} = 0; \\ K_{8} & = & - 2 \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l} = 1}^{3} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2} z_{13}^{T}}{\partial x_{l^{'}} \partial x_{\hat{l}}} \partial_{y} f {(z^{T} \nabla)}_{1} f; \\ K_{9} & = & - 2 \sum_{l = 1}^{2} \sum_{l^{'}, \hat{l} = 1}^{3} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial z_{13}^{T}}{\partial x_{\hat{l}}} \frac{\partial z_{13}^{T}}{\partial x_{l^{'}}} {| \partial_{y} f |}^{2} = - 2 Γ_{1} (log g, log g) {| {(z^{T} \nabla)}_{1} f |}^{2}; \\ K_{10} & = & - 2 \sum_{l = 1}^{2} \sum_{\hat{l} = 1}^{3} {(a^{T} \nabla)}_{l} log π a_{l \hat{l}}^{T} \frac{\partial z_{13}^{T}}{\partial x_{\hat{l}}} \partial_{y} f {(z^{T} \nabla)}_{1} f = - 2 Γ_{1} (log π, log g) {| {(z^{T} \nabla)}_{1} f |}^{2} . \end{matrix}

□

3.3. Martinet Flat Sub-Riemannian Structure

In this part, we apply our result to the Martinet flat sub-Riemannian structure, which satisfies the bracket-generating condition but is not equiregular (see [37]). The structure is defined on

R^{3}

through the kernel of the one-form

η : = d z - \frac{1}{2} y^{2} d x .

A global orthonormal basis for the horizontal distribution

H

admits the following differential operator representation in local coordinates

(x, y, z)

:

X = \frac{\partial}{\partial x} + \frac{y^{2}}{2} \frac{\partial}{\partial z}, Y = \frac{\partial}{\partial y} .

The commutation relations are

\begin{matrix} [X, Y] = - y Z, [Y, [X, Y]] = - Z, where Z = \frac{\partial}{\partial z} . \end{matrix}
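
The bracket relations can be checked mechanically. The following Python sketch evaluates the coordinate formula for the Lie bracket of two vector fields at rational points. The helper names (partial, bracket) are ours, and the central differences are exact only because the coefficients of X and Y are polynomials of degree at most two; this is an assumption of the sketch, not a general-purpose method.

```python
from fractions import Fraction as Fr

def X(p):
    x, y, z = p
    return (Fr(1), Fr(0), y * y / 2)   # X = d/dx + (y^2 / 2) d/dz

def Y(p):
    return (Fr(0), Fr(1), Fr(0))       # Y = d/dy

def partial(field, p, j, h=Fr(1)):
    """Central difference of a vector field along coordinate j.
    Exact here because all coefficients are polynomials of degree <= 2."""
    up, dn = list(p), list(p)
    up[j] += h
    dn[j] -= h
    return tuple((u - d) / (2 * h) for u, d in zip(field(up), field(dn)))

def bracket(U, V):
    """Lie bracket [U, V]^i = sum_j (U^j d_j V^i - V^j d_j U^i)."""
    def W(p):
        u, v = U(p), V(p)
        dU = [partial(U, p, j) for j in range(3)]
        dV = [partial(V, p, j) for j in range(3)]
        return tuple(sum(u[j] * dV[j][i] - v[j] * dU[j][i] for j in range(3))
                     for i in range(3))
    return W

XY = bracket(X, Y)      # equals -y Z
YXY = bracket(Y, XY)    # a constant, nonzero multiple of Z

p = (Fr(2), Fr(3), Fr(-1))
origin = (Fr(0), Fr(0), Fr(0))
```

Evaluating XY at the origin returns the zero vector, while YXY does not depend on the point. On the plane y = 0 the first bracket vanishes and the second bracket is needed to reach the z direction, which is the non-equiregular behavior mentioned above.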

To apply it in our framework, we take

\begin{matrix} a & = & (\begin{matrix} 1 & 0 \\ 0 & 1 \\ \frac{y^{2}}{2} & 0 \end{matrix}), a^{T} = (\begin{matrix} 1 & 0 & \frac{y^{2}}{2} \\ 0 & 1 & 0 \end{matrix}), \\ z^{T} & = & (0, 0, 1), a a^{T} = (\begin{matrix} 1 & 0 & \frac{y^{2}}{2} \\ 0 & 1 & 0 \\ \frac{y^{2}}{2} & 0 & \frac{y^{4}}{4} \end{matrix}) . \end{matrix}

(27)

Thus, the sub-Riemannian structure has the form

(M, H, {(a a^{T})}_{| H}^{†})

.
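
As a small sanity check on (27), the Python sketch below forms the product a a^{T} in exact rational arithmetic and confirms that it is singular, which is why the metric is written with a pseudo-inverse. The function names are ours; the last step, adding z z^{T}, is our own illustration of how the direction z from (27) complements the horizontal directions.

```python
from fractions import Fraction as Fr

def a_matrix(y):
    # 3x2 matrix a from (27); its columns hold the coefficients of X and Y
    return [[Fr(1), Fr(0)],
            [Fr(0), Fr(1)],
            [y * y / 2, Fr(0)]]

def gram(a):
    # a a^T, a 3x3 matrix
    return [[sum(a[i][k] * a[j][k] for k in range(2)) for j in range(3)]
            for i in range(3)]

def det3(m):
    # determinant of a 3x3 matrix by cofactor expansion along the first row
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

y = Fr(3)
aaT = gram(a_matrix(y))
z = [Fr(0), Fr(0), Fr(1)]
aaT_plus_zzT = [[aaT[i][j] + z[i] * z[j] for j in range(3)] for i in range(3)]
```

Here det3(aaT) vanishes for every y, since a a^{T} has rank two, whereas det3(aaT_plus_zzT) equals one.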

Proposition 5.

In this setting, if

π = e^{- \frac{y^{2}}{2} - V},

then

- a a^{T} \nabla log π = a \otimes \nabla a + a a^{T} \nabla V .

Proof.

The proof follows from the observation that

a \otimes \nabla a = {(\begin{matrix} 0 & y & 0 \end{matrix})}^{T}, - a a^{T} \nabla log e^{- \frac{y^{2}}{2}} = {(\begin{matrix} 0 & y & 0 \end{matrix})}^{T} .

□

Similar to the previous displacement group case, we have the following identity.

Proposition 6.

For any smooth function

f \in C^{\infty} (M)

, one has

\begin{matrix} Γ_{2} (f, f) + Γ_{2}^{z, π} (f, f) & = & {∥ {Hess}_{a, z} f ∥}^{2} + R (\nabla f, \nabla f), \end{matrix}

where

\begin{matrix} Λ_{1}^{T} & = & (0, y \partial_{z} f / 2, 0, y \partial_{z} f / 2, 0, 0, 0, 0, 0); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, - y \partial_{y} f, \frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f, 0); \\ R_{a b} (\nabla f, \nabla f) - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & \frac{y^{2}}{2} Γ_{1}^{z} (f, f) - y^{2} Γ_{1} (f, f) \\ + \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f + y {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{2} f + y \frac{\partial V}{\partial z} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (\frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \end{matrix}

\begin{matrix} + \frac{\partial^{2} V}{\partial y \partial y} {| {(a^{T} \nabla)}_{2} f |}^{2} - y \frac{\partial V}{\partial y} \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f; \\ R_{z b} (\nabla f, \nabla f) & = & (\frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + \frac{\partial^{2} V}{\partial y \partial z} {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f; \\ R_{π} (\nabla f, \nabla f) & = & 0 . \end{matrix}

In particular, we have

\begin{matrix} \sum_{\hat{i}, k^{'} = 1}^{3} a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} = (\frac{\partial^{2} V}{\partial x \partial x} + y^{2} \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{4}}{4} \frac{\partial^{2} V}{\partial z \partial z}) {| {(a^{T} \nabla)}_{1} f |}^{2} . \end{matrix}
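
To see where these coefficients come from, recall that the first row of a^{T} in (27) is (1, 0, y^{2} / 2). Only the indices 1 and 3 contribute to the double sum, and the two mixed terms coincide because V is assumed smooth:

\begin{matrix} \sum_{\hat{i}, k^{'} = 1}^{3} a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} & = & a_{11}^{T} a_{11}^{T} \frac{\partial^{2} V}{\partial x \partial x} + 2 a_{11}^{T} a_{13}^{T} \frac{\partial^{2} V}{\partial x \partial z} + a_{13}^{T} a_{13}^{T} \frac{\partial^{2} V}{\partial z \partial z} \\ = & \frac{\partial^{2} V}{\partial x \partial x} + y^{2} \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{4}}{4} \frac{\partial^{2} V}{\partial z \partial z} . \end{matrix}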

The proof of Proposition 6 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 7–9 below. The following convergence result is a direct consequence of Theorem 2.

Proposition 7.

If there exists

κ > 0

as shown in Theorem 2, the exponential dissipation result in the

L^{1}

distance holds:

\int | ρ (t, x) - π (x) | d x = O (e^{- κ t}) .

Similarly, we summarize the sub-Riemannian Ricci tensor in terms of

R

as follows.

Corollary 3.

The matrix

R

associated with the Martinet sub-Riemannian structure has the following form:

\begin{matrix} R_{11} & = & (\frac{\partial^{2} V}{\partial x \partial x} + y^{2} \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{4}}{4} \frac{\partial^{2} V}{\partial z \partial z}) - y^{2}; \\ R_{22} & = & \frac{\partial^{2} V}{\partial y \partial y} - y^{2}; R_{33} = \frac{y^{2}}{2}; \\ R_{12} & = & R_{21} = \frac{y}{2} \frac{\partial V}{\partial z} + (\frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z}); \\ R_{13} & = & R_{31} = \frac{1}{2} - \frac{y}{2} \frac{\partial V}{\partial y} + \frac{1}{2} (\frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z}); R_{23} = R_{32} = \frac{1}{2} y {(a^{T} \nabla)}_{1} V + \frac{1}{2} \frac{\partial^{2} V}{\partial y \partial z} . \end{matrix}

Proof.

The proof follows from the same equivalent matrix formulation as in the proof of Corollary 1, combined with the explicit bilinear forms in Proposition 6. □
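
The passage from the bilinear forms of Proposition 6 to the matrix entries above is pure bookkeeping, so it can be checked numerically. The Python sketch below assumes that R acts on the vector U = ((a^{T} \nabla)_{1} f, (a^{T} \nabla)_{2} f, (z^{T} \nabla)_{1} f), with Γ_{1} (f, f) = U_{1}^{2} + U_{2}^{2} and Γ_{1}^{z} (f, f) = U_{3}^{2}. The function names and the sample values are ours and purely illustrative.

```python
from fractions import Fraction as Fr

def prop6_form(y, dV, d2V, U):
    """Sum of the bilinear forms listed in Proposition 6 (R_pi = 0)."""
    Vx, Vy, Vz = dV
    Vxx, Vxy, Vxz, Vyy, Vyz, Vzz = d2V
    u1, u2, u3 = U                     # u3 = df/dz because z^T = (0, 0, 1)
    c = y * y / 2
    a1V = Vx + c * Vz                  # (a^T grad)_1 V
    hess_part = c * u3 ** 2 - y * y * (u1 ** 2 + u2 ** 2)
    r_ab = (u3 * u1 + y * a1V * u3 * u2 + y * Vz * u1 * u2
            + (Vxx + 2 * c * Vxz + c * c * Vzz) * u1 ** 2
            + 2 * (Vxy + c * Vyz) * u1 * u2
            + Vyy * u2 ** 2 - y * Vy * u3 * u1)
    r_zb = (Vxz + c * Vzz) * u1 * u3 + Vyz * u2 * u3
    return hess_part + r_ab + r_zb

def martinet_R(y, dV, d2V):
    """Symmetric 3x3 matrix with the entries stated in Corollary 3."""
    Vx, Vy, Vz = dV
    Vxx, Vxy, Vxz, Vyy, Vyz, Vzz = d2V
    c = y * y / 2
    half = Fr(1, 2)
    R11 = Vxx + 2 * c * Vxz + c * c * Vzz - y * y
    R22 = Vyy - y * y
    R33 = c
    R12 = half * y * Vz + Vxy + c * Vyz
    R13 = half - half * y * Vy + half * (Vxz + c * Vzz)
    R23 = half * y * (Vx + c * Vz) + half * Vyz
    return [[R11, R12, R13], [R12, R22, R23], [R13, R23, R33]]

def quad(R, U):
    return sum(R[i][j] * U[i] * U[j] for i in range(3) for j in range(3))

# Arbitrary sample values for y, the derivatives of V, and U
y = Fr(1, 2)
dV = (Fr(1), Fr(-2), Fr(3))
d2V = (Fr(2), Fr(-1), Fr(1, 3), Fr(4), Fr(5), Fr(-3))
U = (Fr(2), Fr(-1), Fr(3))
same = prop6_form(y, dV, d2V, U) == quad(martinet_R(y, dV, d2V), U)
```

Because both sides are polynomials in the sampled quantities, agreement at a few generic rational points is strong evidence that the entries were transcribed consistently; it is not a substitute for the proof.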

Next, we prove the following three key lemmas.

Lemma 7.

For the Martinet sub-Riemannian structure

(M, H, {(a a^{T})}_{| H}^{†})

, we have

\begin{matrix} Q & = & (\begin{matrix} 1 & 0 & \frac{y^{2}}{2} & 0 & 0 & 0 & \frac{y^{2}}{2} & 0 & \frac{y^{4}}{4} \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & \frac{y^{2}}{2} & 0 \\ 0 & 0 & 0 & 1 & 0 & \frac{y^{2}}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{matrix}); \\ P & = & (\begin{matrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & y^{2} / 2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{matrix}); \\ C^{T} & = & (0, 0, 0, 0, 0, \frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f, - y \partial_{y} f, 0, - \frac{y^{3}}{2} \partial_{y} f); \\ D^{T} & = & (0, 0, y \partial_{z} f, 0), E^{T} = (0, 0); \\ F^{T} & = & G^{T} = (0, 0, 0, 0, 0, 0, 0, 0, 0) . \end{matrix}

Proof.

Plugging matrices a and z from (27) into Notation 1, we complete the proof. □

Lemma 8.

For the Martinet sub-Riemannian structure, F and G are zero vectors, and we have

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X \\ = & {∥ {Hess}_{a, z} f ∥}^{2} - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E . \end{matrix}

In particular, we have

\begin{matrix} {∥ {Hess}_{a, z} f ∥}^{2} & = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}]; \\ Λ_{1}^{T} & = & (0, y \partial_{z} f / 2, 0, y \partial_{z} f / 2, 0, 0, 0, 0, 0); \\ Λ_{2}^{T} & = & (0, 0, 0, 0, 0, 0, - y \partial_{y} f, \frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f, 0); \\ - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ = & \frac{y^{2}}{2} Γ_{1}^{z} (f, f) - y^{2} Γ_{1} (f, f) . \end{matrix}

Lemma 9.

By routine computations, we obtain

\begin{matrix} R_{a b} (\nabla f, \nabla f) \\ = & \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f + y {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{2} f + y \frac{\partial V}{\partial z} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \sum_{\hat{i}, k^{'} = 1}^{3} a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (\frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \frac{\partial^{2} V}{\partial y \partial y} {| {(a^{T} \nabla)}_{2} f |}^{2} - y \frac{\partial V}{\partial y} \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f; \\ R_{z b} (\nabla f, \nabla f) \\ = & (\frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + \frac{\partial^{2} V}{\partial y \partial z} {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f; \\ R_{π} (\nabla f, \nabla f) = 0 . \end{matrix}

Proof of Lemma 8.

Since F and G are zero vectors, we have

\begin{matrix} 2 C^{T} X = 2 [\frac{\partial^{2} f}{\partial y \partial z} (\frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f) - \frac{\partial^{2} f}{\partial x \partial z} (y \partial_{y} f) - \frac{\partial^{2} f}{\partial z \partial z} (\frac{y^{3}}{2} \partial_{y} f)] . \end{matrix}

By routine computation, we observe that

\begin{matrix} {[Q X + D]}^{T} [Q X + D] + {[P X + E]}^{T} [P X + E] + 2 C^{T} X \\ = & {[\frac{\partial^{2} f}{\partial x \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{4}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y \partial_{z} f]}^{2} \\ + {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y}]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial y}]}^{2} \\ + {[\frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial z \partial y}]}^{2} \\ + 2 \frac{\partial^{2} f}{\partial y \partial z} (\frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f) - 2 \frac{\partial^{2} f}{\partial x \partial z} (y \partial_{y} f) - 2 \frac{\partial^{2} f}{\partial z \partial z} (\frac{y^{3}}{2} \partial_{y} f) \end{matrix}

\begin{matrix} = & {[\frac{\partial^{2} f}{\partial x \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{4}}{4} \frac{\partial^{2} f}{\partial z \partial z}]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y \partial_{z} f]}^{2} \\ + {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y}]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial y}]}^{2} \\ + {[\frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial z} - y \partial_{y} f]}^{2} + {[\frac{\partial^{2} f}{\partial z \partial y} + (\frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f)]}^{2} \\ - y^{2} {| \partial_{y} f |}^{2} - {(\frac{y^{3}}{2} \partial_{z} f + y \partial_{x} f)}^{2} \\ = & {∥ {Hess}_{a, z} f ∥}^{2} + \frac{y^{2}}{2} Γ_{1}^{z} (f, f) - y^{2} Γ_{1} (f, f), \end{matrix}

where we use the fact

\begin{matrix} {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y \partial_{z} f]}^{2} + {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y}]}^{2} \\ = & 2 {[\frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + \frac{1}{2} y \partial_{z} f]}^{2} + \frac{y^{2}}{2} {| \partial_{z} f |}^{2} . \end{matrix}

The proof is thus completed. □
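
The identity of Lemma 8 is algebraic in y, the gradient of f, and the Hessian of f, so it can be tested at sample points with exact rational arithmetic. The Python sketch below uses the matrices of Lemma 7 and assumes that X stacks the symmetric Hessian row by row in the order (xx, xy, xz, yx, yy, yz, zx, zy, zz), which is the ordering consistent with the expansion of 2 C^{T} X above. All function names are ours.

```python
from fractions import Fraction as Fr

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(u, v):
    return [s + t for s, t in zip(u, v)]

def sq(v):
    return sum(x * x for x in v)

def lemma8_check(y, grad, hess):
    fx, fy, fz = grad
    fxx, fxy, fxz, fyy, fyz, fzz = hess
    # Vectorized symmetric Hessian (assumed row-by-row ordering)
    X = [fxx, fxy, fxz, fxy, fyy, fyz, fxz, fyz, fzz]
    c = y * y / 2
    w = y ** 3 / 2 * fz + y * fx
    Q = [[1, 0, c, 0, 0, 0, c, 0, c * c],
         [0, 1, 0, 0, 0, 0, 0, c, 0],
         [0, 0, 0, 1, 0, c, 0, 0, 0],
         [0, 0, 0, 0, 1, 0, 0, 0, 0]]
    P = [[0, 0, 0, 0, 0, 0, 1, 0, c],
         [0, 0, 0, 0, 0, 0, 0, 1, 0]]
    C = [0, 0, 0, 0, 0, w, -y * fy, 0, -y ** 3 / 2 * fy]
    D = [0, 0, y * fz, 0]
    E = [0, 0]
    L1 = [0, y * fz / 2, 0, y * fz / 2, 0, 0, 0, 0, 0]
    L2 = [0, 0, 0, 0, 0, 0, -y * fy, w, 0]

    lhs = (sq(add(matvec(Q, X), D)) + sq(add(matvec(P, X), E))
           + 2 * sum(ci * xi for ci, xi in zip(C, X)))
    hess_norm = sq(matvec(Q, add(X, L1))) + sq(matvec(P, add(X, L2)))
    rhs = hess_norm - sq(matvec(Q, L1)) - sq(matvec(P, L2)) + sq(D) + sq(E)
    gamma1 = (fx + c * fz) ** 2 + fy ** 2    # Gamma_1(f, f) = |a^T grad f|^2
    gamma1z = fz ** 2                         # Gamma_1^z(f, f) = |z^T grad f|^2
    closed = hess_norm + c * gamma1z - y * y * gamma1
    return lhs, rhs, closed

lhs, rhs, closed = lemma8_check(
    Fr(3, 2),
    (Fr(1), Fr(-2), Fr(5)),
    (Fr(2), Fr(-1), Fr(3), Fr(4), Fr(1, 2), Fr(-3)),
)
```

The three returned values coincide: the first equality is the completion of squares in the proof, and the second is the closed form of the correction term in terms of Γ_{1} and Γ_{1}^{z}. The check relies on the symmetry of the Hessian, which is built into X.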

We are now left to compute the three tensor terms.

Proof of Lemma 9.

Similar to the proof of Lemma 6, we have

\begin{matrix} R_{a} (\nabla f, \nabla f) & = & \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ + \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ - \sum_{i, k = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{2}} \\ = & I_{1} + I_{2} + I_{3} + I_{4} . \end{matrix}

By direct computations, we have

\begin{matrix} I_{1} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f + a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f] = 0; \\ I_{2} & = & \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{1} f + a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}) {(a^{T} \nabla)}_{2} f] \\ = & a_{22}^{T} a_{22}^{T} \frac{\partial^{2}}{\partial y \partial y} a_{13}^{T} \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f = \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f; \\ I_{3} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{1 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] = 0; \\ I_{4} & = & - \sum_{i = 1}^{2} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{3} [a_{1 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] = 0 . \end{matrix}

For the drift term, we take

b = - a a^{T} \nabla V

\begin{matrix} R_{b}^{a} & = & \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f + a_{i \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ + \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{i} f] \\ - \sum_{i, k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{i} f] \\ = & J_{1} + J_{2} + J_{3} + J_{4} . \end{matrix}

Plugging in the entries of the matrix

a^{T}

, we obtain

\begin{matrix} J_{1} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{1 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{2 k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & a_{22}^{T} \frac{\partial a_{13}^{T}}{\partial y} {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{2} f = y {(a^{T} \nabla)}_{1} V \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{2} f; \end{matrix}

\begin{matrix} J_{2} & = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{1 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{1 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} \frac{\partial a_{2 k^{'}}^{T}}{\partial x_{\hat{i}}} a_{2 \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(a^{T} \nabla)}_{2} f] \\ = & y \frac{\partial V}{\partial z} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f; \end{matrix}

\begin{matrix} J_{3} & = & \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + a_{2 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f] \\ + \sum_{\hat{i}, k^{'} = 1}^{3} [a_{1 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{2} f {(a^{T} \nabla)}_{1} f + a_{2 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{2} f |}^{2}] \\ = & \sum_{\hat{i}, k^{'} = 1}^{3} a_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {| {(a^{T} \nabla)}_{1} f |}^{2} + 2 (\frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z}) {(a^{T} \nabla)}_{1} f {(a^{T} \nabla)}_{2} f \\ + \frac{\partial^{2} V}{\partial y \partial y} {| {(a^{T} \nabla)}_{2} f |}^{2}; \end{matrix}

\begin{matrix} J_{4} & = & - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \end{matrix}

\begin{matrix} - \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{1} f + a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial a_{2 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(a^{T} \nabla)}_{2} f] \\ = & - y \frac{\partial V}{\partial y} \frac{\partial f}{\partial z} {(a^{T} \nabla)}_{1} f . \end{matrix}
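
The reductions of J_{1}, J_{2}, and J_{4} rest on the observation that the only non-constant entry of a^{T} is a_{13}^{T} = y^{2} / 2, whose only nonzero derivative is the one in y. The Python sketch below evaluates the full index sums from the definition of R_{b}^{a} by brute force; the function name, the array layout, and the sample values are ours.

```python
from fractions import Fraction as Fr
from itertools import product

def drift_terms(y, dV, df):
    c = y * y / 2
    aT = [[1, 0, c], [0, 1, 0]]                    # a^T from (27)
    # daT[k][j][i] = d a^T_{kj} / d x_i; only d a^T_{13} / dy = y is nonzero
    daT = [[[0] * 3 for _ in range(3)] for _ in range(2)]
    daT[0][2][1] = y
    # (a^T grad)_i f for i = 1, 2
    aT_f = [sum(aT[i][j] * df[j] for j in range(3)) for i in range(2)]
    J1 = J2 = J4 = 0
    for i, k in product(range(2), repeat=2):
        for ih, kh, kp in product(range(3), repeat=3):
            J1 += aT[i][ih] * daT[k][kh][ih] * aT[k][kp] * dV[kp] * df[kh] * aT_f[i]
            J2 += aT[i][ih] * daT[k][kp][ih] * aT[k][kh] * dV[kp] * df[kh] * aT_f[i]
            J4 -= aT[k][kh] * aT[k][kp] * daT[i][ih][kh] * dV[kp] * df[ih] * aT_f[i]
    return J1, J2, J4, aT_f

# Arbitrary sample values for y, grad V, and grad f
y = Fr(2)
dV = (Fr(1), Fr(3), Fr(-1))
df = (Fr(2), Fr(5), Fr(7))
J1, J2, J4, aT_f = drift_terms(y, dV, df)

# Closed forms stated in the text
a1V = dV[0] + y * y / 2 * dV[2]                    # (a^T grad)_1 V
J1_closed = y * a1V * df[2] * aT_f[1]
J2_closed = y * dV[2] * aT_f[0] * aT_f[1]
J4_closed = -y * dV[1] * df[2] * aT_f[0]
```

For these sample values the brute-force sums and the closed forms agree term by term.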

Combining the above computations, we obtain the tensor

R_{a b}

. Now, we turn to the second tensor

R_{z b}

. Since

z^{T} = (0, 0, 1)

, only the drift term of the tensor

R_{z b}

remains, where we denote

\begin{matrix} R_{b}^{z} (\nabla f, \nabla f) = - \sum_{\hat{i}, \hat{k} = 1}^{3} (z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}) {(z^{T} \nabla)}_{1} f . \end{matrix}

By taking

b = - a a^{T} \nabla V

, we further obtain that

\begin{matrix} R_{b}^{z} (\nabla f, \nabla f) & = & - \sum_{\hat{i}, \hat{k} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla f)}_{1} - b_{\hat{k}} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla f)}_{1}] \\ = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ + \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] \\ = & J_{1}^{z} + J_{2}^{z} + J_{3}^{z} + J_{4}^{z} . \end{matrix}

By direct computation, we observe that

\begin{matrix} J_{1}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} a_{k k^{'}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{2}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{\hat{i}}} a_{k \hat{k}}^{T} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] = 0; \\ J_{4}^{z} & = & - \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial z_{1 \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial V}{\partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{i}}} {(z^{T} \nabla)}_{1} f] = 0 . \end{matrix}

The only non-zero term has the following form:

\begin{matrix} J_{3}^{z} & = & \sum_{k = 1}^{2} \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{k \hat{k}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ = & \sum_{\hat{i}, \hat{k}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{1 \hat{k}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f + z_{1 \hat{i}}^{T} a_{2 \hat{k}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} \frac{\partial f}{\partial x_{\hat{k}}} {(z^{T} \nabla)}_{1} f] \\ = & \sum_{\hat{i}, k^{'} = 1}^{3} [z_{1 \hat{i}}^{T} a_{1 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + z_{1 \hat{i}}^{T} a_{2 k^{'}}^{T} \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k^{'}}} {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f] \\ = & (\frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z}) {(a^{T} \nabla)}_{1} f {(z^{T} \nabla)}_{1} f + \frac{\partial^{2} V}{\partial y \partial z} {(a^{T} \nabla)}_{2} f {(z^{T} \nabla)}_{1} f . \end{matrix}

Since the matrix

z^{T}

is constant and the matrix

a^{T}

depends only on the variable y, it follows that

R_{π} (\nabla f, \nabla f) = 0 .

□

4. Lyapunov Analysis in Sub-Riemannian Density Manifold

In this section, we explain the motivation of this paper: to design a matrix condition whose smallest eigenvalue characterizes the convergence rate of the degenerate SDE.

The outline of this section is as follows. We consider a density space over the sub-Riemannian manifold. The finite-dimensional sub-Riemannian structure induces an infinite-dimensional sub-Riemannian structure on the density space, which we call the sub-Riemannian density manifold (SDM). We provide the geometric calculations in the SDM, study the Fokker–Planck equation as the sub-Riemannian gradient flow in the SDM, and derive the equivalence between the second-order calculus of the relative entropy in the SDM and the generalized Gamma z calculus.

4.1. Sub-Riemannian Density Manifold

Given a finite-dimensional sub-Riemannian manifold

(R^{n + m}, τ, g_{τ})

with

g_{τ} = {(a a^{T})}^{†}

, consider the probability density space:

P (R^{n + m}) = \{ρ (x) \in C^{\infty} (R^{n + m}) : \int ρ (x) d x = 1, ρ (x) \geq 0\} .

Consider the tangent space at

ρ \in P (R^{n + m})

:

T_{ρ} P (R^{n + m}) = \{σ (x) \in C^{\infty} (R^{n + m}) : \int σ (x) d x = 0\} .

We introduce the sub-Riemannian structure in probability density space

P (R^{n + m})

.

Definition 3

(sub-Riemannian Wasserstein metric tensor). The

L^{2}

sub-Riemannian-Wasserstein metric

g_{ρ}^{W_{a}} : T_{ρ} P (R^{n + m}) \times T_{ρ} P (R^{n + m}) \to R

is defined by

g_{ρ}^{W_{a}} (σ_{1}, σ_{2}) = \int (σ_{1} (x), {(- Δ_{ρ}^{a})}^{†} σ_{2} (x)) d x .

Here,

σ_{1}, σ_{2} \in T_{ρ} P (R^{n + m})

,

(\cdot, \cdot)

is the metric on

R^{n + m}

, and

{(Δ_{ρ}^{a})}^{†} : T_{ρ} P (R^{n + m}) \to T_{ρ} P (R^{n + m})

is the pseudo-inverse of the sub-elliptic operator:

Δ_{ρ}^{a} = \nabla \cdot (ρ a a^{T} \nabla) .

For some special choices of a, as studied in [19], or when

a a^{T}

is a positive definite matrix,

Δ_{ρ}^{a}

is an elliptic operator. In this case,

(P (R^{n + m}), g^{W_{a}})

still forms a Riemannian density manifold. In general, given a sub-Riemannian manifold

(R^{n + m}, {(a a^{T})}^{†})

,

Δ_{ρ}^{a}

is only a sub-elliptic operator. Thus,

(P (R^{n + m}), g^{W_{a}})

forms an infinite-dimensional sub-Riemannian manifold.

We next present the sub-Riemannian calculus in

(P (R^{n + m}), g^{W_{a}})

, including both geodesics and the Hessian operator in the tangent bundle. Consider an identification map:

V : C^{\infty} (R^{n + m}) \to T_{ρ} P (R^{n + m}), V_{Φ} = - Δ_{ρ}^{a} Φ = - \nabla \cdot (ρ a a^{T} \nabla Φ) .

Here,

Φ \in T_{ρ}^{*} P (R^{n + m}) = C^{\infty} (R^{n + m}) / \sim

. This

T_{ρ}^{*} P (R^{n + m})

is the cotangent space in the SDM, and ∼ represents a constant shift relation. Thus,

g_{ρ}^{W_{a}} (V_{Φ_{1}}, V_{Φ_{2}}) = \int Γ_{1} (Φ_{1}, Φ_{2}) ρ (x) d x .

Indeed,

\begin{matrix} g_{ρ}^{W_{a}} (V_{Φ_{1}}, V_{Φ_{2}}) = & \int V_{Φ_{1}} {(- Δ_{ρ}^{a})}^{†} V_{Φ_{2}} d x \\ = & \int Φ_{1} (- Δ_{ρ}^{a}) {(- Δ_{ρ}^{a})}^{†} (- Δ_{ρ}^{a}) Φ_{2} d x \\ = & \int (Φ_{1}, - Δ_{ρ}^{a} Φ_{2}) d x \\ = & \int Φ_{1} (- \nabla \cdot (ρ a a^{T} \nabla Φ_{2})) d x \\ = & \int (a^{T} \nabla Φ_{1}, a^{T} \nabla Φ_{2}) ρ d x, \end{matrix}

(28)

where the third line follows from the pseudo-inverse identity

(- Δ_{ρ}^{a}) {(- Δ_{ρ}^{a})}^{†} (- Δ_{ρ}^{a}) = - Δ_{ρ}^{a}

and the last line follows from integration by parts.
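
To spell out the integration by parts, suppose that the product of Φ_{1} and ρ a a^{T} \nabla Φ_{2} decays fast enough at infinity for the boundary term to vanish; this decay is an assumption we add here for the sketch. Then, moving the divergence onto Φ_{1} and splitting a a^{T} between the two gradients gives

\begin{matrix} \int Φ_{1} (- \nabla \cdot (ρ a a^{T} \nabla Φ_{2})) d x & = & \int (\nabla Φ_{1}, ρ a a^{T} \nabla Φ_{2}) d x = \int (a^{T} \nabla Φ_{1}, a^{T} \nabla Φ_{2}) ρ d x . \end{matrix}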

We next derive several basic geometric calculations in the SDM.

Proposition 8

(Geodesics in the SDM). The sub-Riemannian geodesic flow in the cotangent bundle satisfies

\{\begin{matrix} \partial_{t} ρ_{t} + \nabla \cdot (ρ_{t} a a^{T} \nabla Φ_{t}) = 0, \\ \partial_{t} Φ_{t} + \frac{1}{2} (a^{T} \nabla Φ_{t}, a^{T} \nabla Φ_{t}) = 0 . \end{matrix}

(29)

Proof.

We consider the Lagrangian formulation of geodesics in density space. The geometric action functional to be minimized is

L (ρ_{t}, \partial_{t} ρ_{t}) = \int_{0}^{1} \int \frac{1}{2} (\partial_{t} ρ_{t}, {(- Δ_{ρ_{t}}^{a})}^{†} \partial_{t} ρ_{t}) d x d t,

where

ρ_{t} = ρ (t, x)

is a density path with fixed boundary points

ρ_{0}

,

ρ_{1}

. Then, the Euler–Lagrange equation in density space forms

\frac{\partial}{\partial t} δ_{\partial_{t} ρ_{t}} L (ρ_{t}, \partial_{t} ρ_{t}) = δ_{ρ_{t}} L (ρ_{t}, \partial_{t} ρ_{t}),

where

δ_{\partial_{t} ρ_{t}}

is the

L^{2}

first variation with respect to

\partial_{t} ρ_{t}

and

δ_{ρ_{t}}

is the

L^{2}

first variation with respect to

ρ_{t}

. Here,

\begin{matrix} \partial_{t} ({(- Δ_{ρ_{t}}^{a})}^{†} \partial_{t} ρ_{t}) = & δ_{ρ_{t}} \int \frac{1}{2} (\partial_{t} ρ_{t}, {(- Δ_{ρ_{t}}^{a})}^{†} \partial_{t} ρ_{t}) d x \\ = & - \frac{1}{2} (a^{T} \nabla {(- Δ_{ρ_{t}}^{a})}^{†} \partial_{t} ρ_{t}, a^{T} \nabla {(- Δ_{ρ_{t}}^{a})}^{†} \partial_{t} ρ_{t}), \end{matrix}

(30)

where the last equality uses the following fact:

\partial_{t} {(Δ_{ρ_{t}}^{a})}^{†} = - {(Δ_{ρ_{t}}^{a})}^{†} \cdot Δ_{\partial_{t} ρ_{t}}^{a} \cdot {(Δ_{ρ_{t}}^{a})}^{†} .

Denote

\partial_{t} ρ_{t} = - Δ_{ρ_{t}}^{a} Φ_{t}

, then the Euler–Lagrange Equation (30) forms the sub-Riemannian geodesics flow (29). In other words,

\partial_{t} Φ + \frac{1}{2} (a^{T} \nabla Φ, a^{T} \nabla Φ) = 0 .

□
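The derivative identity for the pseudo-inverse used in the proof of Proposition 8 has a familiar finite-dimensional analogue: for a smooth path of invertible matrices A(t), the derivative of the inverse is -A^{-1} A' A^{-1}. The sketch below checks this analogue by central finite differences; the 2×2 path `A(t)` is a hypothetical example, and the invertible case is only a stand-in for the operator pseudo-inverse in the text.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def A(t):
    # hypothetical smooth path of invertible matrices
    return [[2 + t, t * t], [t, 3 - t]]

def dA(t):
    # entrywise derivative of A(t)
    return [[1.0, 2 * t], [1.0, -1.0]]

def derivative_mismatch(t=0.3, h=1e-5):
    """Largest entrywise gap between a finite-difference d/dt A^{-1} and -A^{-1} A' A^{-1}."""
    plus, minus = inv2(A(t + h)), inv2(A(t - h))
    fd = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    ainv = inv2(A(t))
    exact = matmul(matmul(ainv, dA(t)), ainv)
    return max(abs(fd[i][j] + exact[i][j]) for i in range(2) for j in range(2))

print(derivative_mismatch())
```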

Proposition 9

(Gradient and Hessian operators in the SDM). Given a functional

F : P (R^{n + m}) \to R

, the gradient operator of

F

in

(P, g^{W_{a}})

satisfies

{grad}_{W_{a}} F (ρ) = - \nabla \cdot (ρ a a^{T} \nabla δ F (ρ)) .

The Hessian operator of

F

in

(P, g^{W_{a}})

satisfies

\begin{matrix} {Hess}_{W_{a}} F (ρ) (V_{Φ}, V_{Φ}) = & \int \int (a {(y)}^{T} \nabla_{y}) (a {(x)}^{T} \nabla_{x}) δ^{2} F (ρ) (x, y) (a {(x)}^{T} \nabla_{x} Φ (x), a {(y)}^{T} \nabla_{y} Φ (y)) ρ (x) ρ (y) d x d y \\ & + \int {Hess}_{a} δ F (ρ) (Φ, Φ) ρ d x, \end{matrix}

where

{Hess}_{a} δ F (ρ) (Φ, Φ) = \frac{1}{2} \{2 Γ_{1} (Γ_{1} (δ F, Φ), Φ) - Γ_{1} (Γ_{1} (Φ, Φ), δ F)\} .

Proof.

We first derive the sub-Riemannian gradient operator. We recall the identification map by

- Δ_{ρ}^{a} Φ = - \nabla \cdot (ρ a a^{T} \nabla Φ)

. Hence, the gradient operator in the SDM satisfies

\begin{matrix} {grad}_{W_{a}} F (ρ) = & {({(- Δ_{ρ}^{a})}^{†})}^{†} \frac{δ}{δ ρ (x)} F (ρ) \\ = & - Δ_{ρ}^{a} \frac{δ}{δ ρ (x)} F (ρ) \\ = & - \nabla \cdot (ρ a a^{T} \nabla \frac{δ}{δ ρ (x)} F (ρ)) . \end{matrix}

The Hessian operator in the SDM satisfies

{Hess}_{W_{a}} F (ρ) (V_{Φ}, V_{Φ}) = \frac{d^{2}}{d t^{2}} F (ρ_{t}) |_{t = 0},

where

(ρ_{t}, Φ_{t})

satisfies the geodesics Equation (29) with

ρ_{0} = ρ

,

Φ_{0} = Φ

. Notice the fact that

\begin{matrix} \frac{d}{d t} F (ρ_{t}) |_{t = 0} = & \int \partial_{t} ρ_{t} δ F (ρ_{t}) d x |_{t = 0} \\ = & \int (- \nabla \cdot (ρ a a^{T} \nabla Φ)) δ F (ρ) d x \\ = & \int (a^{T} \nabla δ F (ρ), a^{T} \nabla Φ) ρ d x . \end{matrix}

In addition,

\begin{matrix} \frac{d^{2}}{d t^{2}} F (ρ_{t}) |_{t = 0} = & \frac{d}{d t} \int (a^{T} \nabla δ F (ρ_{t}), a^{T} \nabla Φ_{t}) ρ_{t} d x |_{t = 0} \\ = & \int \int δ^{2} F (ρ_{t}) (x, y) \partial_{t} ρ_{t} (x) \partial_{t} ρ_{t} (y) d x d y + \int (a^{T} \nabla δ F (ρ_{t}), a^{T} \nabla \partial_{t} Φ_{t}) ρ_{t} d x \\ & + \int (a^{T} \nabla δ F (ρ_{t}), a^{T} \nabla Φ_{t}) \partial_{t} ρ_{t} d x |_{t = 0} \\ = & \int \int δ^{2} F (ρ) (x, y) \nabla \cdot (ρ a a^{T} \nabla Φ) (x) \nabla \cdot (ρ a a^{T} \nabla Φ) (y) d x d y \\ & - \frac{1}{2} \int (a^{T} \nabla δ F (ρ), a^{T} \nabla Γ_{1} (Φ, Φ)) ρ d x \\ & + \int Γ_{1} (Φ, δ F (ρ)) (- \nabla \cdot (ρ a a^{T} \nabla Φ)) d x \\ = & \int \int (a {(y)}^{T} \nabla_{y}) (a {(x)}^{T} \nabla_{x}) δ^{2} F (ρ) (x, y) (a {(x)}^{T} \nabla_{x} Φ (x), a {(y)}^{T} \nabla_{y} Φ (y)) ρ (x) ρ (y) d x d y \\ & + \frac{1}{2} \int \{2 Γ_{1} (Γ_{1} (δ F, Φ), Φ) - Γ_{1} (Γ_{1} (Φ, Φ), δ F)\} ρ d x, \end{matrix}

(31)

where the last equality holds by the integration by parts formula. □

We next show the equivalence relation between the Hessian of the relative entropy in the SDM and the classical Gamma two operator. We first demonstrate the relation among

L^{*}

,

Δ_{a}

and the gradient operator of the entropy. In particular, we show that the Fokker–Planck equation is a sub-Riemannian gradient flow in the SDM. Denote the KL divergence as

D (ρ) = D_{KL} (ρ ∥ π) = \int ρ (x) log \frac{ρ (x)}{π (x)} d x .

(32)

Proposition 10

(Gradient flow). The negative gradient operator in

(P, g^{W_{a}})

forms

- {grad}_{W_{a}} D (ρ) = L^{*} ρ = \nabla \cdot (ρ a a^{T} \nabla log \frac{ρ}{π}) .

In addition, the sub-Riemannian gradient flow of

D (ρ)

in

(P, g^{W_{a}})

forms the Fokker–Planck equation:

\partial_{t} ρ = \nabla \cdot (ρ a a^{T} \nabla log \frac{ρ}{π}) .

(33)

Proof.

We first derive the sub-Riemannian gradient operator of the entropy and relative entropy. Notice that

δ_{ρ (x)} D (ρ) = log ρ (x) + 1 - log π (x) .

Thus,

{grad}_{W_{a}} D (ρ) = - \nabla \cdot (ρ a a^{T} \nabla log ρ) + \nabla \cdot (ρ a a^{T} \nabla log π),

where

ρ \nabla log ρ = ρ \frac{\nabla ρ}{ρ} = \nabla ρ

. Following the gradient flow formulation:

\frac{\partial ρ_{t}}{\partial t} = - {grad}_{W_{a}} D (ρ_{t}) = L^{*} ρ_{t},

we finish the derivation of (33). □
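The gradient-flow structure in Proposition 10 implies that the KL divergence decreases along (33). The following is a minimal sketch of that behavior, assuming a one-dimensional periodic domain, a scalar non-degenerate a(x), and hypothetical choices of π and the initial density. It uses the equivalent form ∇·(a a^T π ∇(ρ/π)) in flux form with explicit Euler steps. It illustrates the elliptic case only and is not the scheme or setting of the paper.

```python
import math

def kl(rho, pi, dx):
    """Discrete KL divergence sum rho*log(rho/pi)*dx."""
    return sum(r * math.log(r / p) for r, p in zip(rho, pi)) * dx

def fokker_planck_kl(n=64, steps=400):
    dx = 2 * math.pi / n
    x = [i * dx for i in range(n)]
    a2 = lambda t: (1 + 0.5 * math.sin(t)) ** 2        # a(x)^2, illustrative choice
    pi_raw = [math.exp(math.cos(t)) for t in x]        # unnormalized invariant density
    z_pi = sum(pi_raw) * dx
    pi = [p / z_pi for p in pi_raw]
    rho_raw = [1 + 0.8 * math.sin(t) for t in x]       # initial density, illustrative
    z_rho = sum(rho_raw) * dx
    rho = [r / z_rho for r in rho_raw]
    # w[i] approximates a^2 * pi at the midpoint between x_i and x_{i+1}
    w = [a2(x[i] + dx / 2) * 0.5 * (pi[i] + pi[(i + 1) % n]) for i in range(n)]
    # small step so that each Euler update is a convex (Markov) averaging step
    dt = 0.2 * dx * dx * min(pi[i] / (w[i] + w[i - 1]) for i in range(n))
    history = [kl(rho, pi, dx)]
    for _ in range(steps):
        h = [r / p for r, p in zip(rho, pi)]           # relative density rho/pi
        flux = [w[i] * (h[(i + 1) % n] - h[i]) / dx for i in range(n)]
        rho = [rho[i] + dt * (flux[i] - flux[i - 1]) / dx for i in range(n)]
        history.append(kl(rho, pi, dx))
    return history, sum(rho) * dx

history, mass = fokker_planck_kl()
print(history[0], history[-1], mass)
```

With the step-size restriction above, each update is a stochastic map that leaves π invariant, so the discrete KL divergence cannot increase.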

We next demonstrate that the Hessian of the relative entropy (KL divergence) is equivalent to the classical Bakry–Émery calculus.

Proposition 11

(Hessian of entropy and Bakry–Émery calculus). For any

Φ \in C^{\infty} (R^{n + m})

, we have

\begin{matrix} {Hess}_{W_{a}} D (ρ) (V_{Φ}, V_{Φ}) = & \int Γ_{2} (Φ, Φ) ρ (x) d x . \end{matrix}

Proof.

We first derive the Hessian of

D (ρ)

in the SDM. Notice the fact that

δ^{2} D (ρ) (x, y) = \frac{1}{ρ} δ_{x = y}

. For simplicity, we denote

δ^{2} D (ρ) = \frac{1}{ρ (x)}

. By using (31), we have

\begin{matrix} {Hess}_{W_{a}} D (ρ) (V_{Φ}, V_{Φ}) = & \int δ^{2} D (ρ) (x) {(\nabla \cdot (ρ a a^{T} \nabla Φ))}^{2} d x & (a) \\ - \frac{1}{2} \int (a^{T} \nabla δ D (ρ), a^{T} \nabla Γ_{1} (Φ, Φ)) ρ d x & (b) \\ + \int Γ_{1} (Φ, δ D (ρ)) (- \nabla \cdot (ρ a a^{T} \nabla Φ)) d x . & (c) \end{matrix}

(34)

We next rewrite (34) into the iterative Gamma calculus. We first show that

\begin{matrix} (a) + (c) = & \int (δ^{2} D (ρ) \nabla \cdot (ρ a a^{T} \nabla Φ) - Γ_{1} (Φ, δ D (ρ))) \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & \int (\frac{1}{ρ} \nabla \cdot (ρ a a^{T} \nabla Φ) - (a a^{T} \nabla log \frac{ρ}{π}, \nabla Φ)) \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & \int (\frac{1}{ρ} (\nabla ρ, a a^{T} \nabla Φ) + \nabla \cdot (a a^{T} \nabla Φ) - (a a^{T} \nabla log \frac{ρ}{π}, \nabla Φ)) \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & \int ((\nabla log ρ, a a^{T} \nabla Φ) + \nabla \cdot (a a^{T} \nabla Φ) \\ & - (\nabla log ρ, a a^{T} \nabla Φ) + (a a^{T} \nabla log π, \nabla Φ)) \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & \int (\nabla \cdot (a a^{T} \nabla Φ) + (a a^{T} \nabla log π, \nabla Φ)) \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & \int L Φ \nabla \cdot (ρ a a^{T} \nabla Φ) d x \\ = & - \int Γ_{1} (L Φ, Φ) ρ d x, \end{matrix}

where the fourth equality uses the fact that

\frac{\nabla ρ}{ρ} = \nabla log ρ

, while the last equality follows the integration by parts.

We secondly show that

\begin{matrix} (b) = & - \frac{1}{2} \int (a^{T} \nabla δ D (ρ), a^{T} \nabla Γ_{1} (Φ, Φ)) ρ d x \\ = & \frac{1}{2} \int Γ_{1} (Φ, Φ) \nabla \cdot (ρ a a^{T} \nabla δ D (ρ)) d x \\ = & \frac{1}{2} \int Γ_{1} (Φ, Φ) L^{*} ρ d x \\ = & \frac{1}{2} \int L Γ_{1} (Φ, Φ) ρ d x, \end{matrix}

where the second equality follows from integration by parts, the third equality applies the fact that

L^{*} ρ = \nabla \cdot (ρ a a^{T} \nabla δ D)

, while the last equality uses the duality relation between the Kolmogorov operators L and

L^{*}

in

L^{2} (ρ)

, i.e.,

\int f (x) L^{*} ρ (t, x) d x = \int L f (x) ρ (t, x) d x, for any f \in C_{c}^{\infty} (R^{n + m}) .

Combining the equality of

(a), (b), (c)

, we prove the result. □

Remark 10.

We remark that the above formulations in terms of

a a^{T}

hold for both Riemannian and sub-Riemannian density manifolds. Here, the major difference is whether matrix function a is full rank or degenerate. In this sense, all formulas derived in this subsection recover the classical Bakry–Émery calculus. However, the classical Hessian operator of the entropy is not enough to study the convergence behavior of degenerate diffusion processes. Briefly, we use a modified Lyapunov functional and derive a tensor for the gradient flow in the SDM. It provides the convergence rate of the degenerate diffusion process.

4.2. Gamma z Calculus via Second-Order Calculus of Relative Entropy in SDM

In this subsection, we introduce the motivation of our new Gamma z calculus from the SDM viewpoint. Consider the SDM gradient flow (33):

\partial_{t} ρ_{t} = Δ_{ρ_{t}}^{a} δ D (ρ_{t}) .

When a is a degenerate matrix, the classical relative Fisher information

I_{a}

may not be the Lyapunov functional. In other words, along the gradient flow, it is possible that

\frac{d}{d t} I_{a} (ρ_{t}) \geq 0

.

To handle this issue, we consider a new Lyapunov functional, obtained by adding a new direction z to the relative Fisher information functional. Denote

Δ_{ρ}^{z} = \nabla \cdot (ρ z z^{T} \nabla)

and

I_{z} (ρ) = \int (δ D, (- Δ_{ρ}^{z}) δ D) d x

. Construct

I_{a, z} (ρ) : = I_{a} (ρ) + I_{z} (ρ) = \int (δ D, (- Δ_{ρ}^{a} - Δ_{ρ}^{z}) δ D) d x .

We next prove the following proposition.

Proposition 12.

\frac{d}{d t} I_{a, z} (ρ_{t}) = - 2 \int (Γ_{2} (δ D, δ D) + {\tilde{Γ}}_{2}^{z} (δ D, δ D)) ρ_{t} d x,

where

\begin{matrix} {\tilde{Γ}}_{2}^{z} (Φ, Φ) : = & \frac{1}{2} L (Γ_{1}^{z} (Φ, Φ)) - Γ_{1} (L_{z} Φ, Φ), \end{matrix}

(35)

with the notation

Δ_{z} = \nabla \cdot (z z^{T} \nabla)

and

L_{z} = \nabla \cdot (z z^{T} \nabla) + (\nabla log π, z z^{T} \nabla)

.

Proof.

For the simplicity of notation, we denote

ρ = ρ_{t}

. Notice the fact that

\frac{d}{d t} I_{a, z} (ρ) = \frac{d}{d t} I_{a} (ρ) + \frac{d}{d t} I_{z} (ρ) .

From Proposition 11, we have

\begin{matrix} \frac{d}{d t} I_{a} (ρ) = & - 2 {Hess}_{W_{a}} D (V_{δ D}, V_{δ D}) \\ = & - 2 \int Γ_{2} (δ D, δ D) ρ d x . \end{matrix}

We only need to show the following claim.

Claim:

\frac{d}{d t} I_{z} (ρ) = - 2 \int {\tilde{Γ}}_{2}^{z} (δ D, δ D) ρ d x .

Proof of Claim.

The proof is similar to the ones in Proposition 11. We need to take care of the z direction. Notice that

\begin{matrix} \frac{d}{d t} I_{z} (ρ) = & 2 \int δ^{2} D ((- Δ_{ρ}^{z} δ D), \partial_{t} ρ) d x + \int (\nabla δ D, z z^{T} \nabla δ D) \partial_{t} ρ d x \\ = & 2 \int δ^{2} D ((- Δ_{ρ}^{z} δ D), Δ_{ρ}^{a} δ D) d x + \int (\nabla δ D, z z^{T} \nabla δ D) (Δ_{ρ}^{a} δ D) d x \\ = & - 2 \int \frac{1}{ρ} \nabla \cdot (ρ a a^{T} \nabla δ D) \nabla \cdot (ρ z z^{T} \nabla δ D) d x (I) \\ + \int (\nabla δ D, z z^{T} \nabla δ D) \nabla \cdot (ρ a a^{T} \nabla δ D) d x (I I) \end{matrix}

We next estimate (I) and (II) separately. For (I), we notice the fact that

\begin{matrix} \frac{1}{ρ} \nabla \cdot (ρ z z^{T} \nabla δ D) = & (\nabla log ρ, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D) \\ = & (\nabla log \frac{ρ}{π}, z z^{T} \nabla δ D) + (\nabla log π, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D) \\ = & (\nabla δ D, z z^{T} \nabla δ D) + (\nabla log π, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D) . \end{matrix}

Thus,

\begin{matrix} (I) = & - 2 \int \frac{1}{ρ} \nabla \cdot (ρ a a^{T} \nabla δ D) \nabla \cdot (ρ z z^{T} \nabla δ D) d x \\ = & - 2 \int \nabla \cdot (ρ a a^{T} \nabla δ D) ((\nabla δ D, z z^{T} \nabla δ D) \\ + (\nabla log π, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D)) d x \\ = & - 2 \int (\nabla δ D, z z^{T} \nabla δ D) L^{*} ρ \\ + \nabla \cdot (ρ a a^{T} \nabla δ D) ((\nabla log π, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D)) d x \\ = & - 2 \int L (\nabla δ D, z z^{T} \nabla δ D) ρ d x \\ + 2 \int \{\nabla ((\nabla log π, z z^{T} \nabla δ D) + \nabla \cdot (z z^{T} \nabla δ D)), a a^{T} \nabla δ D\} ρ d x, \end{matrix}

where the last equality holds by integration by parts.

For (II), we have

\begin{matrix} (I I) = & \int (\nabla δ D, z z^{T} \nabla δ D) \nabla \cdot (ρ a a^{T} \nabla δ D) d x \\ = & \int (\nabla δ D, z z^{T} \nabla δ D) L^{*} ρ d x \\ = & \int L (\nabla δ D, z z^{T} \nabla δ D) ρ d x . \end{matrix}

Combining (I) and (II), we have

{\tilde{Γ}}_{2}^{z} (Φ, Φ) = \frac{1}{2} L (Γ_{1}^{z} (Φ, Φ)) - Γ_{1} (Δ_{z} Φ, Φ) - Γ_{1} (Γ_{1}^{z} (log π, Φ), Φ) .

Using the notation

L_{z} = Δ_{z} + (\nabla log π, z z^{T} \nabla)

, we finish the proof. □

We next prove that

{\tilde{Γ}}_{2}^{z}

and

Γ_{2}^{z, π}

in Definition 1 agree with each other in the weak form along the gradient flow.

Proposition 13.

Denote

Φ = δ D (ρ)

, then

\int {\tilde{Γ}}_{2}^{z} (Φ, Φ) ρ d x = \int Γ_{2}^{z, π} (Φ, Φ) ρ d x .

Proof.

To prove the proposition, we rewrite

{\tilde{Γ}}_{2}^{z}

as follows.

\begin{matrix} {\tilde{Γ}}_{2}^{z} (Φ, Φ) = & \frac{1}{2} L (Γ_{1}^{z} (Φ, Φ)) - Γ_{1} (L_{z} Φ, Φ) \\ = & \frac{1}{2} L (Γ_{1}^{z} (Φ, Φ)) - Γ_{1}^{z} (L Φ, Φ) \\ + Γ_{1}^{z} (L Φ, Φ) - Γ_{1} (L_{z} Φ, Φ) . \end{matrix}

Here, we need to prove the following equality.

Claim:

\begin{matrix} \int \{Γ_{1}^{z} (L Φ, Φ) - Γ_{1} (L_{z} Φ, Φ)\} ρ d x \\ = & \int ρ \{\frac{1}{π} \nabla \cdot (z z^{T} π (\nabla Φ, \nabla (a a^{T}) \nabla Φ)) - \frac{1}{π} \nabla \cdot (a a^{T} π (\nabla Φ, \nabla (z z^{T}) \nabla Φ))\} d x . \end{matrix}

Proof of Claim.

For the simplicity of notation, let

L^{*} ρ = \nabla \cdot (a a^{T} π \nabla \frac{ρ}{π}) = \nabla \cdot (ρ a a^{T} \nabla log \frac{ρ}{π})

and

L_{z}^{*} ρ = \nabla \cdot (z z^{T} π \nabla \frac{ρ}{π}) = \nabla \cdot (ρ z z^{T} \nabla log \frac{ρ}{π}) .

The following property is also used in the proof. For any smooth test function f and

Φ = log \frac{ρ}{π}

, we have

\int L_{z}^{*} ρ f d x = - \int Γ_{1}^{z} (f, Φ) ρ d x, \int L^{*} ρ f d x = - \int Γ_{1} (f, Φ) ρ d x .

Notice that

Φ = log \frac{ρ}{π}

, then

\begin{matrix} \int Γ_{1}^{z} (L Φ, Φ) ρ d x \\ = & \int (\nabla (\nabla \cdot (a a^{T} \nabla Φ) - (A, \nabla Φ)), z z^{T} \nabla Φ) ρ d x \\ = & \int (\nabla (\nabla \cdot (a a^{T} \nabla Φ)), z z^{T} \nabla Φ) ρ d x - \int (\nabla (A, \nabla Φ), z z^{T} \nabla Φ) ρ d x . \\ (a 1) (a 2) \end{matrix}

Here,

\begin{matrix} (a 1) = & \int (\nabla (\nabla \cdot (a a^{T} \nabla Φ)), z z^{T} \nabla Φ) ρ d x \\ = & - \int \nabla \cdot (a a^{T} \nabla Φ) \nabla \cdot (ρ z z^{T} \nabla Φ) d x \\ = & - \int \nabla \cdot (a a^{T} \nabla log \frac{ρ}{π}) \nabla \cdot (ρ z z^{T} \nabla log \frac{ρ}{π}) d x \\ = & - \int \nabla \cdot (\frac{1}{ρ} a a^{T} π \nabla \frac{ρ}{π}) \nabla \cdot (ρ z z^{T} \nabla log \frac{ρ}{π}) d x \\ = & - \int \{(\nabla \frac{1}{ρ}, a a^{T} π \nabla \frac{ρ}{π}) + \frac{1}{ρ} \nabla \cdot (a a^{T} π \nabla \frac{ρ}{π})\} \nabla \cdot (ρ z z^{T} \nabla log \frac{ρ}{π}) d x \\ = & - \int (\nabla \frac{1}{ρ}, a a^{T} π \nabla \frac{ρ}{π}) L_{z}^{*} ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \\ = & \int \frac{1}{ρ^{2}} (\nabla ρ, a a^{T} π \nabla \frac{ρ}{π}) L_{z}^{*} ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \\ = & \int (\nabla log ρ, a a^{T} \nabla log \frac{ρ}{π}) L_{z}^{*} ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \\ = & \int (\nabla log \frac{ρ}{π}, a a^{T} \nabla log \frac{ρ}{π}) L_{z}^{*} ρ d x \\ + \int (\nabla log π, a a^{T} \nabla log \frac{ρ}{π}) L_{z}^{*} ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \\ = & - \int (\nabla (\nabla log \frac{ρ}{π}, a a^{T} \nabla log \frac{ρ}{π}), z z^{T} \nabla log \frac{ρ}{π}) ρ d x \\ - \int Γ_{1}^{z} ((\nabla log π, a a^{T} \nabla log \frac{ρ}{π}), log \frac{ρ}{π}) ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \\ = & - \int ((\nabla log \frac{ρ}{π}, \nabla (a a^{T}) \nabla log \frac{ρ}{π}), z z^{T} \nabla \frac{ρ}{π}) π d x \\ - \int 2 \nabla^{2} log \frac{ρ}{π} (a a^{T} \nabla log \frac{ρ}{π}, z z^{T} \nabla log \frac{ρ}{π}) ρ d x \\ - \int Γ_{1}^{z} ((\nabla log π, a a^{T} \nabla log \frac{ρ}{π}), log \frac{ρ}{π}) ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x \end{matrix}

\begin{matrix} = & \int \nabla \cdot (z z^{T} π (\nabla log \frac{ρ}{π}, \nabla (a a^{T}) \nabla log \frac{ρ}{π})) \frac{1}{π} ρ d x \\ & - \int 2 \nabla^{2} log \frac{ρ}{π} (a a^{T} \nabla log \frac{ρ}{π}, z z^{T} \nabla log \frac{ρ}{π}) ρ d x \\ & - \int Γ_{1}^{z} ((\nabla log π, a a^{T} \nabla log \frac{ρ}{π}), log \frac{ρ}{π}) ρ d x - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x . \end{matrix}

Notice the fact that

\begin{matrix} (a 2) = & - \int (\nabla (A, \nabla Φ), z z^{T} \nabla Φ) ρ d x \\ = & \int (\nabla (\nabla log π, a a^{T} \nabla Φ), z z^{T} \nabla Φ) ρ d x \\ = & \int Γ_{1}^{z} ((\nabla log π, a a^{T} \nabla Φ), Φ) ρ d x . \end{matrix}

Hence,

\begin{matrix} \int Γ_{1}^{z} (L Φ, Φ) ρ d x = & (a 1) + (a 2) \\ = & \int \nabla \cdot (z z^{T} π (\nabla log \frac{ρ}{π}, \nabla (a a^{T}) \nabla log \frac{ρ}{π})) \frac{1}{π} ρ d x \\ & - \int 2 \nabla^{2} log \frac{ρ}{π} (a a^{T} \nabla log \frac{ρ}{π}, z z^{T} \nabla log \frac{ρ}{π}) ρ d x \\ & - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x . \end{matrix}

Similarly, by switching a and z, we have

\begin{matrix} \int Γ_{1} (L_{z} Φ, Φ) ρ d x = & \int \nabla \cdot (a a^{T} π (\nabla log \frac{ρ}{π}, \nabla (z z^{T}) \nabla log \frac{ρ}{π})) \frac{1}{π} ρ d x \\ & - \int 2 \nabla^{2} log \frac{ρ}{π} (a a^{T} \nabla log \frac{ρ}{π}, z z^{T} \nabla log \frac{ρ}{π}) ρ d x \\ & - \int \frac{1}{ρ} L^{*} ρ L_{z}^{*} ρ d x . \end{matrix}

Combining the above derivation, we finish the proof. □

Remark 11.

From the proof, we can show the following identity: denote

Φ = δ D

, then

\begin{matrix} \int (Γ_{1}^{z} (L Φ, Φ) - Γ_{1} (L_{z} Φ, Φ)) ρ d x \\ = & \int (Γ_{1} (Γ_{1}^{z} (Φ, Φ), Φ) - Γ_{1}^{z} (Γ_{1} (Φ, Φ), Φ)) ρ d x . \end{matrix}

Therefore, it is clear that, if the commutative assumption

Γ_{1} (Γ_{1}^{z} (Φ, Φ), Φ) = Γ_{1}^{z} (Γ_{1} (Φ, Φ), Φ)

holds, the above quantity equals zero. In this case,

\int Γ_{2}^{z, π} (Φ, Φ) ρ d x = \int Γ_{2}^{z} (Φ, Φ) ρ d x .

This means that, under the commutative assumption, the generalized Gamma z calculus agrees with the classical one [2] in the weak sense.

With the generalized Gamma z calculus, we are ready to prove the convergence properties and functional inequalities for degenerate drift–diffusion processes.

Proposition 14.

Suppose

Γ_{2} + Γ_{2}^{z, π} ⪰ κ (Γ_{1} + Γ_{1}^{z})

with

κ > 0

. Denote

ρ_{t}

as the solution of the sub-Riemannian gradient flow (33), then

\frac{d}{d t} (I_{a} (ρ_{t}) + I_{z} (ρ_{t})) \leq - 2 κ (I_{a} (ρ_{t}) + I_{z} (ρ_{t})) .

In addition, the z-log-Sobolev inequality holds:

\int_{R^{n + m}} ρ log \frac{ρ}{π} d x \leq \frac{1}{2 κ} I_{a, z} (ρ),

for any smooth density function ρ.

Finally,

\int_{R^{n + m}} | ρ (t, x) - π (x) | d x \leq \sqrt{2 D (ρ_{0})} e^{- κ t} .

Proof.

Here, the proof is very similar to the one in the previous section. Again, consider the sub-Riemannian gradient flow in the SDM.

\partial_{t} ρ_{t} = - {grad}_{W_{a}} D (ρ_{t}) .

We know that the log-Sobolev inequality relates to the ratio of

\frac{d}{d t} D (ρ_{t})

and

\frac{d^{2}}{d t^{2}} D (ρ_{t})

. If we cannot find a rate

κ > 0

such that

\frac{d}{d t} I_{a} (ρ_{t}) \leq - 2 κ I_{a} (ρ_{t}),

we construct another Lyapunov functional:

I_{a, z} (ρ) = I_{a} (ρ) + I_{z} (ρ) .

Thus, along the SDM gradient flow (33), we have

\frac{d}{d t} I_{a, z} (ρ_{t}) = - 2 \int (Γ_{2} (δ D, δ D) + Γ_{2}^{z, π} (δ D, δ D)) ρ_{t} d x .

If

Γ_{2} + Γ_{2}^{z, π} ⪰ κ (Γ_{1} + Γ_{1}^{z})

, then

\frac{d}{d t} I_{a, z} (ρ_{t}) \leq - 2 κ I_{a, z} (ρ_{t}) .

(36)

The convergence result follows directly from Gronwall’s inequality.
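The Gronwall step can be illustrated on a scalar toy problem. In the sketch below, y(t) plays the role of I_{a,z}(ρ_t); the model y' = -2κy - y², the constants, and the step size are assumptions made only for illustration. Since y' ≤ -2κy, the exponential bound y(t) ≤ y(0) e^{-2κt} should hold at every step.

```python
import math

def decay_with_bound(kappa=0.7, y0=2.0, dt=1e-3, steps=5000):
    """Euler steps for y' = -2*kappa*y - y**2, paired with the Gronwall bound y0*exp(-2*kappa*t)."""
    y, t = y0, 0.0
    pairs = []
    for _ in range(steps):
        y += dt * (-2.0 * kappa * y - y * y)   # dissipation at least 2*kappa*y
        t += dt
        pairs.append((y, y0 * math.exp(-2.0 * kappa * t)))
    return pairs

pairs = decay_with_bound()
print(pairs[-1])
```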

We next prove the z-log-Sobolev inequality. Since

- \frac{d}{d t} D (ρ_{t}) = I_{a} (ρ_{t}) \leq I_{a, z} (ρ_{t}),

then (36) implies the fact that, denoting

ρ_{0} = ρ

, then

\begin{matrix} - I_{a, z} (ρ) = & \int_{0}^{\infty} \frac{d}{d t} I_{a, z} (ρ_{t}) d t \\ \leq & - 2 κ \int_{0}^{\infty} I_{a, z} (ρ_{t}) d t = - 2 κ \int_{0}^{\infty} (I_{a} (ρ_{t}) + I_{z} (ρ_{t})) d t \\ \leq & - 2 κ \int_{0}^{\infty} I_{a} (ρ_{t}) d t \\ = & - 2 κ \int_{0}^{\infty} (- \frac{d}{d t} D (ρ_{t})) d t \\ = & - 2 κ D (ρ) . \end{matrix}

Thus,

I_{a, z} (ρ) \geq 2 κ D (ρ)

. Hence, we prove all the results by the fact that

R ⪰ κ (Γ_{1} + Γ_{1}^{z})

implies

Γ_{2} + Γ_{2}^{z, π} ⪰ κ (Γ_{1} + Γ_{1}^{z})

. In other words, the generalized Gamma z calculus implies the z-log-Sobolev inequality (zLSI):

R ⪰ κ (Γ_{1} + Γ_{1}^{z}) \Rightarrow \frac{d}{d t} I_{a, z} (ρ_{t}) \leq - 2 κ I_{a, z} (ρ_{t}) \Rightarrow zLSI .

We last prove the exponential convergence in the

L^{1}

distance. Notice that

D_{KL} (ρ_{t} ∥ π) \leq \frac{1}{2 κ} I_{a, z} (ρ_{t} ∥ π) \leq \frac{1}{2 κ} e^{- 2 κ t} I_{a, z} (ρ_{0} ∥ π) .

We then apply Pinsker's inequality between the KL divergence and the

L^{1}

distance, namely

\int_{R^{n + m}} | ρ (t, x) - π (x) | d x \leq \sqrt{2 D_{KL} (ρ_{t} ∥ π)} .

This finishes the proof. □
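The inequality between the KL divergence and the L^1 distance used in the last step is Pinsker's inequality. A minimal sketch checks its discrete form, sum |p_i - q_i| ≤ sqrt(2 KL(p‖q)), on randomly generated probability vectors. The sampling scheme, the seed, and the function names are hypothetical and chosen only for this illustration.

```python
import math
import random

def kl_divergence(p, q):
    """Discrete KL divergence sum p*log(p/q) for strictly positive p, q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def l1_distance(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def random_distribution(k, rng):
    weights = [rng.random() + 1e-3 for _ in range(k)]   # keep entries away from zero
    total = sum(weights)
    return [w / total for w in weights]

def pinsker_gap(trials=1000, k=5, seed=0):
    """Smallest value of sqrt(2*KL) - L1 over random pairs; Pinsker says it is nonnegative."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        p, q = random_distribution(k, rng), random_distribution(k, rng)
        gaps.append(math.sqrt(2.0 * kl_divergence(p, q)) - l1_distance(p, q))
    return min(gaps)

print(pinsker_gap())
```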

Remark 12.

It is worth mentioning that our derivation of the Gamma z calculus is not a direct Hessian operator of the entropy in the SDM. In fact, it combines both the second-order calculus in the SDM and the property of the

L^{2}

Hessian operator of the entropy. See similar relations in the mean-field Bakry–Émery calculus [50].

5. Generalized Gamma z Calculus

In this section, we introduce the generalized Gamma z calculus. For any smooth functions

f, g : R^{n + m} \to R

, the diffusion operator associated with SDE (2) is denoted as

\begin{matrix} L f = Δ_{a} f - A \nabla f + b \nabla f, \end{matrix}

where we denote

A = a \otimes \nabla a

and

\begin{matrix} Δ_{a} f = \nabla \cdot (a a^{T} \nabla f) . \end{matrix}

When

b = 0

, we denote the diffusion operator as

\begin{matrix} \tilde{L} f & = & Δ_{a} f - a \otimes \nabla a \nabla f . \end{matrix}

We first define the carré du champ operator

Γ_{1}

associated with the above second-order diffusion operators. It is easy to check that

Δ_{a}

,

\tilde{L}

, and L share the same

Γ_{1}

:

\begin{matrix} Γ_{1} (f, g) = {〈 a^{T} \nabla f, a^{T} \nabla g 〉}_{R^{n}} . \end{matrix}

(37)

Similarly, we introduce the

Γ_{1}^{z}

operator in the direction of

z = (z_{1}, \dots, z_{m})

below:

\begin{matrix} Γ_{1}^{z} (f, g) = {〈 z^{T} \nabla f, z^{T} \nabla g 〉}_{R^{m}} . \end{matrix}

(38)
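As a concrete instance of (37) and (38), consider the Heisenberg group with n = 2 and m = 1. The sketch below assumes one common convention for the horizontal vector fields, X = ∂_x - (y/2)∂_z and Y = ∂_y + (x/2)∂_z, with the extra direction Z = ∂_z; this convention and the test function f = xy + z² are assumptions for illustration and may differ from the normalization used elsewhere in the paper. It evaluates Γ_1 and Γ_1^z at a point and checks that a a^T is degenerate while a a^T + z z^T is not.

```python
def a_matrix(x, y, z):
    # (n+m) x n matrix whose columns are X = d_x - (y/2) d_z and Y = d_y + (x/2) d_z
    return [[1.0, 0.0], [0.0, 1.0], [-y / 2, x / 2]]

def z_matrix(x, y, z):
    # (n+m) x m matrix whose single column is Z = d_z
    return [[0.0], [0.0], [1.0]]

def gamma1(m, grad_f, grad_g):
    """<m^T grad f, m^T grad g> for a 3 x k matrix m."""
    cols = len(m[0])
    tf = [sum(m[r][c] * grad_f[r] for r in range(3)) for c in range(cols)]
    tg = [sum(m[r][c] * grad_g[r] for r in range(3)) for c in range(cols)]
    return sum(u * v for u, v in zip(tf, tg))

def gram(m):
    """m m^T as a 3 x 3 matrix."""
    cols = len(m[0])
    return [[sum(m[i][c] * m[j][c] for c in range(cols)) for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def grad_f(x, y, z):
    # gradient of the illustrative function f = x*y + z**2
    return [y, x, 2 * z]

point = (1.0, 2.0, 3.0)
a, zm, g = a_matrix(*point), z_matrix(*point), grad_f(*point)
aat, zzt = gram(a), gram(zm)
total = [[aat[i][j] + zzt[i][j] for j in range(3)] for i in range(3)]
print(gamma1(a, g, g), gamma1(zm, g, g), det3(aat), det3(total))
```

In this example Γ_1 alone cannot see the z-direction of ∇f, which is the degeneracy that the extra term Γ_1^z is designed to compensate.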

Next, we define the iterative

Γ_{2}

and

Γ_{2}^{z}

for operator L (

\tilde{L}

, respectively) below:

\begin{matrix} Γ_{2, L} (f, f) = \frac{1}{2} L Γ_{1} (f, f) - Γ_{1} (L f, f) . \end{matrix}

(39)

\begin{matrix} Γ_{2, L}^{z} (f, f) = \frac{1}{2} L Γ_{1}^{z} (f, f) - Γ_{1}^{z} (L f, f) . \end{matrix}

(40)
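The iterated definition of Γ_2 can be verified by hand in the simplest classical case. The sketch below assumes the one-dimensional Ornstein–Uhlenbeck operator L f = f'' - x f' (so a = 1 and π is the standard Gaussian), for which the Bakry–Émery identity Γ_2(f, f) = (f'')² + (f')² is well known. Polynomials are stored as coefficient lists, and the helper names are hypothetical. This is the non-degenerate setting and serves only to illustrate the definition, not the generalized Gamma z calculus.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])

def scale(c, p):
    return trim([c * coeff for coeff in p])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return trim(out)

def deriv(p):
    return trim([i * p[i] for i in range(1, len(p))] or [0])

def L(p):
    # Ornstein-Uhlenbeck generator: L f = f'' - x f'
    return add(deriv(deriv(p)), scale(-1, mul([0, 1], deriv(p))))

def gamma1(p, q):
    # carre du champ with a = 1: f' g'
    return mul(deriv(p), deriv(q))

def gamma2(p):
    # Gamma_2(f, f) = (1/2) L Gamma_1(f, f) - Gamma_1(L f, f)
    return add(scale(Fraction(1, 2), L(gamma1(p, p))), scale(-1, gamma1(L(p), p)))

f = [0, 2, 0, 1]                                   # f(x) = x^3 + 2x
second = deriv(deriv(f))
expected = add(mul(second, second), gamma1(f, f))  # (f'')^2 + (f')^2
print(gamma2(f) == expected)
```

Here Γ_2 ≥ Γ_1 pointwise, which is the curvature-type bound with κ = 1 in this toy case.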

Definition 4.

We define the generalized Gamma z for operator L below:

\begin{matrix} Γ_{2, L}^{z, π} (f, f) & = & Γ_{2, L}^{z} (f, f) + {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) . \end{matrix}

(41)

For matrices

a \in R^{(n + m) \times n}

and

z \in R^{(n + m) \times m}

, we denote the divergence operator as

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) & = & \frac{\nabla \cdot (z z^{T} π Γ_{\nabla (a a^{T})} (f, f))}{π}, \\ {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) & = & \frac{\nabla \cdot (a a^{T} π Γ_{\nabla (z z^{T})} (f, f))}{π}, \end{matrix}

and

\begin{matrix} Γ_{\nabla (a a^{T})} (f, f) = 〈 \nabla f, \nabla (a a^{T}) \nabla f 〉, \quad and \quad Γ_{\nabla (z z^{T})} (f, f) = 〈 \nabla f, \nabla (z z^{T}) \nabla f 〉 . \end{matrix}

Here, we denote π as the invariant distribution associated with the operator L.

Remark 13.

In particular, we have the following local coordinates representation.

\begin{matrix} 〈 \nabla f, \nabla (a a^{T}) \nabla f 〉 & = & {〈 \nabla f, \frac{\partial}{\partial x_{\hat{k}}} (a a^{T}) \nabla f 〉}_{\hat{k} = 1}^{n + m} = 2 {〈 a^{T} \nabla f, \frac{\partial}{\partial x_{\hat{k}}} a^{T} \nabla f 〉}_{\hat{k} = 1}^{n + m} \\ & = & {(2 \sum_{i = 1}^{n} \sum_{\hat{i}, i^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}})}_{\hat{k} = 1}^{n + m}, \\ 〈 \nabla f, \nabla (z z^{T}) \nabla f 〉 & = & {(2 \sum_{j = 1}^{m} \sum_{\hat{j}, j^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}})}_{\hat{k} = 1}^{n + m} . \end{matrix}

(42)

We first present the following key lemmas.

Lemma 10.

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) & = & R^{π} (f, f) + 2 G^{T} X, \end{matrix}

where X, G are defined in Notation 1 and

R^{π}

is defined in Definition 2.

Lemma 11.

\begin{matrix} Γ_{2, \tilde{L}} (f, f) = X^{T} Q^{T} Q X + 2 D^{T} Q X + 2 C^{T} X + D^{T} D + R_{a} (\nabla f, \nabla f), \end{matrix}

where

Q, X, C, D

are introduced in Notation 1 and

R_{a}

is defined in Definition 2.

Lemma 12.

\begin{matrix} Γ_{2, \tilde{L}}^{z} (f, f) = X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E + R_{z} (\nabla f, \nabla f), \end{matrix}

where

P, X, F, E

are introduced in Notation 1 and

R_{z}

is defined in Definition 2.

We then have the following main theorem. In order to distinguish the operators L and

\tilde{L}

, we rewrite Theorem 1 as below, and with some abuse of notation, we denote

Γ_{2} (f, f) = Γ_{2, L} (f, f)

and

Γ_{2}^{z, π} (f, f) = Γ_{2, L}^{z, π} (f, f)

.

Theorem 3

(z-Bochner’s formula). For smooth function

f : R^{n + m} \to R

, assume that Assumption 1 holds, then

\begin{matrix} Γ_{2, L} (f, f) + Γ_{2, L}^{z, π} (f, f) & = & {∥ {Hess}_{a, z} f ∥}^{2} + R (\nabla f, \nabla f), \end{matrix}

where

\begin{matrix} {∥ {Hess}_{a, z} f ∥}^{2} & = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}], \\ R (\nabla f, \nabla f) & = & - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E \\ & & + R_{a b} (\nabla f, \nabla f) + R_{z b} (\nabla f, \nabla f) + R_{π} (\nabla f, \nabla f) . \end{matrix}

All the terms are defined in Notation 1 and Definition 2.

Proof.

By Definition 4 and Formulae (39) and (40), we have

\begin{matrix} Γ_{2, L} (f, f) + Γ_{2, L}^{z, π} (f, f) \\ = & Γ_{2, L} (f, f) + Γ_{2, L}^{z} (f, f) + {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) . \end{matrix}

We compute the above terms explicitly in the following four steps.

Step 1:

\begin{matrix} Γ_{2, L} (f, f) & = & \frac{1}{2} (L Γ_{1, L} (f, f) - 2 Γ_{1, L} (L f, f)) \\ = & \frac{1}{2} Δ_{a} Γ_{1} (f, f) - \frac{1}{2} A \nabla Γ_{1} (f, f) + \frac{1}{2} b \nabla Γ_{1} (f, f) \\ - Γ_{1} ((Δ_{a} - A \nabla + b \nabla) f, f) \\ = & Γ_{2, \tilde{L}} (f, f) + [\frac{1}{2} b \nabla Γ_{1} (f, f) - Γ_{1} (b \nabla f, f)] . \end{matrix}

The term

Γ_{2, \tilde{L}} (f, f)

follows from Lemma 11. We are left with the other two terms:

\begin{matrix} \frac{1}{2} b \nabla Γ_{1} (f, f) & = & \frac{1}{2} \sum_{\hat{k} = 1}^{n + m} b_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} ({〈 a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}}) \\ = & \sum_{\hat{k}, \hat{i} = 1}^{n + m} \sum_{i = 1}^{n} (b_{\hat{k}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} + b_{\hat{k}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{k}} \partial x_{\hat{i}}}) {(a^{T} \nabla f)}_{i}, \end{matrix}

and

\begin{matrix} - Γ_{1} (b \nabla f, f) & = & - {〈 a^{T} \nabla (b \nabla f), a^{T} \nabla f 〉}_{R^{n}} \\ = & - \sum_{i = 1}^{n} \sum_{\hat{k}, \hat{i} = 1}^{n + m} (a_{i \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} + a_{i \hat{i}}^{T} b_{\hat{k}} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}) {(a^{T} \nabla f)}_{i} . \end{matrix}

Step 2:

\begin{matrix} Γ_{2, L}^{z} (f, f) & = & \frac{1}{2} (L Γ_{1}^{z} (f, f) - 2 Γ_{1}^{z} (L f, f)) \\ = & \frac{1}{2} Δ_{a} Γ_{1}^{z} (f, f) - \frac{1}{2} A \nabla Γ_{1}^{z} (f, f) + \frac{1}{2} b \nabla Γ_{1}^{z} (f, f) \\ - Γ_{1}^{z} ((Δ_{a} - A \nabla + b \nabla) f, f) \\ = & Γ_{2, \tilde{L}}^{z} (f, f) + [\frac{1}{2} b \nabla Γ_{1}^{z} (f, f) - Γ_{1}^{z} (b \nabla f, f)] . \end{matrix}

The term

Γ_{2, \tilde{L}}^{z} (f, f)

follows from Lemma 12. We are left to compute the last two terms:

\begin{matrix} \frac{1}{2} b \nabla Γ_{1}^{z} (f, f) & = & \frac{1}{2} \sum_{\hat{k} = 1}^{n + m} b_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} ({〈 z^{T} \nabla f, z^{T} \nabla f 〉}_{R^{m}}) \\ & = & \sum_{\hat{k}, \hat{i} = 1}^{n + m} \sum_{i = 1}^{m} (b_{\hat{k}} \frac{\partial z_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} + b_{\hat{k}} z_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{k}} \partial x_{\hat{i}}}) {(z^{T} \nabla f)}_{i}, \end{matrix}

and

\begin{matrix} - Γ_{1}^{z} (b \nabla f, f) & = & - {〈 z^{T} \nabla (b \nabla f), z^{T} \nabla f 〉}_{R^{m}} \\ & = & - \sum_{i = 1}^{m} \sum_{\hat{k}, \hat{i} = 1}^{n + m} (z_{i \hat{i}}^{T} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} + z_{i \hat{i}}^{T} b_{\hat{k}} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}) {(z^{T} \nabla f)}_{i} . \end{matrix}

Step 3: Following Lemma 10, which will be proven shortly in the next section, we have

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) & = & R^{π} (f, f) + 2 G^{T} X, \end{matrix}

where X, G are defined in Notation 1 and

R^{π}

is defined in Definition 2.

Step 4: Combining the above terms

Γ_{2, \tilde{L}} (f, f)

in Lemma 11,

Γ_{2, \tilde{L}}^{z} (f, f)

in Lemma 12, and

R^{π} (f, f) + 2 G^{T} X

, we have

\begin{matrix} Γ_{2, \tilde{L}} (f, f) + Γ_{2, \tilde{L}}^{z} (f, f) + R^{π} (f, f) + 2 G^{T} X \\ = & X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E \\ + X^{T} Q^{T} Q X + 2 D^{T} Q X + 2 C^{T} X + D^{T} D + 2 G^{T} X \\ + R_{a} (\nabla f, \nabla f) + R_{z} (\nabla f, \nabla f) + R_{π} (\nabla f, \nabla f) \\ = & X^{T} [P^{T} P + Q^{T} Q] X + 2 [G^{T} + F^{T} + C^{T}] X + 2 [E^{T} P + D^{T} Q] X + D^{T} D + E^{T} E \\ + R_{a} (\nabla f, \nabla f) + R_{z} (\nabla f, \nabla f) + R_{π} (\nabla f, \nabla f) . \end{matrix}

Assuming that Assumption 1 is satisfied, we obtain

\begin{matrix} Γ_{2, \tilde{L}} (f, f) + Γ_{2, \tilde{L}}^{z} (f, f) + R^{π} (f, f) + 2 G^{T} X \\ = & {[X + Λ_{1}]}^{T} Q^{T} Q [X + Λ_{1}] + {[X + Λ_{2}]}^{T} P^{T} P [X + Λ_{2}] \\ + R_{a} (\nabla f, \nabla f) + R_{z} (\nabla f, \nabla f) + R^{π} (\nabla f, \nabla f) \\ - Λ_{1}^{T} Q^{T} Q Λ_{1} - Λ_{2}^{T} P^{T} P Λ_{2} + D^{T} D + E^{T} E . \end{matrix}

Adding the drift terms from Step 1 and Step 2, we obtain

R_{a b}

and

R_{z b}

, which finishes the proof. □
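The passage from Step 4 to the final formula is a completion of the square. A minimal sketch of that algebra follows, under the illustrative assumption that the linear coefficient v can be written as Q^T Q Λ for some vector Λ. This stands in for the role played by Assumption 1 and is not its exact statement. Then X^T Q^T Q X + 2 v^T X = [X + Λ]^T Q^T Q [X + Λ] - Λ^T Q^T Q Λ, even when Q is singular.

```python
import random

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram(q):
    """Q^T Q for a matrix q given as nested lists."""
    n = len(q[0])
    return [[sum(q[k][i] * q[k][j] for k in range(len(q))) for j in range(n)] for i in range(n)]

def square_gap(q, lam, x):
    """Difference between the expanded quadratic form and its completed square."""
    m = gram(q)
    v = mat_vec(m, lam)                      # linear coefficient, assumed to equal Q^T Q Lambda
    lhs = dot(x, mat_vec(m, x)) + 2.0 * dot(v, x)
    shifted = [xi + li for xi, li in zip(x, lam)]
    rhs = dot(shifted, mat_vec(m, shifted)) - dot(lam, mat_vec(m, lam))
    return lhs - rhs

rng = random.Random(0)
q_singular = [[1.0, 2.0], [2.0, 4.0]]        # rank one, so Q^T Q is degenerate
lam = [0.5, -1.0]
x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
print(square_gap(q_singular, lam, x))
```

The identity does not require Q^T Q to be invertible, only that the linear term lies in its range, which is why a solvability condition on Λ is the natural hypothesis.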

5.1. Proof of Lemma 10

Lemma 13.

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) & = & R^{π} (f, f) + 2 G^{T} X, \end{matrix}

(43)

where X, G are defined in Notation 1 and

R^{π}

is defined in Definition 2.

Proof.

For the first term in the above lemma, we have

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) & = & \frac{\nabla \cdot (z z^{T} π Γ_{\nabla (a a^{T})} (f, f))}{π} \\ & = & \sum_{k^{'} = 1}^{n + m} \frac{1}{π} \frac{\partial}{\partial x_{k^{'}}} [\sum_{k = 1}^{m} z_{k^{'} k} (π \sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}})] \\ & = & \sum_{k^{'} = 1}^{n + m} \sum_{k = 1}^{m} [\frac{\partial}{\partial x_{k^{'}}} z_{k^{'} k} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}}) \\ & & + z_{k^{'} k} \frac{\partial}{\partial x_{k^{'}}} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}})] \\ & & + \sum_{k^{'} = 1}^{n + m} \sum_{k = 1}^{m} \frac{\partial}{\partial x_{k^{'}}} log π [z_{k^{'} k} \sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}}] \\ & = & \sum_{k^{'} = 1}^{n + m} \sum_{k = 1}^{m} [\frac{\partial}{\partial x_{k^{'}}} z_{k k^{'}}^{T} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}}) \\ & & + z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}})] \\ & & + \sum_{k = 1}^{m} {(z^{T} \nabla log π)}_{k} [\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} {(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}}], \end{matrix}

where

{(Γ_{\nabla (a a^{T})} (f, f))}_{\hat{k}}

is defined in (42). Plugging in (42), we further obtain

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) \\ = & \sum_{k^{'} = 1}^{n + m} \sum_{k = 1}^{m} [\frac{\partial}{\partial x_{k^{'}}} z_{k k^{'}}^{T} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} (2 \sum_{i = 1}^{n} \sum_{\hat{i}, i^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}))] \\ + \sum_{k^{'} = 1}^{n + m} \sum_{k = 1}^{m} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} (\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} (2 \sum_{i = 1}^{n} \sum_{\hat{i}, i^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}))] \\ + \sum_{k = 1}^{m} {(z^{T} \nabla log π)}_{k} [\sum_{\hat{k} = 1}^{n + m} z_{k \hat{k}}^{T} (2 \sum_{i = 1}^{n} \sum_{\hat{i}, i^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}})] \\ = & 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [\frac{\partial}{\partial x_{k^{'}}} z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] \dots S_{1}^{z} \\ + 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} (z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}})] \dots S_{2}^{z} \\ + 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{\hat{k}, \hat{i}, i^{'} = 1}^{n + m} {(z^{T} \nabla log π)}_{k} [z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial 
x_{i^{'}}}] \dots S_{3}^{z} \\ = & S_{1}^{z} + S_{2}^{z} + S_{3}^{z} . \end{matrix}

(44)

By further expanding

S_{2}^{z}

, we obtain

\begin{matrix} S_{2}^{z} & = & 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} (z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}})] \\ = & 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [z_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2}}{\partial x_{k^{'}} \partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{k^{'}} \partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial^{2} f}{\partial x_{k^{'}} \partial x_{i^{'}}} \\ + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} \frac{\partial}{\partial x_{k^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}}] . \end{matrix}

Similarly, we obtain

\begin{matrix} {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) \\ = & \sum_{l^{'} = 1}^{n + m} \sum_{l = 1}^{n} [\frac{\partial}{\partial x_{l^{'}}} a_{l l^{'}}^{T} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} {(Γ_{\nabla (z z^{T})} (f, f))}_{\hat{l}}) \\ + a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} {(Γ_{\nabla (z z^{T})} (f, f))}_{\hat{l}})] \\ + \sum_{l = 1}^{n} {(a^{T} \nabla \log π)}_{l} [\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} {(Γ_{\nabla (z z^{T})} (f, f))}_{\hat{l}}] \\ = & 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [\frac{\partial}{\partial x_{l^{'}}} a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \dots S_{1}^{a} \\ + 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} (a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}})] \dots S_{2}^{a} \\ + 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{\hat{l}, \hat{j}, j^{'} = 1}^{n + m} {(a^{T} \nabla \log π)}_{l} [a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] \dots S_{3}^{a} \end{matrix}

(45)

\begin{matrix} = & S_{1}^{a} + S_{2}^{a} + S_{3}^{a}, \end{matrix}

(46)

where we also obtain

\begin{matrix} S_{2}^{a} & = & 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} (a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}})] \\ = & 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial^{2}}{\partial x_{l^{'}} \partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial^{2} f}{\partial x_{l^{'}} \partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial^{2} f}{\partial x_{l^{'}} \partial x_{j^{'}}} \\ + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} \frac{\partial}{\partial x_{l^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}}] . \end{matrix}

Combining all the terms above, we have

\begin{matrix} {div}_{z}^{π} (Γ_{\nabla (a a^{T})} f, f) - {div}_{a}^{π} (Γ_{\nabla (z z^{T})} f, f) = S_{1}^{z} + S_{2}^{z} + S_{3}^{z} - (S_{1}^{a} + S_{2}^{a} + S_{3}^{a}) . \end{matrix}

By direct computations, we separate the above terms into two groups based on “

\partial f \partial f

” and “

\partial^{2} f \partial f

”. We denote

R^{π} (f, f)

as the sum of all “

\partial f \partial f

” terms and denote

2 G^{T} X

as the sum of all “

\partial^{2} f \partial f

” terms. Switching indices for the terms in

2 G^{T} X

to match

\frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{j}}}

, we obtain the following:

\begin{matrix} 2 G^{T} X \\ = & 2 \sum_{k = 1}^{m} \sum_{i = 1}^{n} \sum_{k^{'}, \hat{k}, \hat{i}, i^{'} = 1}^{n + m} [z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{k^{'}} \partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} + z_{k k^{'}}^{T} z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial^{2} f}{\partial x_{k^{'}} \partial x_{i^{'}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{l = 1}^{n} \sum_{l^{'}, \hat{l}, \hat{j}, j^{'} = 1}^{n + m} [a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial^{2} f}{\partial x_{l^{'}} \partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} + a_{l l^{'}}^{T} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial^{2} f}{\partial x_{l^{'}} \partial x_{j^{'}}}] \\ = & 2 \sum_{j = 1}^{m} \sum_{i = 1}^{n} \sum_{j^{'}, \hat{j}, \hat{i}, i^{'} = 1}^{n + m} [z_{j j^{'}}^{T} z_{j \hat{j}}^{T} \frac{\partial}{\partial x_{\hat{j}}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{j^{'}} \partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} + z_{j j^{'}}^{T} z_{j \hat{j}}^{T} \frac{\partial}{\partial x_{\hat{j}}} a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial^{2} f}{\partial x_{j^{'}} \partial x_{i^{'}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{i = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{j}, j^{'} = 1}^{n + m} [a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}} z_{j \hat{j}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} + a_{i i^{'}}^{T} a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}} z_{j \hat{j}}^{T} \frac{\partial f}{\partial x_{\hat{j}}} z_{j 
j^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{j^{'}}}] \\ = & 2 \sum_{j = 1}^{m} \sum_{i = 1}^{n} \sum_{j^{'}, \hat{j}, \hat{i}, i^{'} = 1}^{n + m} [z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{j}} \partial x_{\hat{i}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} + z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} a_{i \hat{i}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{j}} \partial x_{\hat{i}}}] \\ - 2 \sum_{j = 1}^{m} \sum_{i = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{j}, j^{'} = 1}^{n + m} [a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j \hat{j}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{j}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} + a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} z_{j \hat{j}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{j}}}] \\ = & 2 \sum_{\hat{i}, \hat{j} = 1}^{n + m} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{j}}} [\sum_{i = 1}^{n} \sum_{j = 1}^{m} \sum_{j^{'}, \hat{j}, i^{'}, \hat{i} = 1}^{n + m} [(z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} + z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial}{\partial x_{j^{'}}} a_{i i^{'}}^{T} \frac{\partial f}{\partial x_{i^{'}}} a_{i \hat{i}}^{T}) \\ - (a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j \hat{j}}^{T} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} + a_{i \hat{i}}^{T} a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}} z_{j j^{'}}^{T} \frac{\partial f}{\partial x_{j^{'}}} z_{j \hat{j}}^{T})]] . \end{matrix}

The first equality follows from the quantities we obtained previously, the second equality from switching

“ k

” to

“ j

” and

“ l

” to

“ i

”, and the third equality from switching between

“ i^{'}

” and

“ \hat{i}

”,

“ j^{'}

” and

“ \hat{j}

”. Thus, the proof is completed. □

5.2. Proof of Lemma 11

From now on, we keep the following notation:

{(a^{T} \nabla f)}_{i} = \sum_{\hat{i} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}} f \quad \text{for } i = 1, \dots, n

. Furthermore, we fix the notation for

a, a^{T}

with relation

a_{\hat{i} i} = a_{i \hat{i}}^{T}

for

i = 1, \dots, n

and

\hat{i} = 1, \dots, n + m .

Here, we denote

a_{i \hat{i}}^{T} : = {(a^{T})}_{i \hat{i}} .

Recall that we define

\begin{matrix} Γ_{2, \tilde{L}} (f, f) = \frac{1}{2} (\tilde{L} Γ_{1} (f, f) - 2 Γ_{1} (\tilde{L} f, f)) . \end{matrix}
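As a sanity check on this definition, it may help to look at the non-degenerate Euclidean case. The setting below is an illustrative assumption and not one treated in this section: we take m = 0 and a equal to the identity matrix, so that A = a ⊗ ∇a vanishes, the operator \tilde{L} = Δ_a - A∇ is the standard Laplacian, and Γ_1(f, f) = |∇f|^2. Under these assumptions the definition reduces to the classical Bochner identity:

```latex
% Assumption (illustration only): m = 0 and a = I_n, hence A = 0 and \tilde{L} = \Delta.
\begin{aligned}
\Gamma_{2,\tilde{L}}(f,f)
  &= \tfrac{1}{2}\,\Delta |\nabla f|^{2}
     - \langle \nabla \Delta f, \nabla f \rangle_{\mathbb{R}^{n}} \\
  &= \sum_{i,j=1}^{n} \Big( \frac{\partial^{2} f}{\partial x_{i}\,\partial x_{j}} \Big)^{2}
     + \sum_{i,j=1}^{n} \frac{\partial f}{\partial x_{j}}\,
       \frac{\partial^{3} f}{\partial x_{i}^{2}\,\partial x_{j}}
     - \sum_{i,j=1}^{n} \frac{\partial f}{\partial x_{j}}\,
       \frac{\partial^{3} f}{\partial x_{j}\,\partial x_{i}^{2}} \\
  &= \sum_{i,j=1}^{n} \Big( \frac{\partial^{2} f}{\partial x_{i}\,\partial x_{j}} \Big)^{2} .
\end{aligned}
```

The third-order terms cancel because partial derivatives commute. In the degenerate setting of this section, the analogous cancellation is only partial, which is where the additional first-order and curvature-type terms come from.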

Next, we are ready to prove the following lemma.

Lemma 14.

\begin{matrix} Γ_{2, \tilde{L}} (f, f) = X^{T} Q^{T} Q X + 2 D^{T} Q X + 2 C^{T} X + D^{T} D + R_{a} (\nabla f, \nabla f), \end{matrix}

where

Q, X, C, D

are introduced in Notation 1 and

R_{a}

is defined in Definition 2.

Proof.

We plug in the operator

\tilde{L}

into our definition for

Γ_{2}

:

\begin{matrix} Γ_{2, \tilde{L}} (f, f) & = & \frac{1}{2} Δ_{a} Γ_{1} (f, f) - \frac{1}{2} A \nabla Γ_{1} (f, f) - Γ_{1} ((Δ_{a} - A \nabla) f, f) \\ = & \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) - \frac{1}{2} A \nabla Γ_{1} (f, f) + Γ_{1} (A \nabla f, f) . \end{matrix}

Now, we compute the last two terms of the above equation. With

A = a \otimes \nabla a

, we obtain

\begin{matrix} - \frac{1}{2} A \nabla Γ_{1} (f, f) & = & - \frac{1}{2} \sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} {〈 a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}} \\ = & - \sum_{\hat{k} = 1}^{n + m} 〈 A_{\hat{k}} (\nabla_{\frac{\partial}{\partial x_{\hat{k}}}} a^{T}) \nabla f, a^{T} \nabla f 〉_{R^{n}} - \sum_{\hat{k} = 1}^{n + m} 〈 A_{\hat{k}} a^{T} (\nabla_{\frac{\partial}{\partial x_{\hat{k}}}} \nabla f), a^{T} \nabla f 〉_{R^{n}} \\ = & J_{1} + J_{2}, \end{matrix}

and

\begin{matrix} Γ_{1} (A \nabla f, f) & = & 〈 a^{T} \nabla (\sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), a^{T} \nabla f 〉_{R^{n}} \\ = & {〈 a^{T} (\sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), a^{T} \nabla f 〉}_{R^{n}} + 〈 a^{T} (\sum_{\hat{k} = 1}^{n + m} \nabla A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), a^{T} \nabla f 〉_{R^{n}} \\ = & J_{3} + J_{4} . \end{matrix}

Because the second-order partial derivatives of f commute, the terms J_2 and J_3 cancel:

\begin{matrix} J_{2} + J_{3} = 0 . \end{matrix}

We now expand

J_{1}

and

J_{4}

into local coordinates:

\begin{matrix} J_{1} = - \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{l^{'}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \sum_{k^{'} = 1}^{n + m} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} a_{l l^{'}}^{T} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} f), \end{matrix}

(47)

and

\begin{matrix} J_{4} & = & \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} (\sum_{\hat{k} = 1}^{n + m} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} (\sum_{k = 1}^{n} \sum_{k^{'} = 1}^{n + m} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k}) \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f)) \\ = & \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{k = 1}^{n} \sum_{l^{'} = 1}^{n + m} \sum_{\hat{k}, k^{'} = 1}^{n + m} a_{l l^{'}}^{T} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f) \\ + \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{k = 1}^{n} \sum_{l^{'} = 1}^{n + m} \sum_{\hat{k}, k^{'} = 1}^{n + m} a_{l l^{'}}^{T} a_{\hat{k} k} (\nabla_{\frac{\partial}{\partial x_{l^{'}}}} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k}) \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f) . \end{matrix}

(48)

Applying Lemma 15, which will be proven shortly below, we have

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) - {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \\ + \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) \\ - {〈 B_{n \times n} a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}}, \end{matrix}

where

\begin{matrix} {〈 B_{n \times n} a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}} = \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{k = 1}^{n} \sum_{i = 1}^{n + m} \sum_{k^{'}, j^{'} = 1}^{n + m} a_{l j^{'}}^{T} \frac{\partial^{2}}{\partial x_{i} \partial x_{j^{'}}} a_{i k} (a_{k k^{'}}^{T} \frac{\partial}{\partial x_{k^{'}}} f)) . \end{matrix}

Thus, combining with (47) and (48), we have

\begin{matrix} Γ_{2, \tilde{L}} (f, f) & = & \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) + J_{1} + J_{4} \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) - {〈 a^{T} \nabla [(a^{T} \nabla) \circ (a^{T} \nabla f)], a^{T} \nabla f 〉}_{R^{n}} . \end{matrix}

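Here, J_1 and J_4 cancel the two remainder terms produced by Lemma 15 (the sum involving products of first derivatives of a, and the B_{n × n} term), and the remaining expression is expanded in Lemma 16 below. The proof is thus completed. □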

Lemma 15.

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) - {〈 a^{T} \nabla ([(a^{T} \nabla) \circ (a^{T} \nabla f)]), a^{T} \nabla f 〉}_{R^{n}} \\ - {〈 B_{n \times n} a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}} + B_{0} . \end{matrix}

(49)

Here, the local representations for

B_{n \times n}

and

B_{0}

are given as follows. For

l, k = 1, \dots, n

, we denote

\begin{matrix} B_{l k} & = & \sum_{j^{'} = 1}^{n + m} a_{l j^{'}}^{T} \sum_{i = 1}^{n + m} \frac{\partial^{2}}{\partial x_{i} \partial x_{j^{'}}} a_{i k} = \sum_{j^{'} = 1}^{n + m} a_{l j^{'}}^{T} \sum_{i = 1}^{n + m} \frac{\partial^{2}}{\partial x_{i} \partial x_{j^{'}}} a_{k i}^{T}, \\ B_{0} & = & \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) \\ - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) . \end{matrix}

(50)

We introduce the following notation (convention) that, for any function F,

\begin{matrix} (a^{T} \nabla) \circ (a^{T} \nabla F) = \sum_{i = 1}^{n} {(a^{T} \nabla)}_{i} {(a^{T} \nabla F)}_{i} = \sum_{i = 1}^{n} \sum_{\hat{i}, i^{'} = 1}^{n + m} (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) (a_{i i^{'}}^{T} \frac{\partial F}{\partial x_{i^{'}}}) . \end{matrix}

(51)
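To make convention (51) concrete, here is a small worked instance. The frame below is an assumption chosen for illustration (a Heisenberg-type frame on R^3 with n = 2 and m = 1); its normalization may differ from the one used in the examples elsewhere in the paper.

```latex
% Assumption (illustrative): coordinates (x, y, z) on R^3, n = 2, m = 1.
a^{T} = \begin{pmatrix} 1 & 0 & -y/2 \\ 0 & 1 & x/2 \end{pmatrix},
\qquad
(a^{T}\nabla)_{1} = \partial_{x} - \tfrac{y}{2}\,\partial_{z},
\qquad
(a^{T}\nabla)_{2} = \partial_{y} + \tfrac{x}{2}\,\partial_{z}.

% Convention (51) applied to a smooth function F (subscripts denote partial derivatives):
\begin{aligned}
(a^{T}\nabla)\circ(a^{T}\nabla F)
  &= \big(\partial_{x} - \tfrac{y}{2}\partial_{z}\big)^{2} F
   + \big(\partial_{y} + \tfrac{x}{2}\partial_{z}\big)^{2} F \\
  &= F_{xx} + F_{yy} - y\,F_{xz} + x\,F_{yz} + \frac{x^{2}+y^{2}}{4}\,F_{zz}.
\end{aligned}
```

For this particular frame, each column of a is divergence-free, that is, \sum_{\hat{i}} \partial a_{\hat{i} k} / \partial x_{\hat{i}} = 0 for k = 1, 2, so the first-order terms built from these divergences vanish.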

Proof of Lemma 15.

By our definition above, we have

\begin{matrix} Δ_{a} Γ_{1} (f, f) & = & \nabla \cdot (a a^{T} \nabla {〈 a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}}) \\ = & \nabla \cdot (a F) \\ = & \sum_{\hat{i} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{i}}} (\sum_{k = 1}^{n} a_{\hat{i} k} F_{k}) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} F_{k} + a_{\hat{i} k} \frac{\partial}{\partial x_{\hat{i}}} F_{k}) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} F_{k}) + a^{T} \nabla \circ (a^{T} \nabla {(a^{T} \nabla f)}^{2}), \end{matrix}

where we denote

\begin{matrix} F & = & a^{T} \nabla {〈 a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}} \\ = & a^{T} \nabla \sum_{l = 1}^{n} {(\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2} \\ = & {(\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{n} {(\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})}_{k = 1, \dots, n} = {(F_{1}, F_{2}, \dots, F_{n})}^{T} . \end{matrix}

Therefore, we have

\begin{matrix} Δ_{a} Γ_{1} (f, f) & = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{n} {(\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})) \\ + a^{T} \nabla \circ (a^{T} \nabla {(a^{T} \nabla f)}^{2}) \\ = & \sum_{k = 1}^{n} \sum_{\hat{i} = 1}^{n + m} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} {(a^{T} \nabla f)}^{2})) \\ + (a^{T} \nabla) \circ (a^{T} \nabla {(a^{T} \nabla f)}^{2}) \\ = & \nabla a \circ (a^{T} \nabla {(a^{T} \nabla f)}^{2}) + (a^{T} \nabla) \circ (a^{T} \nabla {(a^{T} \nabla f)}^{2}) . \end{matrix}

(52)

Next, we compute the following quantity.

\begin{matrix} Γ_{1} (Δ_{a} f, f) & = & {〈 a^{T} \nabla (\nabla \cdot (a a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}}, \end{matrix}

where we have

\begin{matrix} \nabla \cdot (a a^{T} \nabla f) = \nabla \cdot (\sum_{k = 1}^{n} \sum_{\hat{k} = 1}^{n + m} a_{\hat{i} k} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) = \sum_{\hat{i} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{i}}} (\sum_{k = 1}^{n} \sum_{\hat{k} = 1}^{n + m} a_{\hat{i} k} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) \end{matrix}

\begin{matrix} = & \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) + \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} a_{\hat{i} k} \frac{\partial}{\partial x_{\hat{i}}} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) \\ = & \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) + (a^{T} \nabla) \circ (a^{T} \nabla f) \\ = & \nabla a \circ (a^{T} \nabla f) + (a^{T} \nabla) \circ (a^{T} \nabla f) . \end{matrix}
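The splitting of the divergence into a first-order part and an iterated-derivative part is easiest to see in one dimension. The following sketch assumes n = 1, m = 0 and a smooth scalar coefficient a = a(x); this scalar setting is an illustrative assumption, not a case singled out in the paper.

```latex
% Assumption (illustration only): n = 1, m = 0, a = a(x) smooth; primes denote d/dx.
\begin{aligned}
\nabla\cdot(a a^{T}\nabla f) &= (a^{2} f')' = a'\,(a f') + a\,(a f')', \\
\nabla a \circ (a^{T}\nabla f) &= a'\,(a f'), \qquad
(a^{T}\nabla)\circ(a^{T}\nabla f) = a\,(a f')' .
\end{aligned}
```

The first term carries the derivative of the coefficient, and the second is the vector field a d/dx applied twice, matching the two summands in the last line of the display above.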

We continue with our computation as below:

\begin{matrix} Γ_{1} (Δ_{a} f, f) \\ = & {〈 a^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) + (a^{T} \nabla) \circ (a^{T} \nabla f)], a^{T} \nabla f 〉}_{R^{n}} \\ = & {〈 a^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)], a^{T} \nabla f 〉}_{R^{n}} \\ + {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \\ = & {〈 a^{T} \nabla (\nabla a \cdot (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} + {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \\ = & {〈 (a^{T} \nabla \nabla a \cdot (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} + {〈 (\nabla a \cdot (a^{T} \nabla (a^{T} \nabla f))), a^{T} \nabla f 〉}_{R^{n}} \\ + {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} . \end{matrix}

(53)

From the above, combining (52) and (53), we further obtain

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) - {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \end{matrix}

\begin{matrix} + \frac{1}{2} \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{n} {(\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})) \\ - {〈 a^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)], a^{T} \nabla f 〉}_{R^{n}} \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla {| a^{T} \nabla f |}^{2})) - {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \\ + \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \frac{\partial}{\partial x_{\hat{k}}} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \dots I \\ - \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \dots II . \end{matrix}

Recall that we denote

a^{T}

to emphasize the transpose of the matrix a and

a_{i \hat{i}}^{T} = a_{\hat{i} i}

:

\begin{matrix} I & = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \frac{\partial}{\partial x_{\hat{k}}} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{\hat{l} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ + \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ = & \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f)) \\ + \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{l^{'}}} f)), \end{matrix}

and

\begin{matrix} II & = & \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ = & \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \frac{\partial}{\partial x_{l^{'}}} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ = & \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{l^{'}}} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} 
\frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) . \end{matrix}

Subtracting the above two terms, we have

\begin{matrix} I - II \\ = & - \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f)) \\ - \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ = & - \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} (\sum_{\hat{l} = 1}^{n + m} a_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) \\ - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) \\ = & - \sum_{l = 1}^{n} ((\sum_{\hat{l} = 1}^{n + m} a_{l 
\hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} a_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) \\ - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) . \end{matrix}

Finally, we obtain the following:

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1} (f, f) - Γ_{1} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla {| a^{T} \nabla f |}^{2})) - {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} \\ - \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} \sum_{k = 1}^{n} \sum_{i = 1}^{n + m} \sum_{j^{'} = 1}^{n + m} a_{l j^{'}}^{T} \frac{\partial^{2}}{\partial x_{i} \partial x_{j^{'}}} a_{i k} {(a^{T} \nabla f)}_{k} \\ + \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla {| a^{T} \nabla f |}^{2})) \\ - {〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉}_{R^{n}} - {〈 B_{n \times n} a^{T} \nabla f, a^{T} \nabla f 〉}_{R^{n}} \\ + \sum_{l = 1}^{n} {(a^{T} \nabla f)}_{l} (\sum_{\hat{i}, \hat{k}, l^{'} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} a_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f) - a_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f))) . \end{matrix}

Thus, the proof is completed. □

Next, we compute explicitly the first two terms on the right-hand side of Lemma 15.

Lemma 16.

\begin{matrix} \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) - 〈 a^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), a^{T} \nabla f 〉_{R^{n}} \\ = & X^{T} Q^{T} Q X + 2 D^{T} Q X + 2 C^{T} X + D^{T} D \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \end{matrix}

\begin{matrix} - \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 (a_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \\ - \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉_{R^{n}} . \end{matrix}

(54)

Recall that matrix Q and vectors X, C, and D are defined in Notation 1.

Proof.

We expand the two terms in Lemma 16. The first term reads as

\begin{matrix} \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | a^{T} {\nabla f |}^{2})) \\ = & \frac{1}{2} \sum_{i = 1}^{n} \sum_{k = 1}^{n} {(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{i} {| {(a^{T} \nabla)}_{k} f |}^{2} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{n} {(a^{T} \nabla)}_{i} {〈 {(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{k} f, {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{n} {〈 {(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{k} f, {(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \dots T_{1} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 (a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) f, {(a^{T} \nabla)}_{k} f 〉}_{R^{n}} \dots R_{1} . \end{matrix}

The second term reads as

\begin{matrix} {〈 a^{T} \nabla ([(a^{T} \nabla) \circ (a^{T} \nabla f)]), a^{T} \nabla f 〉}_{R^{n}} \\ = & \sum_{i, k = 1}^{n} 〈 {(a^{T} \nabla)}_{k} [{(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{i} f], {(a^{T} \nabla)}_{k} f 〉 \\ = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) [(a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) f], {(a^{T} \nabla)}_{k} f 〉 \dots R_{2} . \end{matrix}

Next, we expand

R_{1}

and

R_{2}

completely and obtain the following:

\begin{matrix} R_{1} & = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 (a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) (a_{k \hat{k}}^{T} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \\ = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{1} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{2} \end{matrix}

\begin{matrix} + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{3} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 (a_{i i^{'}}^{T}) ((\frac{\partial}{\partial x_{i^{'}}} a_{i \hat{i}}^{T}) a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{4} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} a_{k \hat{k}}^{T}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{5} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \dots R_{1}^{6} . \end{matrix}

Additionally,

\begin{matrix} R_{2} & = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) [(a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}})], {(a^{T} \nabla)}_{k} f 〉 \\ = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} (\frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{1} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{2} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{3} = R_{1}^{4} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{4} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{5} \\ + \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{k \hat{k}}^{T} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f 〉 \dots R_{2}^{6} = R_{1}^{6} . \end{matrix}

Our next step is to complete the squares for all the above terms. Consider the term

T_{1}

first.

\begin{matrix} T_{1} & = & \sum_{i, k = 1}^{n} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} + \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \\ \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}} + \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \\ = & \sum_{i, k = 1}^{n} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}}〉 \dots T_{1 a} \\ + \sum_{i, k = 1}^{n} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \dots T_{1 b} \\ + \sum_{i, k = 1}^{n} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}}〉 \dots T_{1 c} \\ + \sum_{i, k = 1}^{n} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial a_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \dots T_{1 d} . \end{matrix}

The terms

T_{1 b} = T_{1 c}

,

R_{1}^{3} = R_{1}^{5}

, and

R_{2}^{5} = R_{2}^{4}

play the role of cross terms in the completed squares. For convenience, we relabel the summation indices in

R_{1}^{3}

and

R_{2}^{5}

, switching

i^{'}, \hat{i}

for

R_{1}^{3}

and switching

i^{'}, \hat{k}

for

R_{2}^{5}

. Then, we obtain the following.

\begin{matrix} 2 R_{1}^{3} & = & 2 \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f〉 \\ = & 2 \sum_{i, k = 1}^{n} \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{i^{'}, l = 1}^{n + m} (a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) a_{k l}^{T} \frac{\partial f}{\partial x_{l}}) \\ - 2 R_{2}^{5} & = & - 2 \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈a_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f〉 \\ = & - 2 \sum_{i, k = 1}^{n} \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{i^{'}, l = 1}^{n + m} (a_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}) a_{k l}^{T} \frac{\partial f}{\partial x_{l}}) . \end{matrix}
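
As a check on the relabeling, the identity between the third and fifth terms in the expansion of R_{1} can be verified directly. The sketch below uses only a renaming of the dummy indices

i^{'}, \hat{i}

in the third term of that expansion:

\begin{matrix} R_{1}^{3} & = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} \\ = & \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f 〉_{R^{n}} = R_{1}^{5} . \end{matrix}

The same renaming, together with the symmetry of the second derivatives of f (assuming f is twice continuously differentiable), gives the corresponding identity between the fourth and fifth terms in the expansion of R_{2}.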

We denote

\begin{matrix} \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} a_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} & = & γ_{i k} . \end{matrix}

(55)

Equality (55) can be written in the following matrix form:

\begin{matrix} Q_{n^{2} \times {(n + m)}^{2}} X_{{(n + m)}^{2} \times 1} = {(γ_{11}, \dots, γ_{i k}, \dots, γ_{n n})}_{n^{2} \times 1}^{T}, \end{matrix}

where Q and X are defined in (12) and (19). Now, we can represent term

T_{1 a}

as

\sum_{i, k = 1}^{n} γ_{i k}^{2} = γ^{T} γ = {(Q X)}^{T} Q X = X^{T} Q^{T} Q X

. Next, we want to represent

R_{1}^{3}

and

R_{2}^{5}

in the following form in terms of vector X:

\begin{matrix} 2 R_{1}^{3} - 2 R_{2}^{5} \\ = & 2 \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} ({〈a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(a^{T} \nabla)}_{k} f〉}_{R^{n}} \\ - 2 \sum_{i, k = 1}^{n} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} 〈a_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(a^{T} \nabla)}_{k} f〉) \\ = & 2 \sum_{\hat{i}, \hat{k} = 1}^{n + m} [\sum_{i, k = 1}^{n} \sum_{i^{'} = 1}^{n + m} (〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial a_{k \hat{k}}^{T}}{\partial x_{i^{'}}}), {(a^{T} \nabla)}_{k} f 〉 - 〈 a_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}, {(a^{T} \nabla)}_{k} f 〉)] (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) \\ = & 2 C^{T} X, \end{matrix}

where C is defined in (14). Similarly, we can represent

T_{1 b} = T_{1 c}

by X:

\begin{matrix} T_{1 b} = T_{1 c} & = & \sum_{i, k = 1}^{n} 〈 \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} a_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}} 〉 \\ = & D^{T} Q X, \end{matrix}

where D is defined in (1). Summing over the above terms, we have the following quadratic form:

\begin{matrix} T_{1} + 2 R_{1}^{3} - 2 R_{2}^{5} = X^{T} Q^{T} Q X + 2 D^{T} Q X + 2 C^{T} X + D^{T} D . \end{matrix}

(56)
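
The phrase "complete the squares" can be made explicit. As a sketch, let D_{i k} denote the first-order entries read off from the first slot of the inner product in the display for T_{1 b}; this componentwise reading is an assumption made here, since Notation 1 is not restated in this section. Then T_{1} is a sum of squares, and (56) can be rewritten as follows:

\begin{matrix} T_{1} & = & \sum_{i, k = 1}^{n} {(γ_{i k} + D_{i k})}^{2} = {(Q X + D)}^{T} (Q X + D) = X^{T} Q^{T} Q X + 2 D^{T} Q X + D^{T} D, \\ T_{1} + 2 R_{1}^{3} - 2 R_{2}^{5} & = & {| Q X + D |}^{2} + 2 C^{T} X, \quad D_{i k} = \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial a_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} . \end{matrix}

In this form, the only part of (56) that is not a perfect square in X is the linear term with coefficient vector C.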

Taking into account the fact that

R_{1}^{6} - R_{2}^{6} = 0

and

R_{1}^{4} - R_{2}^{3} = 0

, we have

\begin{matrix} T_{1} + R_{1} - R_{2} = T_{1} + 2 R_{1}^{3} - 2 R_{2}^{5} + R_{1}^{1} + R_{1}^{2} - R_{2}^{1} - R_{2}^{2}, \end{matrix}

which completes the proof. □
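
For a concrete instance of the matrix form of (55), consider the smallest case n = m = 1. The notation below is illustrative only: a is a 2 × 1 matrix with entries a_{1}, a_{2}, and the ordering of the entries of X is assumed to follow the index pairs (1,1), (1,2), (2,1), (2,2); the ordering fixed in Notation 1 may differ.

\begin{matrix} γ_{11} & = & a_{1}^{2} \frac{\partial^{2} f}{\partial x_{1}^{2}} + a_{1} a_{2} \frac{\partial^{2} f}{\partial x_{1} \partial x_{2}} + a_{2} a_{1} \frac{\partial^{2} f}{\partial x_{2} \partial x_{1}} + a_{2}^{2} \frac{\partial^{2} f}{\partial x_{2}^{2}} = Q X, \\ Q & = & (a_{1}^{2}, a_{1} a_{2}, a_{2} a_{1}, a_{2}^{2}), \quad X = {(\frac{\partial^{2} f}{\partial x_{1}^{2}}, \frac{\partial^{2} f}{\partial x_{1} \partial x_{2}}, \frac{\partial^{2} f}{\partial x_{2} \partial x_{1}}, \frac{\partial^{2} f}{\partial x_{2}^{2}})}^{T}, \\ T_{1 a} & = & γ_{11}^{2} = X^{T} Q^{T} Q X . \end{matrix}

Here Q is a 1 × 4 row vector and X a 4 × 1 column vector, in agreement with the sizes n^{2} × (n + m)^{2} and (n + m)^{2} × 1 stated above.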

5.3. Proof of Lemma 12

Lemma 17.

\begin{matrix} Γ_{2, \tilde{L}}^{z} (f, f) = X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E + R_{z} (\nabla f, \nabla f) . \end{matrix}

(57)

where

R_{z}

is defined in Definition 2.

Proof.

The proof follows directly from Lemmas 18 and 19. □
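
Spelled out, the two lemmas chain as follows. This sketch assumes that the left-hand side of (57) denotes the iterated Gamma operator appearing on the left-hand side of Lemma 18; the definition itself is given outside this section.

\begin{matrix} Γ_{2, \tilde{L}}^{z} (f, f) & = & \frac{1}{2} \tilde{L} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (\tilde{L} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \quad \text{(Lemma 18)} \\ = & X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E + R_{z} (\nabla f, \nabla f) \quad \text{(Lemma 19)} . \end{matrix}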

Lemma 18.

\begin{matrix} \frac{1}{2} \tilde{L} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (\tilde{L} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} . \end{matrix}

Proof.

Step 1: We first define

Γ_{1}^{z} = {〈 z^{T} \nabla f, z^{T} \nabla f 〉}_{R^{m}}

, then we have

\begin{matrix} \tilde{L} Γ_{1}^{z} (f, f) = Δ_{p} Γ_{1}^{z} (f, f) - A \nabla Γ_{1}^{z} (f, f), Γ_{1}^{z} (\tilde{L} f, f) = Γ_{1}^{z} (Δ_{p} f, f) - Γ_{1}^{z} (A \nabla f, f) . \end{matrix}

By our definition above, we directly obtain

\begin{matrix} Δ_{a} Γ_{1}^{z} (f, f) & = & \nabla \cdot (a a^{T} \nabla {〈 z^{T} \nabla f, z^{T} \nabla f 〉}_{R^{m}}) = \nabla \cdot (a F^{z}) \\ = & \sum_{\hat{i} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{i}}} (\sum_{k = 1}^{n} a_{\hat{i} k} F_{k}^{z}) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} F_{k}^{z} + a_{\hat{i} k} \frac{\partial}{\partial x_{\hat{i}}} F_{k}^{z}) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} F_{k}^{z}) + a^{T} \nabla \circ (a^{T} \nabla {(z^{T} \nabla f)}^{2}), \end{matrix}

where we denote

\begin{matrix} F^{z} & = & a^{T} \nabla {〈 z^{T} \nabla f, z^{T} \nabla f 〉}_{R^{m}} = a^{T} \nabla \sum_{l = 1}^{m} {(\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2} \\ = & {(\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{m} {(\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})}_{k = 1, \dots, n} = {(F_{1}^{z}, F_{2}^{z}, \dots, F_{n}^{z})}^{T} . \end{matrix}

We have

\begin{matrix} Δ_{a} Γ_{1}^{z} (f, f) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{m} {(\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})) + a^{T} \nabla \circ (a^{T} \nabla {(z^{T} \nabla f)}^{2})) \\ = & \sum_{k = 1}^{n} \sum_{\hat{i} = 1}^{n + m} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} {(z^{T} \nabla f)}^{2})) + (a^{T} \nabla) \circ (a^{T} \nabla {(z^{T} \nabla f)}^{2}) \\ = & \nabla a \circ (a^{T} \nabla {(z^{T} \nabla f)}^{2}) + (a^{T} \nabla) \circ (a^{T} \nabla {(z^{T} \nabla f)}^{2}) . \end{matrix}

(58)

Next, we compute the following quantity.

\begin{matrix} Γ_{1}^{z} (Δ_{a} f, f) & = & {〈 z^{T} \nabla (\nabla \cdot (a a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} . \end{matrix}

From Lemma 15, we have

\begin{matrix} \nabla \cdot (a a^{T} \nabla f) = \nabla a \circ (a^{T} \nabla f) + (a^{T} \nabla) \circ (a^{T} \nabla f) . \end{matrix}

We continue with our computation as below:

\begin{matrix} Γ_{1}^{z} (Δ_{a} f, f) \\ = & {〈 z^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f) + (a^{T} \nabla) \circ (a^{T} \nabla f)], z^{T} \nabla f 〉}_{R^{m}} \\ = & {〈 z^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)], z^{T} \nabla f 〉}_{R^{m}} + {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \end{matrix}

(59)

\begin{matrix} = & {〈 z^{T} \nabla (\nabla a \cdot (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ + {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ = & {〈 (z^{T} \nabla \nabla a \cdot (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} + {〈 (\nabla a \cdot (z^{T} \nabla (a^{T} \nabla f))), z^{T} \nabla f 〉}_{R^{m}} \\ + {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} . \end{matrix}

(60)

From the above, combining (58) and (59), we further obtain

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ + \frac{1}{2} \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \sum_{l = 1}^{m} {(\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)}^{2})) \end{matrix}

\begin{matrix} - {〈 z^{T} \nabla [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)], z^{T} \nabla f 〉}_{R^{m}} \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ + \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \frac{\partial}{\partial x_{\hat{k}}} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \dots I \\ - \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \dots II . \end{matrix}

Recall that

a^{T}

denotes the transpose of the matrix a, so that

a_{i \hat{i}}^{T} = a_{\hat{i} i}

. The terms I and II expand as follows:

\begin{matrix} I & = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \frac{\partial}{\partial x_{\hat{k}}} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ = & \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{\hat{l} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ + \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{\hat{l}}} f)) \\ = & \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f)) \\ + \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{l^{'}}} f)); \\ II & = & \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ = & \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \frac{\partial}{\partial x_{l^{'}}} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ = & \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{l^{'}}} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) . \end{matrix}

Subtracting the above two terms, we obtain the following:

\begin{matrix} I - II & = & - \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ - \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f)) . \end{matrix}

We thus arrive at the following formula:

\begin{matrix} \frac{1}{2} Δ_{a} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (Δ_{a} f, f) \\ = & \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ - \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} (\frac{\partial}{\partial x_{l^{'}}} a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ - \sum_{l = 1}^{m} ((\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} [\sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \frac{\partial^{2}}{\partial x_{\hat{i}} x_{l^{'}}} a_{\hat{i} k} (a_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}} f)])) \\ + \sum_{l = 1}^{m} (\sum_{\hat{l} = 1}^{n + m} z_{l \hat{l}}^{T} \frac{\partial}{\partial x_{\hat{l}}} f) \sum_{\hat{i} = 1}^{n + m} \sum_{k = 1}^{n} (\frac{\partial}{\partial x_{\hat{i}}} a_{\hat{i} k} \sum_{\hat{k} = 1}^{n + m} a_{k \hat{k}}^{T} (\sum_{l^{'} = 1}^{n + m} \frac{\partial}{\partial x_{\hat{k}}} z_{l l^{'}}^{T} \frac{\partial}{\partial x_{l^{'}}} f)) . \end{matrix}

Step 2: Computation of

- \frac{1}{2} A \nabla Γ_{1}^{z} (f, f) + Γ_{1}^{z} (A \nabla f, f)

. We now compute these two terms separately, where

A = a \otimes \nabla a

:

\begin{matrix} - \frac{1}{2} A \nabla Γ_{1}^{z} (f, f) & = & - \frac{1}{2} \sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} {〈 z^{T} \nabla f, z^{T} \nabla f 〉}_{R^{m}} \\ = & - \sum_{\hat{k} = 1}^{n + m} {〈 A_{\hat{k}} (\nabla_{\frac{\partial}{\partial x_{\hat{k}}}} z^{T}) \nabla f, z^{T} \nabla f 〉}_{R^{m}} - \sum_{\hat{k} = 1}^{n + m} {〈 A_{\hat{k}} z^{T} (\nabla_{\frac{\partial}{\partial x_{\hat{k}}}} \nabla f), z^{T} \nabla f 〉}_{R^{m}} \\ = & {\tilde{J}}_{1} + {\tilde{J}}_{2}, \end{matrix}

\begin{matrix} Γ_{1}^{z} (A \nabla f, f) & = & {〈 z^{T} \nabla (\sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), z^{T} \nabla f 〉}_{R^{m}} \\ = & {〈 z^{T} (\sum_{\hat{k} = 1}^{n + m} A_{\hat{k}} \nabla \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), z^{T} \nabla f 〉}_{R^{m}} + {〈 z^{T} (\sum_{\hat{k} = 1}^{n + m} \nabla A_{\hat{k}} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f), z^{T} \nabla f 〉}_{R^{m}} \\ = & {\tilde{J}}_{3} + {\tilde{J}}_{4} . \end{matrix}

It is easy to see that

{\tilde{J}}_{2} + {\tilde{J}}_{3} = 0 .
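
The cancellation can be seen componentwise. Each coefficient A_{\hat{k}} is a scalar function, so it can be moved past the entries of z^{T}, and the second derivatives of f commute (assuming, as a hypothesis made explicit here, that f is twice continuously differentiable):

\begin{matrix} {\tilde{J}}_{3} & = & \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} \sum_{l^{'}, \hat{k} = 1}^{n + m} z_{l l^{'}}^{T} A_{\hat{k}} \frac{\partial^{2} f}{\partial x_{l^{'}} \partial x_{\hat{k}}} \\ = & \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} \sum_{l^{'}, \hat{k} = 1}^{n + m} A_{\hat{k}} z_{l l^{'}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{k}} \partial x_{l^{'}}} = - {\tilde{J}}_{2} . \end{matrix}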

We now expand

{\tilde{J}}_{1}

and

{\tilde{J}}_{4}

into local coordinates:

\begin{matrix} {\tilde{J}}_{1} & = & - \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} (\sum_{l^{'}, \hat{k} = 1}^{n + m} \sum_{k = 1}^{n} \sum_{k^{'} = 1}^{n + m} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} z_{l l^{'}} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} f), \end{matrix}

(61)

\begin{matrix} {\tilde{J}}_{4} & = & \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} (\sum_{l^{'} = 1}^{n + m} z_{l l^{'}}^{T} (\sum_{\hat{k} = 1}^{n + m} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} (\sum_{k = 1}^{n} \sum_{k^{'} = 1}^{n + m} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k}) \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f)) \\ = & \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} (\sum_{k = 1}^{n} \sum_{l^{'} = 1}^{n + m} \sum_{\hat{k}, k^{'} = 1}^{n + m} z_{l l^{'}}^{T} \nabla_{\frac{\partial}{\partial x_{l^{'}}}} a_{\hat{k} k} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k} \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f) \\ + \sum_{l = 1}^{m} {(z^{T} \nabla f)}_{l} (\sum_{k = 1}^{n} \sum_{l^{'} = 1}^{n + m} \sum_{\hat{k}, k^{'} = 1}^{n + m} z_{l l^{'}}^{T} a_{\hat{k} k} (\nabla_{\frac{\partial}{\partial x_{l^{'}}}} \nabla_{\frac{\partial}{\partial x_{k^{'}}}} a_{k^{'} k}) \nabla_{\frac{\partial}{\partial x_{\hat{k}}}} f) . \end{matrix}

(62)

Combining the above two steps, we thus obtain

\begin{matrix} \frac{1}{2} \tilde{L} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (\tilde{L} f, f) = \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} . \end{matrix}

□
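
The way the two steps combine can be organized as a term-by-term cancellation. In the sketch below, E_{1}, E_{2}, E_{3} are hypothetical labels (not used elsewhere in the paper) for the three extra sums at the end of Step 1, in the order in which they appear, and the superscripts (1) and (2) on the last term of Step 2 denote the first and second sums in (62). The sketch reads the operator written with subscript p in the first display of Step 1 as the operator Δ_{a} used afterwards, and the matching of terms uses

a_{k \hat{k}}^{T} = a_{\hat{k} k}

together with a renaming of summation indices:

\begin{matrix} \frac{1}{2} \tilde{L} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (\tilde{L} f, f) & = & [\frac{1}{2} Δ_{a} Γ_{1}^{z} (f, f) - Γ_{1}^{z} (Δ_{a} f, f)] + [- \frac{1}{2} A \nabla Γ_{1}^{z} (f, f) + Γ_{1}^{z} (A \nabla f, f)], \\ - \frac{1}{2} A \nabla Γ_{1}^{z} (f, f) + Γ_{1}^{z} (A \nabla f, f) & = & {\tilde{J}}_{1} + {\tilde{J}}_{2} + {\tilde{J}}_{3} + {\tilde{J}}_{4} = {\tilde{J}}_{1} + {\tilde{J}}_{4}, \\ E_{1} + {\tilde{J}}_{4}^{(1)} & = & 0, \quad E_{2} + {\tilde{J}}_{4}^{(2)} = 0, \quad E_{3} + {\tilde{J}}_{1} = 0 . \end{matrix}

After these cancellations, only the first two terms of the final formula of Step 1 remain, which is the statement of Lemma 18.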

Lemma 19.

\begin{matrix} \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) - {〈 z^{T} \nabla ((a^{T} \nabla) \circ (a^{T} \nabla f)), z^{T} \nabla f 〉}_{R^{m}} \\ = & X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E + R_{z} (\nabla f, \nabla f) . \end{matrix}

Proof.

We expand the two terms in Lemma 19.

\begin{matrix} \frac{1}{2} (a^{T} \nabla \circ (a^{T} \nabla | z^{T} {\nabla f |}^{2})) \\ = & \frac{1}{2} \sum_{i = 1}^{n} \sum_{k = 1}^{m} {(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{i} {| {(z^{T} \nabla)}_{k} f |}^{2} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} {(a^{T} \nabla)}_{i} {〈 {(a^{T} \nabla)}_{i} {(z^{T} \nabla)}_{k} f, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} {〈 {(a^{T} \nabla)}_{i} {(z^{T} \nabla)}_{k} f, {(a^{T} \nabla)}_{i} {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{T}}_{1} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} {〈 (a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) (z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) f, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1} . \end{matrix}

\begin{matrix} {〈 z^{T} \nabla ([(a^{T} \nabla) \circ (a^{T} \nabla f)]), z^{T} \nabla f 〉}_{R^{m}} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} {〈 {(z^{T} \nabla)}_{k} [{(a^{T} \nabla)}_{i} {(a^{T} \nabla)}_{i} f], {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} {〈 (z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) [(a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) f], {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2} . \end{matrix}

Next, we expand

{\tilde{R}}_{1}

and

{\tilde{R}}_{2}

completely and obtain the following:

\begin{matrix} {\tilde{R}}_{1} & = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 (a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial}{\partial x_{\hat{i}}}) (z_{k \hat{k}}^{T} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} (\frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{1} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{2} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}}) (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{3} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 (a_{i i^{'}}^{T}) ((\frac{\partial}{\partial x_{i^{'}}} a_{i \hat{i}}^{T}) z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{4} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} z_{k \hat{k}}^{T}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{5} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i i^{'}}^{T} a_{i \hat{i}}^{T} z_{k \hat{k}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{1}^{6} \end{matrix}

\begin{matrix} {\tilde{R}}_{2} & = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 (z_{k \hat{k}}^{T} \frac{\partial}{\partial x_{\hat{k}}}) [(a_{i i^{'}}^{T} \frac{\partial}{\partial x_{i^{'}}}) (a_{i \hat{i}}^{T} \frac{\partial f}{\partial x_{\hat{i}}})], {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\
= & \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{1} \\
+ \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} a_{i i^{'}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}) \frac{\partial f}{\partial x_{\hat{i}}}, {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{2} \\
+ \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} a_{i i^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{3} = {\tilde{R}}_{1}^{4} \\
+ \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} \frac{\partial a_{i i^{'}}^{T}}{\partial x_{\hat{k}}} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{4} \\
+ \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} a_{i i^{'}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{\hat{k}}} (\frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{5} \\
+ \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k \hat{k}}^{T} a_{i i^{'}}^{T} a_{i \hat{i}}^{T} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \dots {\tilde{R}}_{2}^{6} = {\tilde{R}}_{1}^{6} \end{matrix}

Our next step is to complete the squares for all the above terms. We look at term

{\tilde{T}}_{1}

first.

\begin{matrix} {\tilde{T}}_{1} & = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} + \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \\ \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} z_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}} + \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial z_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \\ = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} z_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}}〉 \dots {\tilde{T}}_{1 a} \end{matrix}

\begin{matrix} + \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial z_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \dots {\tilde{T}}_{1 b} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} z_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}}〉 \dots {\tilde{T}}_{1 c} \\ + \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈\sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} \frac{\partial z_{k k^{'}}^{T}}{\partial x_{i^{'}}} \frac{\partial f}{\partial x_{k^{'}}}〉 \dots {\tilde{T}}_{1 d} . \end{matrix}

The terms

{\tilde{T}}_{1 b} = {\tilde{T}}_{1 c}

,

{\tilde{R}}_{1}^{3} = {\tilde{R}}_{1}^{5}

, and

{\tilde{R}}_{2}^{5} = {\tilde{R}}_{2}^{4}

act as the cross terms when completing the squares. For convenience, we relabel the summation indices in

{\tilde{R}}_{1}^{3}

and

{\tilde{R}}_{2}^{5}

, swapping

i^{'}, \hat{i}

in

{\tilde{R}}_{1}^{3}

and

i^{'}, \hat{k}

in

{\tilde{R}}_{2}^{5}

, and obtain the following.

\begin{matrix} 2 {\tilde{R}}_{1}^{3} & = & 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{i^{'}, l = 1}^{n + m} (a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) z_{k l}^{T} \frac{\partial f}{\partial x_{l}}) \\ - 2 {\tilde{R}}_{2}^{5} & = & - 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & - 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{\hat{i}, \hat{k} = 1}^{n + m} \sum_{i^{'}, l = 1}^{n + m} (z_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}) z_{k l}^{T} \frac{\partial f}{\partial x_{l}}) \end{matrix}

We denote

\begin{matrix} \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} z_{k \hat{k}}^{T} \frac{\partial^{2} f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} & = & ω_{i k} . \end{matrix}

(63)

The above Equality (63) can be represented in the following matrix form:

\begin{matrix} P_{(n * m) \times {(n + m)}^{2}} X_{{(n + m)}^{2} \times 1} = {(ω_{11}, \dots, ω_{i k}, \dots, ω_{n m})}_{(n * m) \times 1}^{T} \end{matrix}

where P and X are defined in (13) and (19). Now, we can represent term

{\tilde{T}}_{1 a}

as

\sum_{i = 1}^{n} \sum_{k = 1}^{m} ω_{i k}^{2} = ω^{T} ω = {(P X)}^{T} P X = X^{T} P^{T} P X

. Next, we want to represent

{\tilde{R}}_{1}^{3}

and

{\tilde{R}}_{2}^{5}

in the following form in terms of vector X:

\begin{matrix} 2 {\tilde{R}}_{1}^{3} - 2 {\tilde{R}}_{2}^{5} \\ = & 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{i^{'}}}) (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ - 2 \sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'}, \hat{i}, \hat{k} = 1}^{n + m} {〈 z_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}} (\frac{\partial}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}}), {(z^{T} \nabla)}_{k} f 〉}_{R^{m}} \\ = & 2 \sum_{\hat{i}, \hat{k} = 1}^{n + m} [\sum_{i = 1}^{n} \sum_{k = 1}^{m} \sum_{i^{'} = 1}^{n + m} (〈 a_{i \hat{i}}^{T} a_{i i^{'}}^{T} (\frac{\partial z_{k \hat{k}}^{T}}{\partial x_{i^{'}}}), {(z^{T} \nabla)}_{k} f 〉 - 〈 z_{k i^{'}}^{T} a_{i \hat{k}}^{T} \frac{\partial a_{i \hat{i}}^{T}}{\partial x_{i^{'}}}, {(z^{T} \nabla)}_{k} f 〉)] (\frac{\partial}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}) \\ = & 2 F^{T} X, \end{matrix}

where F is defined in (16). Similarly, we can represent

{\tilde{T}}_{1 b} = {\tilde{T}}_{1 c}

by X:

\begin{matrix} {\tilde{T}}_{1 b} = {\tilde{T}}_{1 c} & = & \sum_{i = 1}^{n} \sum_{k = 1}^{m} 〈 \sum_{\hat{i}, \hat{k} = 1}^{n + m} a_{i \hat{i}}^{T} \frac{\partial z_{k \hat{k}}^{T}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}}, \sum_{i^{'}, k^{'} = 1}^{n + m} a_{i i^{'}}^{T} z_{k k^{'}}^{T} \frac{\partial^{2} f}{\partial x_{i^{'}} \partial x_{k^{'}}} 〉 = E^{T} P X \end{matrix}

where E is defined in (17). We thus have the following form:

\begin{matrix} {\tilde{T}}_{1} + 2 {\tilde{R}}_{1}^{3} - 2 {\tilde{R}}_{2}^{5} = X^{T} P^{T} P X + 2 E^{T} P X + 2 F^{T} X + E^{T} E \end{matrix}

Taking into account the fact that

{\tilde{R}}_{1}^{6} - {\tilde{R}}_{2}^{6} = 0

and

{\tilde{R}}_{1}^{4} - {\tilde{R}}_{2}^{3} = 0

, we have

\begin{matrix} {\tilde{T}}_{1} + {\tilde{R}}_{1} - {\tilde{R}}_{2} = {\tilde{T}}_{1} + 2 {\tilde{R}}_{1}^{3} - 2 {\tilde{R}}_{2}^{5} + {\tilde{R}}_{1}^{1} + {\tilde{R}}_{1}^{2} - {\tilde{R}}_{2}^{1} - {\tilde{R}}_{2}^{2}, \end{matrix}

which completes the proof. □
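The representation of the tilde-T_1 terms used in the proof above is an instance of completing the square: the part X^T P^T P X + 2 E^T P X + E^T E equals |P X + E|^2, while 2 F^T X stays linear in X. A minimal numerical sketch follows; the random arrays are hypothetical stand-ins for the paper's P, E, F, and X (whose actual definitions in (13), (16), (17), and (19) depend on a and z), and the dimensions are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2, 1                      # illustrative horizontal / vertical dimensions
d = (n + m) ** 2                 # length of the vectorised Hessian X

# Random placeholders (assumption): the paper's P, E, F are built from a and z.
P = rng.standard_normal((n * m, d))
E = rng.standard_normal(n * m)
F = rng.standard_normal(d)
X = rng.standard_normal(d)

# Right-hand side of the displayed identity, term by term.
quadratic = X @ P.T @ P @ X + 2 * E @ P @ X + 2 * F @ X + E @ E

# Same quantity with the P-dependent part written as a complete square.
completed = np.sum((P @ X + E) ** 2) + 2 * F @ X

print(abs(quadratic - completed))
```

The printed difference should be at the level of floating-point round-off, whatever placeholder values are drawn.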

6. Further Discussions on Other Inequalities

In this section, we apply the generalized Gamma calculus to study the entropic inequality for the semi-group

P_{t}

associated with the drift–diffusion process. With a slight abuse of notation, we denote the generator of the semi-group

P_{t}

as

\frac{1}{2} L

instead of L, and we denote

X_{t}

as the corresponding diffusion process.

Definition 5.

We define the semigroup

P_{t} = e^{\frac{1}{2} t L}

, where L is invariant with respect to the invariant measure

d μ = π (x) d x

. We denote

P_{t} f (x) = E (f (X_{t}))

, and

\begin{matrix} E (f (X_{t})) & = & \int_{R^{n + m}} f (y) p (t, x, y) d μ (y) = \int_{R^{n + m}} f (y) ρ (t, x, y) d y, \end{matrix}

where the infinitesimal generator of this process

X_{t}

is

\frac{1}{2} L

, and we denote

ρ (t, \cdot, \cdot)

as the product of the transition kernel

p (t, \cdot, \cdot)

and the volume measure π.

Remark 14.

Following the standard treatment as in [2] (Section 5), whenever we consider the differentiating operation on

P_{t} f

, we shall always consider

P_{t} f_{ε}

first with

f_{ε} = f + ε

for every

ε > 0

. Then, we take the limit as

ε \to 0

. Throughout this section, we directly use

P_{t} f

instead of

P_{t} f_{ε}

for convenience.

Remark 15.

In the standard sub-Riemannian setting, the semi-groups are in general defined with respect to the invariant measure

d μ (y)

. In this paper, we formulate the semi-group and the transition kernel with respect to the Lebesgue measure

d y

.

Following the framework in [2], we also need the following assumption, which is necessary to rigorously justify the computations on functionals of the heat semigroup.

Assumption 2.

The semigroup

P_{t}

is stochastically complete, that is, for

t \geq 0

,

P_{t} 1 = 1

and for any

T > 0

and

f \in C^{\infty} (R^{n + m})

with compact support, we assume that

\begin{matrix} sup_{t \in [0, T]} ∥ Γ (P_{t} f) ∥_{\infty} + {∥ Γ_{1}^{z} (P_{t} f) ∥}_{\infty} < + \infty . \end{matrix}

(64)

We believe that the above Assumption 2 should follow from the assumption

R \geq κ (Γ_{1} + Γ_{1}^{z})

if we assume the appropriate lower bound

κ

. We leave this for further studies. Related gradient estimates are presented in order below. For the infinitesimal generator

\frac{1}{2} L

associated with linear semi-group

P_{t}

, we have the following property.

Proposition 15.

For every smooth function f, we have:

$P_{0} = I d$ ;
For all functions $f \in C_{b} (R^{n + m})$ , the map $t \mapsto P_{t} f$ is continuous from $R^{+}$ to $L^{2} (d μ)$ ;
For all $s, t \geq 0,$ one has $P_{t} \circ P_{s} = P_{t + s}$ ;
$\forall x \in R^{n + m}$ , $\forall t \geq 0,$ $\frac{\partial}{\partial t} P_{t} f (x) = \frac{1}{2} L (P_{t} f) (x) = \frac{1}{2} P_{t} (L f) (x)$ .
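The four properties in Proposition 15 can be illustrated on a finite state space, where a semigroup is just a matrix exponential. The sketch below is only an analogy under stated assumptions: the matrix `L` is a hypothetical stand-in (minus the graph Laplacian of a 4-cycle), not the degenerate operator of the paper, and `P(t)` is built from an eigendecomposition so that only NumPy is needed.

```python
import numpy as np

# Hypothetical stand-in for L: minus the graph Laplacian of a 4-cycle.
# It is symmetric with zero row sums, so L @ ones = 0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = A - np.diag(A.sum(axis=1))

eigvals, V = np.linalg.eigh(L)   # L = V diag(eigvals) V^T

def P(t):
    """Matrix of P_t = exp(t L / 2), built from the eigendecomposition of L."""
    return (V * np.exp(0.5 * t * eigvals)) @ V.T

f = np.array([1.0, -2.0, 0.5, 3.0])
s, t, h = 0.3, 0.7, 1e-5

identity_at_zero = np.allclose(P(0.0), np.eye(4))               # P_0 = Id
semigroup_law = np.allclose(P(t) @ P(s), P(t + s))              # P_t P_s = P_{t+s}
conserves_constants = np.allclose(P(t) @ np.ones(4), np.ones(4))  # P_t 1 = 1

# Central difference in t, compared with (1/2) L P_t f and (1/2) P_t L f.
dPdt = (P(t + h) @ f - P(t - h) @ f) / (2 * h)
generator_left = np.allclose(dPdt, 0.5 * L @ (P(t) @ f), atol=1e-8)
generator_right = np.allclose(dPdt, 0.5 * P(t) @ (L @ f), atol=1e-8)

print(identity_at_zero, semigroup_law, conserves_constants,
      generator_left, generator_right)
```

The check `conserves_constants` is the finite-state counterpart of the stochastic completeness condition in Assumption 2.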

Next, we present the entropic inequality under Assumption 1. We follow closely the framework introduced in [2] and define the following two functionals:

\begin{matrix} ϕ_{a} (x, t) & = P_{T - t} f Γ_{1} (log P_{T - t} f) (x), and ϕ_{z} (x, t) & = P_{T - t} f Γ_{1}^{z} (log P_{T - t} f) (x) . \end{matrix}

Lemma 20.

We have the following relation:

\begin{matrix} \frac{1}{2} L ϕ_{a} + \frac{\partial}{\partial t} ϕ_{a} & = & (P_{T - t} f) (x) Γ_{2} (log P_{T - t} f, log P_{T - t} f) (x), \\ \frac{1}{2} L ϕ_{z} + \frac{\partial}{\partial t} ϕ_{z} & = & (P_{T - t} f) (x) Γ_{2}^{z} (log P_{T - t} f, log P_{T - t} f) (x) \\ + (P_{T - t} f) (x) Γ_{1} (log P_{T - t} f, Γ_{1}^{z} (log P_{T - t} f, log P_{T - t} f)) (x) \end{matrix}

(65)

\begin{matrix} - (P_{T - t} f) (x) Γ_{1}^{z} (log P_{T - t} f, Γ_{1} (log P_{T - t} f, log P_{T - t} f)) (x) . \end{matrix}

(66)

Proof.

Denote

g (t, x) = P_{T - t} f (x) = \int ρ (T - t, x, \tilde{x}) f (\tilde{x}) d \tilde{x}

, and we have the following relation:

\begin{matrix} L (log g) = - \frac{Γ_{1} (g, g)}{{(g)}^{2}} - 2 \frac{\partial_{t} g}{g} . \end{matrix}

By direct computation, one obtains

\begin{matrix} \partial_{t} ϕ_{a} & = & \partial_{t} g Γ_{1} (log g, log g) + 2 g {〈 a^{T} \nabla log g, a^{T} \nabla (\frac{\partial_{t} g}{g}) 〉}_{R^{n}} \\ = & - \frac{1}{2} L g Γ_{1} (log g, log g) - g Γ_{1} (log g, L log g) - g Γ_{1} (log g, Γ_{1} (log g, log g)), \\ \frac{1}{2} L ϕ_{a} & = & \frac{1}{2} L g Γ_{1} (log g, log g) + \frac{1}{2} g L Γ_{1} (log g, log g) + Γ_{1} (g, Γ_{1} (log g, log g)), \end{matrix}

where we have

Γ_{1} (g, Γ_{1} (log g, log g)) = g Γ_{1} (log g, Γ_{1} (log g, log g))

; thus, (66) is proven. Similarly, we obtain the following for

ϕ_{z}

:

\begin{matrix} \partial_{t} ϕ_{z} & = & \partial_{t} g Γ_{1}^{z} (log g, log g) + 2 g 〈 z^{T} \nabla log g, z^{T} \nabla (\frac{\partial_{t} g}{g}) 〉_{R^{m}} \\ = & - \frac{1}{2} L g Γ_{1}^{z} (log g, log g) - g Γ_{1}^{z} (log g, L log g) - g Γ_{1}^{z} (log g, Γ_{1} (log g, log g)), \\ \frac{1}{2} L ϕ_{z} & = & \frac{1}{2} L g Γ_{1}^{z} (log g, log g) + \frac{1}{2} g L Γ_{1}^{z} (log g, log g) + Γ_{1} (g, Γ_{1}^{z} (log g, log g)) . \end{matrix}

The proof then follows. □
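The first relation of Lemma 20 can be checked by hand in the simplest non-degenerate case. The sketch below assumes a one-dimensional toy setting that is not from the paper: L is the second derivative in x, so that Γ_1(h, h) = (h')^2 and Γ_2(h, h) = (h'')^2, and g(t, x) = exp((T - t)/2) cosh(x) is a hypothetical positive solution of ∂_t g + (1/2) L g = 0. Finite differences then compare (1/2) L ϕ_a + ∂_t ϕ_a with g Γ_2(log g, log g).

```python
import numpy as np

T = 1.0

def g(t, x):
    """g(t, x) = exp((T - t)/2) cosh(x) solves d_t g + (1/2) g_xx = 0."""
    return np.exp(0.5 * (T - t)) * np.cosh(x)

def phi_a(t, x):
    """phi_a = g * Gamma_1(log g), using (log g)_x = tanh(x)."""
    return g(t, x) * np.tanh(x) ** 2

t0, dt, dx = 0.4, 1e-4, 1e-3
x = np.linspace(-1.5, 1.5, 7)

# Central finite differences for (1/2) L phi_a and d_t phi_a.
half_L_phi = 0.5 * (phi_a(t0, x + dx) - 2 * phi_a(t0, x) + phi_a(t0, x - dx)) / dx**2
dt_phi = (phi_a(t0 + dt, x) - phi_a(t0 - dt, x)) / (2 * dt)
lhs = half_L_phi + dt_phi

# In one dimension Gamma_2(h, h) = (h_xx)^2, and (log g)_xx = 1 / cosh(x)^2.
rhs = g(t0, x) / np.cosh(x) ** 4

max_error = np.max(np.abs(lhs - rhs))
print(max_error)
```

In this toy case there is no z direction, so the extra commutator terms in the second relation do not appear; the sketch only illustrates the structure of the ϕ_a identity.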

Now, we are ready to present the following important lemma, which prepares us to prove the new entropy inequality without the assumption:

Γ_{1} (log P_{T - t} f, Γ_{1}^{z} (log P_{T - t} f, log P_{T - t} f)) (x) = Γ_{1}^{z} (log P_{T - t} f, Γ_{1} (log P_{T - t} f, log P_{T - t} f)) (x) .

Lemma 21.

For any

0 < s < T

, we denote

ρ (s, x, y) = p (s, x, y) π (y)

as the transition kernel of diffusion process

X_{s}^{x}

starting at x, as in Definition 5. Then, the following equality holds:

\begin{matrix} E [g Γ_{1} (log g, Γ_{1}^{z} (log g, log g)) - g Γ_{1}^{z} (log g, Γ_{1} (log g, log g))] \\ = & \int \frac{\nabla \cdot (ρ (s, x, y) z z^{T} Γ_{\nabla (a a^{T})} (log g (s, y), log g (s, y)))}{ρ (s, x, y)} g (s, y) ρ (s, x, y) d y \\ - \int \frac{\nabla \cdot (ρ (s, x, y) a a^{T} Γ_{\nabla (z z^{T})} (log g (s, y), log g (s, y)))}{ρ (s, x, y)} g (s, y) ρ (s, x, y) d y . \end{matrix}

Here, we denote

g (s, y) = P_{T - s} f (y) = \int ρ (T - s, y, \tilde{y}) f (\tilde{y}) d \tilde{y}

and

\begin{matrix} E [g Γ_{1} (log g, Γ_{1}^{z} (log g, log g))] \\ = & E [g (s, X_{s}^{x}) Γ_{1} (log g (s, X_{s}^{x}), Γ_{1}^{z} (log g (s, X_{s}^{x}), log g (s, X_{s}^{x})))] \\ = & \int g (s, y) Γ_{1} (log g (s, y), Γ_{1}^{z} (log g (s, y), log g (s, y))) ρ (s, x, y) d y . \end{matrix}

Proof.

We first expand in the following integral form.

\begin{matrix} E [g Γ_{1} (log g, Γ_{1}^{z} (log g, log g)) - g Γ_{1}^{z} (log g, Γ_{1} (log g, log g))] \\ = & \int g (s, y) Γ_{1} (log g (s, y), Γ_{1}^{z} (log g (s, y), log g (s, y))) ρ (s, x, y) d y \\ - \int g (s, y) Γ_{1}^{z} (log g (s, y), Γ_{1} (log g (s, y), log g (s, y))) ρ (s, x, y) d y . \end{matrix}

We suppress the arguments

x, y, s

for simplicity and write

h = log g

.

Claim 1:

\begin{matrix} \int Γ_{1} (h, Γ_{1}^{z} (h, h)) ρ g d y - \int Γ_{1}^{z} (h, Γ_{1} (h, h)) ρ g d y \\ = & \int Γ_{1}^{z} (h, Δ_{a} h) ρ g d y - \int Γ_{1}^{z} (h, \frac{Δ_{a} g}{g}) ρ g d y - \int Γ_{1} (h, Δ_{z} h) ρ g d y + \int Γ_{1} (h, \frac{Δ_{z} g}{g}) ρ g d y . \end{matrix}

Recall that we denote

Δ_{a} = \nabla \cdot (a a^{T} \nabla)

and

Δ_{z} = \nabla \cdot (z z^{T} \nabla)

. Use the following identity:

\begin{matrix} Δ_{a} h = \frac{Δ_{a} g}{g} - \frac{Γ_{1} (g, g)}{g^{2}}, and Δ_{z} h = \frac{Δ_{z} g}{g} - \frac{Γ_{1}^{z} (g, g)}{g^{2}} . \end{matrix}

We then obtain

\begin{matrix} \int Γ_{1}^{z} (h, Δ_{a} h) ρ g d y & = & \int Γ_{1}^{z} (h, \frac{Δ_{a} g}{g} - \frac{Γ_{1} (g, g)}{g^{2}}) ρ g d y \\ = & - \int Γ_{1}^{z} (h, Γ_{1} (h, h)) ρ g d y + \int Γ_{1}^{z} (h, \frac{Δ_{a} g}{g}) ρ g d y . \end{matrix}

Similarly, the other equality is satisfied.

Claim 2:

\begin{matrix} \int Γ_{1}^{z} (h, Δ_{a} h) ρ g d y - \int Γ_{1}^{z} (h, \frac{Δ_{a} g}{g}) ρ g d y - \int Γ_{1} (h, Δ_{z} h) ρ g d y + \int Γ_{1} (h, \frac{Δ_{z} g}{g}) ρ g d y \\ = & \int \frac{\nabla \cdot (ρ z z^{T} Γ_{\nabla (a a^{T})} (h, h))}{ρ} g ρ d y - \int \frac{\nabla \cdot (ρ a a^{T} Γ_{\nabla (z z^{T})} (h, h))}{ρ} g ρ d y . \end{matrix}

First, observe that

\begin{matrix} \int Γ_{1}^{z} (h, \frac{Δ_{a} g}{g}) ρ g d y & = & \int 〈 z z^{T} \nabla h, \nabla (\frac{Δ_{a} g}{g}) 〉 ρ g d y = - \int \nabla \cdot (ρ z z^{T} \nabla g) \frac{Δ_{a} g}{g} d y \\ = & - \int \frac{ρ}{g} Δ_{a} g Δ_{z} g d y - \int 〈 \nabla ρ, z z^{T} \nabla g 〉 \frac{Δ_{a} g}{g} d y . \end{matrix}

Similarly, one obtains

\begin{matrix} \int Γ_{1} (h, \frac{Δ_{z} g}{g}) ρ g d y & = & - \int \frac{ρ}{g} Δ_{a} g Δ_{z} g d y - \int 〈 \nabla ρ, a a^{T} \nabla g 〉 \frac{Δ_{z} g}{g} d y . \end{matrix}

For the next term, one obtains

\begin{matrix} \int Γ_{1}^{z} (h, Δ_{a} h) ρ g d y \\ = & \int 〈 \nabla (\nabla \cdot (a a^{T} \nabla h)), z z^{T} \nabla h 〉 ρ g d y \\ = & - \int [\nabla \cdot (a a^{T} \nabla h)] [\nabla \cdot (ρ g z z^{T} \nabla h)] d y \\ = & - \int [\nabla \cdot (a a^{T} \frac{1}{g} \nabla g)] \nabla \cdot (ρ z z^{T} \nabla g) d y \\ = & - \int (〈 \nabla \frac{1}{g}, a a^{T} \nabla g 〉 + \frac{1}{g} Δ_{a} g) \nabla \cdot (ρ z z^{T} \nabla g) d y \\ = & \int 〈 \frac{1}{g^{2}} \nabla g, a a^{T} \nabla g 〉 (\nabla \cdot (ρ z z^{T} \nabla g)) d y - \int \frac{1}{g} Δ_{a} g (\nabla \cdot (ρ z z^{T} \nabla g)) d y \\ = & - 2 \int \nabla^{2} h (a a^{T} \nabla h, z z^{T} \nabla h) ρ g d y - \int 〈 〈 \nabla h, \nabla (a a^{T}) \nabla h 〉, z z^{T} \nabla h 〉 ρ g d y \\ - \int \frac{1}{g} Δ_{a} g 〈 \nabla ρ, z z^{T} \nabla g 〉 d y - \int \frac{ρ}{g} Δ_{a} g Δ_{z} g d y, \end{matrix}

where the last equality follows from the integration by parts for the first term and the direct expansion of the divergence for the second term. Similarly, we obtain

\begin{matrix} \int Γ_{1} (h, Δ_{z} h) ρ g d y \\ = & - 2 \int \nabla^{2} h (z z^{T} \nabla h, a a^{T} \nabla h) ρ g d y - \int 〈 〈 \nabla h, \nabla (z z^{T}) \nabla h 〉, a a^{T} \nabla h 〉 ρ g d y \\ - \int \frac{1}{g} Δ_{z} g 〈 \nabla ρ, a a^{T} \nabla g 〉 d y - \int \frac{ρ}{g} Δ_{a} g Δ_{z} g d y . \end{matrix}

By integration by parts, we obtain

\begin{matrix} - \int 〈 〈 \nabla h, \nabla (a a^{T}) \nabla h 〉, z z^{T} \nabla h 〉 ρ g d y + \int 〈 〈 \nabla h, \nabla (z z^{T}) \nabla h 〉, a a^{T} \nabla h 〉 ρ g d y \\ = & \int \frac{\nabla \cdot (ρ z z^{T} Γ_{\nabla (a a^{T})} (h, h))}{ρ} g ρ d y - \int \frac{\nabla \cdot (ρ a a^{T} Γ_{\nabla (z z^{T})} (h, h))}{ρ} g ρ d y . \end{matrix}

Combining the above formulas, the proof is completed. □
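The identity Δ_a h = Δ_a g / g - Γ_1(g, g) / g^2 with h = log g, used in Claim 1 of the proof above, can be checked numerically in one dimension, where Δ_a u = (a^2 u')' and Γ_1(g, g) = (a g')^2. The coefficient `a` and the function `g` below are hypothetical test choices, and the derivatives are approximated by central differences.

```python
import numpy as np

def a(x):
    """Hypothetical variable diffusion coefficient."""
    return 1.0 + 0.5 * np.sin(x)

def g(x):
    """Hypothetical positive test function."""
    return 2.0 + np.cos(x)

def ddx(func, x, eps=1e-4):
    """Central finite-difference derivative of func at x."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

def delta_a(func, x):
    """One-dimensional Delta_a u = (a^2 u')'."""
    flux = lambda y: a(y) ** 2 * ddx(func, y)
    return ddx(flux, x)

x = np.linspace(-2.0, 2.0, 9)
h = lambda y: np.log(g(y))

lhs = delta_a(h, x)
gamma1 = (a(x) * ddx(g, x)) ** 2          # Gamma_1(g, g) = (a g')^2
rhs = delta_a(g, x) / g(x) - gamma1 / g(x) ** 2

max_error = np.max(np.abs(lhs - rhs))
print(max_error)
```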

With the above lemma in hand, we are ready to prove the following entropic inequality. We first define the following energy form:

\begin{matrix} Φ_{a} (x, t) & = P_{t} (P_{T - t} f Γ_{1} (log P_{T - t} f)) (x), Φ_{z} (x, t) & = P_{t} (P_{T - t} f Γ_{1}^{z} (log P_{T - t} f)) (x) . \end{matrix}

Recall that we define

\begin{matrix} ϕ_{a} (x, t) & = P_{T - t} f Γ_{1} (log P_{T - t} f) (x), and ϕ_{z} (x, t) & = P_{T - t} f Γ_{1}^{z} (log P_{T - t} f) (x) . \end{matrix}

Theorem 4.

Denote

ϕ = ϕ_{a} + ϕ_{z}

; if the following condition is satisfied:

R ⪰ κ (Γ_{1} + Γ_{1}^{z}),

we then conclude

\begin{matrix} P_{T} (ϕ (\cdot, T)) (x) & \geq & ϕ (x, 0) + \int_{0}^{T} κ_{s} (Φ_{a} (x, s) + Φ_{z} (x, s)) d s, \end{matrix}

(67)

where

κ_{s}

depends on the estimate of the transition kernel

\nabla log ρ (s, \cdot, \cdot)

associated with semi-group

P_{s}

(see Definition 5).

Remark 16.

Based on Theorem 3, we can also prove the above theorem for operator

\tilde{L}

with the drift term involved. Since the proof is similar, we omit it here.

Proof.

Take

ϕ = ϕ_{a} + ϕ_{z}

. Let

{(X_{t}^{x})}_{t \geq 0}

be the diffusion Markov process with semigroup

P_{t}

. (A similar proof can be found in [2] (Proposition 4.5).) Let the smooth function

u : [0, \infty) \times R^{n + m} \to R

be such that, for every

T > 0

,

{sup}_{t \in [0, T]} {∥ u (t, \cdot) ∥}_{\infty} < \infty

and

{sup}_{t \in [0, T]} {∥ \frac{1}{2} L u (t, \cdot) + \partial_{t} u (t, \cdot) ∥}_{\infty} < \infty

. We have for every

t > 0

\begin{matrix} u (t, X_{t}^{x}) & = & u (0, x) + \int_{0}^{t} (\frac{1}{2} L u + \partial_{s} u) (s, X_{s}^{x}) d s + M_{t}, \end{matrix}

where

{(M_{t})}_{t \geq 0}

is a local martingale. Let

T_{n}, n \in N

be an increasing sequence of stopping times such that, almost surely,

T_{n} \to \infty

and

{(M_{t \land T_{n}})}_{t \geq 0}

is a martingale. We obtain

\begin{matrix} E [u (t \land T_{n}, X_{t \land T_{n}}^{x})] & = & u (0, x) + E [\int_{0}^{t \land T_{n}} (\frac{1}{2} L u + \partial_{s} u) (s, X_{s}^{x}) d s] . \end{matrix}

By using the dominated convergence theorem, we obtain

\begin{matrix} E [u (t, X_{t}^{x})] & = & u (0, x) + E [\int_{0}^{t} (\frac{1}{2} L u + \partial_{s} u) (s, X_{s}^{x}) d s] . \end{matrix}

Applying the above equality at time t = T to

ϕ (T, X_{T}^{x})

, we obtain

\begin{matrix} E [ϕ (T, X_{T}^{x})] & = & ϕ (0, x) + E [\int_{0}^{T} (\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x}) d s] \\ = & ϕ (0, x) + \int_{0}^{T} E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})] d s . \end{matrix}

We now look at the term

E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})]

with

g (s, x) = (P_{T - s} f) (x) = E [f (X_{T - s}^{x})]

= \int ρ (T - s, x, y) f (y) d y

:

\begin{matrix} E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})] & = & E [g Γ_{2} (log g, log g) + g Γ_{2}^{z} (log g, log g)] \\ + E [g Γ_{1} (log g, Γ_{1}^{z} (log g, log g)) - g Γ_{1}^{z} (log g, Γ_{1} (log g, log g))] . \end{matrix}

Using Lemma 21 with

h = log g

, we obtain

\begin{matrix} E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})] \\ = & \int g ρ (Γ_{2} (h, h) + Γ_{2}^{z} (h, h) + \frac{\nabla \cdot (ρ z z^{T} Γ_{\nabla (a a^{T})} (h, h))}{ρ} - \frac{\nabla \cdot (ρ a a^{T} Γ_{\nabla (z z^{T})} (h, h))}{ρ}) d y \\ = & \int g ρ (Γ_{2} (h, h) + {\tilde{Γ}}_{2}^{z, ρ} (h, h)) d y . \end{matrix}

Applying Theorem 3 here with

π = ρ (s, \cdot, \cdot)

as the transition kernel function, we obtain a time-dependent version of Theorem 3. Assume that the following bound is satisfied where the bound

κ_{s}

depends on kernel

ρ (s, \cdot, \cdot)

:

\begin{matrix} R (\nabla f, \nabla f) \geq κ_{s} (Γ_{1} (f, f) + Γ_{1}^{z} (f, f)) . \end{matrix}

We then conclude with the following bound:

\begin{matrix} E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})] & \geq & \int ρ (s, x, y) g κ_{s} (Γ_{1} (h, h) (y) + Γ_{1}^{z} (h, h) (y)) d y \\ = & \int p (s, x, y) g κ_{s} (Γ_{1} (h, h) (y) + Γ_{1}^{z} (h, h) (y)) π (y) d y \\ \geq & P_{s} (κ_{s} g (Γ_{1} (log g, log g) + Γ_{1}^{z} (log g, log g))) . \end{matrix}

Plugging into the time integral

\int_{0}^{T} E [(\frac{1}{2} L ϕ + \partial_{s} ϕ) (s, X_{s}^{x})] d s

, the proof follows. □
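The proof above rests on the identity E[u(t, X_t^x)] = u(0, x) + E[∫_0^t ((1/2) L u + ∂_s u)(s, X_s^x) ds]. A minimal sketch of this identity, under assumptions that are not from the paper: X is a one-dimensional Brownian motion (so L is the second derivative), u(t, y) = exp(-t) cos(y) is a hypothetical bounded test function, and expectations are computed by Gauss-Hermite quadrature rather than by simulation.

```python
import numpy as np

# Probabilists' Gauss-Hermite rule: sum(w * F(z)) approximates E[F(Z)], Z ~ N(0, 1).
nodes, weights = np.polynomial.hermite_e.hermegauss(20)
weights = weights / np.sqrt(2.0 * np.pi)

def expect(func, s, x):
    """E[func(x + sqrt(s) Z)] for a standard normal Z."""
    return np.sum(weights * func(x + np.sqrt(s) * nodes))

def u(t, y):
    """Hypothetical bounded test function u(t, y) = exp(-t) cos(y)."""
    return np.exp(-t) * np.cos(y)

def drift_term(s, y):
    """(1/2) u_yy + d_s u for the u above, computed by hand."""
    return -1.5 * np.exp(-s) * np.cos(y)

x0, t = 0.7, 1.3
lhs = expect(lambda y: u(t, y), t, x0)

# Trapezoidal rule in s for the time integral of E[((1/2) L u + d_s u)(s, X_s)].
s_grid = np.linspace(0.0, t, 2001)
vals = np.array([expect(lambda y, s=s: drift_term(s, y), s, x0) for s in s_grid])
integral = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(s_grid))
rhs = u(0.0, x0) + integral

gap = abs(lhs - rhs)
print(gap)
```

In the theorem the same identity is applied with u replaced by ϕ and t = T, and the integrand is then bounded from below using the curvature condition.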

Remark 17.

We prove the entropic inequality Theorem 4 in this section without the assumption:

Γ_{1} (f, Γ_{1}^{z} (f, f)) = Γ_{1}^{z} (f, Γ_{1} (f, f))

. A similar entropic inequality under the assumption

Γ_{1} (f, Γ_{1}^{z} (f, f)) = Γ_{1}^{z} (f, Γ_{1} (f, f))

was first proven in [2] (Proposition 4.5 and Theorem 5.2). With this new inequality Theorem 4 in hand, similar gradient estimates and other inequalities from [2] follow. We leave them for future studies. Proposition 4.5 in [2] is based on a pointwise estimate given the commutative assumption of

Γ_{1}

and

Γ_{1}^{z}

. We removed the commutative assumption, and our estimate is in a weak form, which is presented in the above Lemma 21.

Author Contributions

Conceptualization, Q.F. and W.L.; methodology, Q.F. and W.L.; writing—original draft preparation, Q.F. and W.L.; writing—review and editing, Q.F. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

Wuchen Li is supported by AFOSR MURI FA9550-18-1-0502, the AFOSR YIP award: FA9550-23-1-0087, and NSF RTG: 2038080.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Degenerate SDEs and Sub-Riemannian Manifold

In this appendix, we briefly illustrate the formulation of the degenerate diffusion process and sub-Riemannian geometry.

For a smooth connected

n + m

-dimensional Riemannian manifold

M^{n + m}

, we denote

T M^{n + m}

as the tangent bundle of

M^{n + m}

and denote

τ

as a sub-bundle of

T M^{n + m}

. The sub-Riemannian structure associated with the sub-bundle

τ

on

M^{n + m}

is denoted as

(τ, g_{τ})

, where

g_{τ} (\cdot, \cdot)

is the metric associated with the sub-bundle

τ

. In particular, if we take distribution

τ

to be the horizontal sub-bundle, denoted as

H

, of the tangent bundle

T M^{n + m}

(see [2,51] for more details), then we denote the sub-Riemannian structure as

(M^{n + m}, H, g_{H})

. In this paper, we will not distinguish distributions

τ

and

H

and call this the horizontal sub-bundle. We assume that the horizontal distribution

H

is bracket-generating (of any step). The distribution

H

has dimension n.

For a vector field

b \in R^{n + m}

and a general matrix

a \in R^{(n + m) \times n}

, we denote

a = (a_{1}, a_{2}, \dots, a_{n})

with each

a_{i}, i = 1, \dots, n

, as an

n + m

-dimensional column vector. Consider the Stratonovich SDE

\begin{matrix} d X_{t} = b (X_{t}) d t + \sqrt{2} \sum_{i = 1}^{n} a_{i} (X_{t}) \circ d B_{t}^{i}, \end{matrix}

(A1)

where

(B_{t}^{1}, B_{t}^{2}, \dots, B_{t}^{n})

is an n-dimensional Brownian motion in

R^{n}

and

a_{i}

has local coordinates

a_{i} (x) = \sum_{\hat{i} = 1}^{n + m} a_{\hat{i} i} (x) \frac{\partial}{\partial x_{\hat{i}}}

. We consider (A1) as the SDE associated with a given sub-Riemannian structure, which is defined through the Lie algebra generated by the driving vector fields of the SDE

\{a_{1}, a_{2}, \dots, a_{n}\}

. In general, we assume that

H : = \{a_{1}, a_{2}, \dots, a_{n}\}

is of rank n and satisfies the bracket-generating condition (or Hörmander condition). To be precise, for any

x \in M^{n + m}

, the vector fields

\{a_{1} (x), a_{2} (x), \dots, a_{n} (x)\}

, together with their iterated Lie brackets, span the whole tangent space at x, which has dimension

n + m

. We define the manifold

M^{n + m}

as the subspace of

R^{n + m}

, on which the diffusion process

X_{t}

lives. This space is described by the triple

(M^{n + m}, H, g_{H})

, and we denote

H

as the n-dimensional horizontal distribution of the tangent bundle

T M^{n + m}

generated by the vector fields

\{a_{1} (x), a_{2} (x), \dots, a_{n} (x)\}

. In this paper, we considered the case where the generator of the diffusion process (A1) coincides with the horizontal Laplacian operator (or sub-Laplacian operator) associated with the sub-Riemannian structure

(M^{n + m}, H, g_{H})

. Furthermore, we assumed that there exists a symmetric and invariant volume measure associated with the horizontal Laplacian operator. The Stratonovich SDE (A1) without the drift (

d t

) term could be treated as a special case, where the horizontal Laplacian can be presented as the sum of squares of the horizontal vector fields in

H

. In particular, we considered the precise metric defined through the diffusion matrix a, which could be seen as an analogue for non-degenerate SDEs on Riemannian manifolds. The problem is that the rank of

a a^{T}

is n; thus, the

(n + m) \times (n + m)

matrix

a a^{T}

is degenerate and cannot serve as a metric. We thus introduce the following metric, which allows us to formulate this sub-Riemannian structure in Euclidean space.

Definition A1.

Consider an orthonormal family

c = \{c_{n + 1} (x), \dots, c_{n + m} (x)\}

in

R^{n + m}

, such that

a_{i}^{T} c_{j} = 0

for any

1 \leq i \leq n

,

n + 1 \leq j \leq m + n

. We define a metric

g = {(a a^{T} + c c^{T})}^{- 1} = {(a a^{T})}^{†} + c c^{T}

and a metric on the horizontal sub-bundle

g_{τ} = {(a a^{T})}^{†}

, the pseudo-inverse of matrix

a a^{T}

, on manifold

M^{n + m}

.

The above definition is based on the following lemma.

Lemma A1.

The metric is

g = {(a a^{T} + c c^{T})}^{- 1} = {(a a^{T})}^{†} + c c^{T}

.

Proof.

For rank n matrix

a a^{T}

, we denote its eigenvalue decomposition and the corresponding pseudo-inverse

{(a a^{T})}^{†}

as

\begin{matrix} a a^{T} = \sum_{i = 1}^{n} λ_{i} V_{i} V_{i}^{T}, {(a a^{T})}^{†} = \sum_{i = 1}^{n} \frac{1}{λ_{i}} V_{i} V_{i}^{T} . \end{matrix}

Thus, we have

a a^{T} + c c^{T} = \sum_{i = 1}^{n} λ_{i} V_{i} V_{i}^{T} + \sum_{j = n + 1}^{n + m} c_{j} c_{j}^{T}

. Furthermore, we have

\begin{matrix} a a^{T} + c c^{T} \\ = & (V_{1}, \dots, V_{n}, c_{n + 1}, \dots, c_{n + m}) (\begin{matrix} Λ_{n} & 0 \\ 0 & I_{m} \end{matrix}) {(V_{1}, \dots, V_{n}, c_{n + 1}, \dots, c_{n + m})}^{T}, \\ {(a a^{T} + c c^{T})}^{- 1} \\ = & (V_{1}, \dots, V_{n}, c_{n + 1}, \dots, c_{n + m}) (\begin{matrix} Λ_{n}^{- 1} & 0 \\ 0 & I_{m} \end{matrix}) {(V_{1}, \dots, V_{n}, c_{n + 1}, \dots, c_{n + m})}^{T}, \end{matrix}

where we denote

Λ_{n} = diag (λ_{1}, \dots, λ_{n})

as the diagonal matrix of the eigenvalues

λ_{i}

and

I_{m}

as the m-dimensional identity matrix. Thus, the proof follows directly with

\begin{matrix} {(a a^{T} + c c^{T})}^{- 1} = \sum_{i = 1}^{n} \frac{1}{λ_{i}} V_{i} V_{i}^{T} + \sum_{j = n + 1}^{n + m} c_{j} c_{j}^{T} = {(a a^{T})}^{†} + c c^{T} . \end{matrix}

□
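Lemma A1 can be sanity-checked numerically. The sketch below assumes a randomly drawn full-column-rank matrix `a` (a hypothetical example, not one from the paper) and takes `c` from a full QR factorisation, so that its columns are orthonormal and orthogonal to the columns of `a`, as required in Definition A1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
a = rng.standard_normal((n + m, n))      # random (n+m) x n matrix, rank n almost surely

# Full QR of a: the last m columns of Q are orthonormal and orthogonal to range(a).
Q, _ = np.linalg.qr(a, mode="complete")
c = Q[:, n:]

aaT = a @ a.T
g_inverse_form = np.linalg.inv(aaT + c @ c.T)      # (a a^T + c c^T)^{-1}
g_pinv_form = np.linalg.pinv(aaT) + c @ c.T        # (a a^T)^+ + c c^T

orthogonal = np.allclose(a.T @ c, 0.0)
forms_agree = np.allclose(g_inverse_form, g_pinv_form)
print(orthogonal, forms_agree)
```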

With the new metric introduced above, we have the following lemma.

Lemma A2.

The vectors

\{a_{1}, \dots, a_{n}\}

form an orthonormal family under the metric

g = {(a a^{T})}^{†} + c c^{T}

.

Proof.

Write

a = (a_{1}, \dots, a_{n})

with each column

{(a_{i})}_{(n + m) \times 1}

. We need to show that

\begin{matrix} a^{T} g a = a^{T} {(a a^{T})}^{†} a = {Id}_{n \times n} . \end{matrix}

Since

a_{i}^{T} c_{j} = 0

, the term involving c vanishes, and we only need to prove

a^{T} {(a a^{T})}^{†} a = {Id}_{n \times n}

. Let us denote

a^{T} {(a a^{T})}^{†} a = B

; then we have

\begin{matrix} a a^{T} {(a a^{T})}^{†} a a^{T} & = & a B a^{T} \\ a a^{T} & = & a B a^{T} \\ a^{T} a a^{T} a & = & a^{T} a B a^{T} a \\ {(a^{T} a)}^{- 1} a^{T} a a^{T} a {(a^{T} a)}^{- 1} & = & B \\ {Id}_{n \times n} & = & B, \end{matrix}

where the second equality follows from the property of the pseudo-inverse matrix and the last step follows from the fact that

a^{T} a

is a non-degenerate

n \times n

matrix, hence invertible. The proof then follows directly. □
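The two steps of this proof, the pseudo-inverse property and the conclusion B = Id, can be reproduced numerically. This is a sketch with a hypothetical random matrix `a` of full column rank; it does not depend on any particular sub-Riemannian structure.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 2, 3
a = rng.standard_normal((n + m, n))      # random (n+m) x n matrix, rank n almost surely
aaT = a @ a.T
pinv = np.linalg.pinv(aaT)               # Moore-Penrose pseudo-inverse (a a^T)^+

# Property used in the second line of the proof: (a a^T)(a a^T)^+ (a a^T) = a a^T.
penrose_ok = np.allclose(aaT @ pinv @ aaT, aaT)

# B = a^T (a a^T)^+ a should be the n x n identity.
B = a.T @ pinv @ a
orthonormal_ok = np.allclose(B, np.eye(n))
print(penrose_ok, orthonormal_ok)
```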

We are now ready to introduce the following definition.

Definition A2.

Define

(M^{n + m}, τ, g_{τ})

as the sub-Riemannian structure associated with the degenerate SDE (A1), where

g_{τ} = {(a a^{T})}^{†}

denotes the horizontal metric, i.e., metric g is restricted on the horizontal bundle τ. We denote

\nabla^{R}

as the Levi-Civita connection on

M^{n + m}

associated with our metric

g = {(a a^{T})}^{†} + c c^{T}

, and let

P^{τ} \nabla^{R}

be the projection of the connection on the horizontal distribution τ. In particular, in our framework, we have

P^{τ} \nabla^{R} f = a a^{T} \nabla f

, for any function

f : M^{n + m} \to R

, where ∇ is the Euclidean gradient in

R^{n + m} .

Remark A1.

In Lemma A2, we show that

{a_{1}, a_{2}, \dots, a_{n}}

form an orthonormal basis of the horizontal distribution τ under our metric g. In particular, we have

\begin{matrix} a a^{T} \nabla f = (a_{1}, \dots, a_{n}) (\begin{matrix} a_{1} \\ \dots \\ a_{n} \end{matrix}) f = \sum_{i = 1}^{n} (a_{i} f) a_{i} \in τ, \end{matrix}

which gives the local representation of

P^{τ} \nabla^{R} f .

To demonstrate the definition clearly, we give the following example. On the Heisenberg group

H^{1}

, we know that

X = \frac{\partial}{\partial x_{1}} - \frac{1}{2} x_{2} \frac{\partial}{\partial x_{3}}, Y = \frac{\partial}{\partial x_{2}} + \frac{1}{2} x_{1} \frac{\partial}{\partial x_{3}}, Z = \frac{\partial}{\partial x_{3}}

form an orthonormal basis for the tangent bundle of

H^{1}

. In particular, X and Y generate the horizontal distribution

τ

. If we start with the following SDE:

\begin{matrix} d W_{t} = X \circ d B_{t}^{1} + Y \circ d B_{t}^{2}, \end{matrix}

(A2)

then we know

W_{t} = (B_{t}^{1}, B_{t}^{2}, \frac{1}{2} \int_{0}^{t} B_{s}^{1} d B_{s}^{2} - B_{s}^{2} d B_{s}^{1})

, which is the horizontal Brownian motion on the Heisenberg group

H^{1}

. The generator of the horizontal Brownian motion and the sub-Laplacian operator are the same, which is given by

Δ_{H} = X^{2} + Y^{2}

, and the volume measure associated with

Δ_{H}

is the Lebesgue measure on the Heisenberg group with the volume element equal to 1. Then,

W_{t}

is a diffusion process in

R^{3}

. In terms of our general sub-Riemannian structure introduced above, we can define

\begin{matrix} a = (\begin{matrix} 1 & 0 \\ 0 & 1 \\ - \frac{x_{2}}{2} & \frac{x_{1}}{2} \end{matrix}) = (a_{1}, a_{2}) = (X, Y), c = \frac{1}{\sqrt{\frac{x_{1}^{2}}{4} + \frac{x_{2}^{2}}{4} + 1}} (\begin{matrix} \frac{x_{2}}{2} \\ - \frac{x_{1}}{2} \\ 1 \end{matrix}), \end{matrix}

and

\begin{matrix} g_{H^{1}, τ} = {(a a^{T})}^{†} = {(\begin{matrix} 1 & 0 & - \frac{x_{2}}{2} \\ 0 & 1 & \frac{x_{1}}{2} \\ - \frac{x_{2}}{2} & \frac{x_{1}}{2} & \frac{x_{1}^{2} + x_{2}^{2}}{4} \end{matrix})}^{†}, g_{H^{1}} = {(a a^{T})}^{†} + c c^{T} . \end{matrix}

In particular, the horizontal gradient is given by

\begin{matrix} a a^{T} \nabla f = (\begin{matrix} X f \\ Y f \\ - \frac{x_{2}}{2} X f + \frac{x_{1}}{2} Y f \end{matrix}) = X f (\begin{matrix} 1 \\ 0 \\ - \frac{x_{2}}{2} \end{matrix}) + Y f (\begin{matrix} 0 \\ 1 \\ \frac{x_{1}}{2} \end{matrix}) = (X f) X + (Y f) Y . \end{matrix}

Thus, the sub-Riemannian structure associated with Stratonovich SDE (A2) is just

(H^{1}, τ, g_{H^{1}, τ})

, where

g_{H^{1}, τ}

is the restriction of metric

g_{H^{1}}

to the horizontal sub-bundle

τ

. Unlike the standard construction of Brownian motion on a given Riemannian (sub-Riemannian) manifold by Eells–Elworthy–Malliavin [40,43], we can directly define our diffusion on the manifold

M^{n + m}

by (A1) without performing a projection from the orthonormal frame bundle. This is because, under the new metric

g = {(a a^{T})}^{†} + c c^{T}

, the vectors

{a_{1}, a_{2}, \dots, a_{n}}

form a globally defined orthonormal basis of the (horizontal) sub-bundle of the tangent bundle

T M^{n + m}

. Essentially, we first define (A1) in

R^{n + m}

and then introduce the associated sub-Riemannian structure.
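The horizontal Brownian motion (A2) on the Heisenberg group can be sketched with a plain Euler scheme. For the fields X = (1, 0, -x_2/2) and Y = (0, 1, x_1/2), the Stratonovich-to-Itô correction appears to vanish in these coordinates (each field is constant along its own direction), so the Itô Euler step is used directly. The function name `simulate_heisenberg_bm`, the step count, and the seed are illustrative assumptions.

```python
import math
import random

def simulate_heisenberg_bm(T=1.0, steps=1000, seed=0):
    """Euler scheme for dW = X o dB^1 + Y o dB^2 started at the origin.

    Returns the end point (x1, x2, x3) and the left-point discrete
    Levy area (1/2) * sum(B^1 dB^2 - B^2 dB^1) along the same path.
    """
    rng = random.Random(seed)
    dt = T / steps
    x1 = x2 = x3 = 0.0
    area = 0.0
    for _ in range(steps):
        dB1 = rng.gauss(0.0, math.sqrt(dt))
        dB2 = rng.gauss(0.0, math.sqrt(dt))
        # discrete Levy area increment; here x1 = B^1 and x2 = B^2
        area += 0.5 * (x1 * dB2 - x2 * dB1)
        # third component of X dB^1 + Y dB^2 at the current point
        dx3 = -0.5 * x2 * dB1 + 0.5 * x1 * dB2
        x1, x2, x3 = x1 + dB1, x2 + dB2, x3 + dx3
    return (x1, x2, x3), area

end_point, levy_area = simulate_heisenberg_bm(steps=200, seed=1)
print(end_point, levy_area)
```

The third coordinate of the simulated path coincides, up to rounding, with the discrete Lévy area of the two driving Brownian motions, which mirrors the closed form of W_t given above. No frame bundle or projection is needed; only the matrix a enters the scheme.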

Remark A2.

In the definition of horizontal Brownian motion introduced in [38], the sub-Riemannian structure comes first, in the form of a totally geodesic Riemannian foliation, and SDE (A2) is then defined on that foliation. In the current setting, by contrast, we first define the degenerate diffusion process from a given matrix a and then define the sub-Riemannian structure by introducing the new metric

{(a a^{T})}^{†} + c c^{T}

.

Proof of Gradient Flow Assumption

In this subsection, we demonstrate that Equation (6) is in fact a Fokker–Planck equation of SDE (A1).

Lemma A3.

Consider the drift–diffusion process:

\begin{matrix} d X_{t} = b (X_{t}) d t + \sqrt{2} a (X_{t}) \circ d B_{t}, \end{matrix}

(A3)

Suppose that b, a, π satisfy

\begin{matrix} a \otimes \nabla a - b = - a a^{T} \nabla log π . \end{matrix}

Then, the Fokker–Planck equation of

X_{t}

satisfies

\partial_{t} ρ (t, x) = \nabla \cdot (ρ (t, x) (a (x) a {(x)}^{T}) \nabla log \frac{ρ (t, x)}{π (x)}) .

Proof.

Recall that we denote

{a_{1}, \dots, a_{n}}

as the column vectors of matrix a. For Stratonovich SDE (A3), we can write

\begin{matrix} d X_{t} = b (X_{t}) d t + \sqrt{2} \sum_{i = 1}^{n} a_{i} (X_{t}) \circ d B_{t}^{i} . \end{matrix}

According to [28] (Appendix 7), the corresponding Itô SDE is

\begin{matrix} d X_{t} = \sqrt{2} \sum_{i = 1}^{n} a_{i} d B_{t}^{i} + (\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i} + b) d t . \end{matrix}

Thus, the Fokker–Planck equation (Kolmogorov forward equation) reads

\begin{matrix} \partial_{t} ρ (t, x) & = & \sum_{i^{'} = 1}^{n + m} \sum_{j^{'} = 1}^{n + m} \frac{\partial^{2}}{\partial x_{i^{'}} \partial x_{j^{'}}} ({(a a^{T})}_{i^{'} j^{'}} ρ) - \nabla \cdot ((\sum_{i = 1}^{n} \nabla_{a_{i}} a_{i} + b) ρ) \\ & = & \nabla \cdot (a a^{T} \nabla ρ) + \nabla \cdot (ρ {(\sum_{j^{'} = 1}^{n + m} \frac{\partial}{\partial x_{j^{'}}} {(a a^{T})}_{i^{'} j^{'}})}_{i^{'} = 1}^{n + m} - ρ \sum_{i = 1}^{n} \nabla_{a_{i}} a_{i} - b ρ) \\ & = & \nabla \cdot (a a^{T} \nabla ρ) + \nabla \cdot (ρ (a \otimes \nabla a - b)) . \end{matrix}

Namely, we have

\begin{matrix} \partial_{t} ρ (t, x) = \nabla \cdot (a a^{T} \nabla ρ) + \nabla \cdot (ρ (a \otimes \nabla a - b)) . \end{matrix}

(A4)

Plugging in the relation

a \otimes \nabla a - b = - a a^{T} \nabla log π

, we have

\begin{matrix} \partial_{t} ρ (t, x) & = & \nabla \cdot (a a^{T} \nabla ρ) - \nabla \cdot (ρ a a^{T} \nabla log π) \\ & = & \nabla \cdot (ρ a a^{T} \nabla log ρ) - \nabla \cdot (ρ a a^{T} \nabla log π) \\ & = & \nabla \cdot (ρ a a^{T} \nabla log \frac{ρ}{π}) . \end{matrix}

(A5)

Here, we use the fact that

ρ \nabla log ρ = \nabla ρ .

This finishes the proof. □
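As a sanity check, the steps of the proof can be followed in one dimension. The block below is a sketch under the assumptions n = 1 and m = 0, with a, b, and π scalar functions of x. It reads a ⊗ ∇a off the last two lines of the derivation above as ∇·(a a^T) minus the sum of the ∇_{a_i} a_i, which in one dimension equals a a'.

```latex
% One-dimensional illustration (assumed: n = 1, m = 0; a, b, \pi scalar functions of x).
% Assumption: a \otimes \nabla a = (a^2)' - a a' = a a', read off from the proof of Lemma A3.
\begin{aligned}
\text{Stratonovich:}\quad & dX_t = b\,dt + \sqrt{2}\,a \circ dB_t, \\
\text{It\^o:}\quad & dX_t = (b + a a')\,dt + \sqrt{2}\,a\,dB_t, \\
\text{Fokker--Planck:}\quad & \partial_t \rho = \partial_{xx}(a^2 \rho) - \partial_x\big((b + a a')\rho\big)
  = \partial_x(a^2 \partial_x \rho) + \partial_x\big(\rho\,(a a' - b)\big), \\
\text{condition:}\quad & a a' - b = -a^2 (\log \pi)', \\
\text{hence:}\quad & \partial_t \rho = \partial_x\Big(\rho\, a^2\, \partial_x \log \frac{\rho}{\pi}\Big).
\end{aligned}
```

With a ≡ 1, the condition reduces to b = (log π)', and the equation becomes the familiar Fokker–Planck equation of overdamped Langevin dynamics, for which ρ = π is stationary.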

Example A1.

The Lie group

SU (2)

is a compact connected Lie group, diffeomorphic to the three-sphere

S^{3}

. Following the construction of the left-invariant vector fields in [41] (Section 6.2), we work in the coordinate system

(θ, ϕ, ψ)

. We obtain new left-invariant vector fields on

SU (2)

, with

\begin{matrix} X & = & cos ψ \frac{\partial}{\partial θ} + \frac{sin ψ}{sin θ} \frac{\partial}{\partial ϕ} - cos θ \frac{sin ψ}{sin θ} \frac{\partial}{\partial ψ}, \\ Y & = & - sin ψ \frac{\partial}{\partial θ} + \frac{cos ψ}{sin θ} \frac{\partial}{\partial ϕ} - cos θ \frac{cos ψ}{sin θ} \frac{\partial}{\partial ψ}, \\ Z & = & \frac{\partial}{\partial ψ} . \end{matrix}

Thus, we have

a = (a_{1}, a_{2}) = (X, Y)

in the new coordinate system. We define the metric

g = {(a a^{T})}^{†}

. Here,

X, Y

form an orthonormal basis of the horizontal bundle that they generate, under the metric

{(a a^{T})}^{†}

. According to [41] (Lemma 6.4), the invariant measure on

SU (2)

has the form of

μ = sin (θ) d θ \land d ϕ \land d ψ

. It is easy to check that the assumption of Lemma A3 is satisfied for

b = 0

,

π = sin (θ)

, and

\begin{matrix} a a^{T} \nabla log π = - a \otimes \nabla a = (\begin{matrix} \frac{cos θ}{sin θ} \\ 0 \\ 0 \end{matrix}), \end{matrix}

where

\begin{matrix} a = (\begin{matrix} cos ψ & - sin ψ \\ \frac{sin ψ}{sin θ} & \frac{cos ψ}{sin θ} \\ - cos θ \frac{sin ψ}{sin θ} & - cos θ \frac{cos ψ}{sin θ} \end{matrix}) . \end{matrix}
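The identity claimed in Example A1 can be checked numerically at a sample point. The sketch below uses central finite differences and assumes that a ⊗ ∇a is the quantity read off from the proof of Lemma A3, namely the row-wise divergence of a a^T minus the sum of the ∇_{a_i} a_i (with ∇_{a_i} a_i taken as the Euclidean directional derivative in the coordinates (θ, ϕ, ψ)). The function names, the sample point, and the step size are illustrative.

```python
import math

def frame(p):
    """Columns a_1 = X and a_2 = Y at p = (theta, phi, psi)."""
    th, _, ps = p
    X = [math.cos(ps), math.sin(ps) / math.sin(th),
         -math.cos(th) * math.sin(ps) / math.sin(th)]
    Y = [-math.sin(ps), math.cos(ps) / math.sin(th),
         -math.cos(th) * math.cos(ps) / math.sin(th)]
    return [X, Y]

def aaT(p):
    """The 3x3 matrix a a^T at p."""
    cols = frame(p)
    return [[sum(v[i] * v[j] for v in cols) for j in range(3)] for i in range(3)]

def d(f, p, k, h=1e-5):
    """Central difference of the scalar function f in coordinate k."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    return (f(up) - f(dn)) / (2 * h)

def a_tensor_grad_a(p):
    """Assumed form: div(a a^T) - sum_i grad_{a_i} a_i, component by component."""
    cols = frame(p)
    out = []
    for i in range(3):
        div_row = sum(d(lambda q: aaT(q)[i][j], p, j) for j in range(3))
        directional = sum(v[k] * d(lambda q: frame(q)[idx][i], p, k)
                          for idx, v in enumerate(cols) for k in range(3))
        out.append(div_row - directional)
    return out

def aaT_grad_log_pi(p):
    """a a^T grad log(pi) with pi = sin(theta)."""
    grad = [math.cos(p[0]) / math.sin(p[0]), 0.0, 0.0]
    M = aaT(p)
    return [sum(M[i][j] * grad[j] for j in range(3)) for i in range(3)]

p = (1.1, 0.4, 0.7)  # sample point with sin(theta) != 0
left = aaT_grad_log_pi(p)
right = [-v for v in a_tensor_grad_a(p)]
print(left, right)
```

Both vectors should agree with (cos θ / sin θ, 0, 0) up to the finite-difference error, consistent with b = 0 and π = sin(θ) in the example.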

References

1. Bakry, D.; Émery, M. Diffusions hypercontractives. In Séminaire de Probabilités XIX 1983/84; Springer: Berlin/Heidelberg, Germany, 1985; pp. 177–206.
2. Baudoin, F.; Garofalo, N. Curvature-dimension inequalities and Ricci lower bounds for sub-Riemannian manifolds with transverse symmetries. J. EMS 2017, 19, 151–219.
3. Arnold, A.; Carlen, E. A generalized Bakry–Émery condition for non-symmetric diffusions. In Proceedings of the EQUADIFF 99: International Conference on Differential Equations, Berlin, Germany, 1–7 August 1999; pp. 732–734.
4. Li, W. Transport information geometry: Riemannian calculus on probability simplex. Inf. Geom. 2022, 5, 161–207.
5. Otto, F. The geometry of dissipative evolution equations the porous medium equation. Commun. Partial Differ. Equ. 2001, 26, 101–174.
6. Otto, F.; Villani, C. Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality. J. Funct. Anal. 2000, 173, 361–400.
7. Baudoin, F. Wasserstein contraction properties for hypoelliptic diffusions. arXiv 2016, arXiv:1602.04177.
8. Baudoin, F. Bakry–Émery meet Villani. J. Funct. Anal. 2017, 273, 2275–2291.
9. Baudoin, F.; Bonnefont, M.; Garofalo, N. A sub-Riemannian curvature-dimension inequality, volume doubling property and the Poincaré inequality. Math. Ann. 2014, 358, 833–860.
10. Baudoin, F.; Gordina, M.; Herzog, D.P. Gamma calculus beyond Villani and explicit convergence estimates for Langevin dynamics with singular potentials. Arch. Ration. Mech. Anal. 2021, 241, 765–804.
11. Baudoin, F.; Grong, E.; Kuwada, K.; Thalmaier, A. Sub-Laplacian comparison theorems on totally geodesic Riemannian foliations. Calc. Var. 2019, 58, 130.
12. Baudoin, F.; Wang, J. Curvature dimension inequalities and subelliptic heat kernel gradient bounds on contact manifolds. Potential Anal. 2014, 40, 163–193.
13. Feng, Q. Harnack inequalities on totally geodesic foliations with transverse Ricci flow. arXiv 2017, arXiv:1712.02275.
14. Grong, E.; Thalmaier, A. Curvature-dimension inequalities on sub-Riemannian manifolds obtained from Riemannian foliations: Part I. Math. Z. 2015, 282, 99–130.
15. Grong, E.; Thalmaier, A. Curvature-dimension inequalities on sub-Riemannian manifolds obtained from Riemannian foliations: Part II. Math. Z. 2015, 282, 131–164.
16. Agrachev, A.; Lee, P. Optimal transportation under nonholonomic constraints. Trans. Am. Math. Soc. 2009, 361, 6019–6047.
17. Figalli, A.; Rifford, L. Mass transportation on sub-Riemannian manifolds. Geom. Funct. Anal. 2010, 20, 124–159.
18. Juillet, N. Diffusion by optimal transport in Heisenberg groups. Calc. Var. Partial Differ. Equ. 2014, 50, 693–721.
19. Khesin, B.; Lee, P. A nonholonomic Moser theorem and optimal transport. J. Symplectic Geom. 2009, 7, 381–414.
20. Lott, J.; Villani, C. Ricci Curvature for Metric-Measure Spaces via Optimal Transport. Ann. Math. 2009, 169, 903–991.
21. Sturm, K.-T. On the Geometry of Metric Measure Spaces. Acta Math. 2006, 196, 65–131.
22. Lafferty, J.D. The Density Manifold and Configuration Space Quantization. Trans. Am. Math. Soc. 1988, 305, 699–741.
23. Jüngel, A. Entropy Methods for Diffusive Partial Differential Equations; Springer: Berlin/Heidelberg, Germany, 2016.
24. Markowich, P.A.; Villani, C. On the Trend to Equilibrium for the Fokker–Planck Equation: An Interplay between Physics and Functional Analysis. Mat. Contemp. 1999, 19, 1–29.
25. Arnold, A.; Einav, A.; Wöhrer, T. On the rates of decay to equilibrium in degenerate and defective Fokker-Planck equations. J. Differ. Equ. 2018, 264, 6843–6872.
26. Arnold, A.; Erb, J. Sharp entropy decay for hypocoercive and non-symmetric Fokker–Planck equations with linear drift. arXiv 2014, arXiv:1409.5425.
27. Karatzas, I.; Shreve, S.E. Brownian Motion and Stochastic Calculus, 2nd ed.; Graduate Texts in Mathematics; Springer: New York, NY, USA, 1991; Volume 113.
28. Baudoin, F. An Introduction to the Geometry of Stochastic Flows; World Scientific: Singapore, 2004.
29. Stroock, D.W. Partial differential equations for probabilists. In Cambridge Studies in Advanced Mathematics; Cambridge University Press: Cambridge, UK, 2008; Volume 112.
30. Bismut, J.M. Martingales, the Malliavin calculus and hypoellipticity under general Hörmander’s conditions. In Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete; Springer: Berlin/Heidelberg, Germany, 1981; Volume 56, pp. 469–505.
31. Hörmander, L. Hypoelliptic second-order differential equations. Acta Math. 1967, 119, 147–171.
32. Arous, B.; Léandre, R. Décroissance exponentielle du noyau de la chaleur sur la diagonale (II). Probab. Theory Relat. Fields 1991, 90, 377–402.
33. Barlow, M.; Nualart, D. Lectures on Probability Theory and Statistics. In Ecole d’Ete de Probabilites de Saint-Flour XXV; Springer: Berlin/Heidelberg, Germany, 1995.
34. Baudoin, F.; Nualart, E.; Ouyang, C.; Tindel, S. On probability laws of solutions to differential systems driven by a fractional Brownian motion. Ann. Probab. 2016, 44, 2554–2590.
35. Feng, Q.; Li, W. Hypoelliptic entropy dissipation for stochastic differential equations. arXiv 2021, arXiv:2102.00544.
36. Agrachev, A.; Barilari, D.; Boscain, U. On the Hausdorff volume in sub-Riemannian geometry. Calc. Var. Partial Differ. Equ. 2012, 43, 355–388.
37. Barilari, D.; Rizzi, L. A formula for Popp’s volume in sub-Riemannian geometry. Anal. Geom. Metr. Spaces 2013, 1, 42–57.
38. Baudoin, F.; Feng, Q.; Gordina, M. Integration by parts and quasi-invariance for the horizontal Wiener measure on foliated compact manifolds. J. Funct. Anal. 2019, 277, 1362–1422.
39. Eldredge, N.; Gordina, M.; Saloff-Coste, L. Left-invariant geometries on SU(2) are uniformly doubling. Geom. Funct. Anal. 2018, 28, 1321–1367.
40. Elworthy, K.D. Stochastic Differential Equations on Manifolds; Cambridge University Press: Cambridge, UK, 1982; Volume 70.
41. Gordina, M.; Laetsch, T. Sub-Laplacians on sub-Riemannian manifolds. Potential Anal. 2016, 44, 811–837.
42. Gordina, M.; Laetsch, T. A convergence to Brownian motion on sub-Riemannian manifolds. Trans. Am. Math. Soc. 2017, 369, 6263–6278.
43. Malliavin, P. Stochastic Analysis. In Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]; Springer: Berlin/Heidelberg, Germany, 1997; Volume 313.
44. Baudoin, F.; Feng, Q. Log-Sobolev inequalities on the horizontal path space of a totally geodesic foliation. arXiv 2015, arXiv:1503.08180.
45. Inglis, J.; Papageorgiou, I. Logarithmic Sobolev inequalities for infinite-dimensional Hörmander type generators on the Heisenberg group. Potential Anal. 2009, 31, 79–102.
46. Baudoin, F.; Bonnefont, M. Log-Sobolev inequalities for subelliptic operators satisfying a generalized curvature dimension inequality. J. Funct. Anal. 2012, 262, 2646–2676.
47. Wang, F.-Y. Logarithmic Sobolev inequalities on noncompact Riemannian manifolds. Probab. Theory Relat. Fields 1997, 109, 417–424.
48. Woit, P. Quantum Theory, Groups and Representations; Springer: Berlin/Heidelberg, Germany, 2017.
49. Baudoin, F.; Cecil, M. The subelliptic heat kernel on the three-dimensional solvable Lie groups. Forum Math. 2015, 27, 2051–2086.
50. Li, W. Diffusion Hypercontractivity via Generalized Density Manifold. arXiv 2019, arXiv:1907.12546.
51. Baudoin, F. Sub-Laplacians and hypoelliptic operators on totally geodesic Riemannian foliations. arXiv 2014, arXiv:1410.3268.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
