Article

Entropy Dissipation for Degenerate Stochastic Differential Equations via Sub-Riemannian Density Manifold

1 Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
2 Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA
* Author to whom correspondence should be addressed.
Entropy 2023, 25(5), 786; https://doi.org/10.3390/e25050786
Submission received: 29 March 2023 / Revised: 24 April 2023 / Accepted: 9 May 2023 / Published: 11 May 2023
(This article belongs to the Special Issue Information Geometry and Its Applications)

Abstract

We studied the dynamical behaviors of degenerate stochastic differential equations (SDEs). We selected an auxiliary Fisher information functional as the Lyapunov functional. Using generalized Fisher information, we conducted the Lyapunov exponential convergence analysis of degenerate SDEs. We derived the convergence rate condition by generalized Gamma calculus. Examples of the generalized Bochner’s formula are provided in the Heisenberg group, displacement group, and Martinet sub-Riemannian structure. We show that the generalized Bochner’s formula follows a generalized second-order calculus of Kullback–Leibler divergence in density space embedded with a sub-Riemannian-type optimal transport metric.

1. Introduction

Consider the following Stratonovich stochastic differential equation:
$$dX_t = b(X_t)\,dt + \sqrt{2}\,a(X_t)\circ dB_t, \tag{1}$$
where $(B_t^1, B_t^2, \ldots, B_t^n)$ is an n-dimensional Brownian motion in $\mathbb{R}^n$, $a:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times n}$ is a matrix-valued function, and $b:\mathbb{R}^{n+m}\to\mathbb{R}^{n+m}$ is a drift vector field. The convergence analysis of SDE (1) to its invariant distribution lies in the intersection of differential geometry, analysis, the Lie group (subgroup in quantum mechanics), and probability. The convergence analysis also has broad applications in designing fast algorithms in artificial intelligence (AI) and Bayesian sampling/optimization problems. One key question arises: How fast does the probability density function of SDE (1) converge to its invariant distribution?
The Gamma calculus, also named Bakry–Émery iterative calculus [1], provides analytical approaches to derive the convergence rate for SDE (1). The rate is controlled by a lower bound on a curvature-type tensor, known as the Ricci curvature lower bound. However, classical studies are limited to non-degenerate diffusion coefficient matrices a. The classical Gamma calculus is no longer valid when a is a degenerate matrix function; see the generalization of Bakry–Émery calculus in [2].
This paper presents a Lyapunov convergence analysis for the degenerate diffusion process. We selected a class of z-Fisher information as the Lyapunov functional, where z is a matrix function different from matrix a. We derived a generalized Gamma calculus by the dissipation of the Lyapunov functional along the diffusion process. We then derived the generalized Bochner’s formula and obtained the exponential convergence condition. Several concrete examples are presented: gradient-drift–diffusions on the Heisenberg group, the displacement group, and the Martinet sub-Riemannian structure. Our approach extends the classical optimal transport geometry, in particular the second-order calculus of the relative entropy in the density manifold studied in [3,4,5,6].
The generalized Gamma calculus was first introduced by Baudoin–Garofalo [2] for sub-Riemannian manifolds. Related results were studied later in [7,8,9,10,11,12,13,14,15]. The commutative property of the iteration of $\Gamma_1$ and $\Gamma_1^z$ (Hypothesis 1.2 in [2]) was crucial in the previous works. Our algebraic Condition 1 does not have this requirement. We can remove this commutative condition in the weak sense. Thus, our results go beyond the step-two bracket-generating condition. We present algebraic conditions for the existence of the generalized Bochner’s formula.
On the other hand, optimal transport on sub-Riemannian manifolds was studied in [16,17,18,19]. An optimal transport metric on a sub-Riemannian manifold was proposed in [18,19]. In this case, the density manifold still forms an infinite-dimensional Riemannian manifold. The Monge–Ampère equation in sub-Riemannian settings was studied in [17]. Our approach is different. We introduce the sub-Riemannian density manifold (SDM) and study the second-order geometric calculations of relative entropies in the SDM. Using these, we propose a new Gamma z calculus for degenerate stochastic differential equations and establish a generalized curvature-dimension-type bound. In addition, Refs. [20,21] used the analytical properties of optimal transport to formulate the Ricci curvature lower bound in general metric spaces. Different from [20,21,22], we focus on the geometric calculations in the density manifold introduced by the z direction. Following the second-order geometric calculations in the density manifold, we formulate the new Gamma calculus and the corresponding Ricci curvature tensor for the sub-Riemannian manifold. Our derivation also relates to the entropy methods [23,24]. Using entropy methods, Refs. [25,26] derived the convergence rate for degenerate drift–diffusion processes with constant diffusion coefficients a. Compared to previous works, we apply the entropy method together with Gamma calculus and geometric calculations in the density manifold. This yields a generalized Gamma calculus from the dissipation of the auxiliary Fisher information. Several concrete examples of convergence conditions are derived for Lie-group-induced drift–diffusion processes.
We organize the paper as follows. We introduce the main result in Section 2. It is an explicit convergence rate condition for the density of degenerate SDEs in the $L^1$ distance. In Section 3, we provide three examples of the proposed convergence analysis, including gradient-drift–diffusions on the Heisenberg group, the displacement group, and the Martinet sub-Riemannian structure. In Section 4, we present the Lyapunov analysis in the sub-Riemannian density manifold. The generalized Gamma calculus and the proof of the generalized Bochner’s formula are presented in Section 5. Some further discussions of other functional inequalities are presented in Section 6.

2. Main Results

In this section, we present this paper’s setting and main results.

2.1. Setting

Consider a Stratonovich SDE:
$$dX_t = b(X_t)\,dt + \sqrt{2}\,a(X_t)\circ dB_t, \tag{2}$$
where $(B_t^1, B_t^2, \ldots, B_t^n)$ is an n-dimensional Brownian motion in $\mathbb{R}^n$, $a:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times n}$ is a matrix-valued function, and $b:\mathbb{R}^{n+m}\to\mathbb{R}^{n+m}$ is a vector field. We refer to [27] (Section 3.13) for the definition of the Stratonovich SDE. According to [28] (Appendix A.7), the SDE (2) can also be written as the following Itô SDE:
$$dX_{l,t} = \tilde b_l(X_t)\,dt + \sum_{i=1}^n \sqrt{2}\,a_{li}(X_t)\,dB_t^i, \quad \text{for } l = 1,\ldots,n+m,$$
where
$$\tilde b_l = b_l + \Big(\sum_{i=1}^n \nabla_{a_i} a_i\Big)_l, \quad \text{for } l = 1,\ldots,n+m.$$
We denote $\{a_1,\ldots,a_n\}$ as the column vectors of matrix a, and $\sum_{i=1}^n \nabla_{a_i} a_i \in \mathbb{R}^{n+m}$ represents
$$\Big(\sum_{i=1}^n \nabla_{a_i} a_i\Big)_l = \sum_{i=1}^n\sum_{k=1}^{n+m} a_{ki}\frac{\partial a_{li}}{\partial x_k}, \quad \text{for } l = 1,\ldots,n+m.$$
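The Stratonovich-to-Itô drift correction above can be checked numerically. The following is a minimal sketch, not the authors' code: it approximates $\sum_i \nabla_{a_i} a_i$ by central finite differences. The function names (`ito_correction`, `heisenberg_a`, `scalar_a`) are hypothetical, the Heisenberg matrix is borrowed from Section 3.1, and the scalar example $a(x) = x$ is an illustrative assumption.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]  # rows indexed by l (state component), columns by i (noise index)


def ito_correction(a: Callable[[Vector], Matrix], x: Vector, h: float = 1e-5) -> Vector:
    """Approximate (sum_i nabla_{a_i} a_i)_l = sum_i sum_k a_{ki} * d a_{li} / d x_k
    at the point x, using central finite differences with step h."""
    a0 = a(x)
    dim, n = len(a0), len(a0[0])
    corr = [0.0] * dim
    for k in range(dim):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        ap, am = a(xp), a(xm)
        for l in range(dim):
            for i in range(n):
                d_ali_dxk = (ap[l][i] - am[l][i]) / (2.0 * h)
                corr[l] += a0[k][i] * d_ali_dxk
    return corr


def heisenberg_a(x: Vector) -> Matrix:
    # Columns a_1 = (1, 0, -y/2) and a_2 = (0, 1, x/2), as in Section 3.1.
    return [[1.0, 0.0], [0.0, 1.0], [-x[1] / 2.0, x[0] / 2.0]]


def scalar_a(x: Vector) -> Matrix:
    # Hypothetical one-dimensional example a(x) = x, for which nabla_a a = x.
    return [[x[0]]]


heis_corr = ito_correction(heisenberg_a, [0.3, -1.2, 2.0])
scalar_corr = ito_correction(scalar_a, [1.5])
print(heis_corr, scalar_corr)
```

For the Heisenberg matrix the correction vanishes, so the Stratonovich and Itô drifts coincide in that example; for the scalar example the correction equals x.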
We denote $a^T$ as the transpose of matrix a and denote $\{a_1^T,\ldots,a_n^T\}$ as the row vectors of matrix $a^T$. In particular, we have $a_{\hat i i} = a^T_{i\hat i}$, for $i = 1,\ldots,n$ and $\hat i = 1,\ldots,n+m$. With some abuse of notation, we also denote $a_i^T$ as the vector fields corresponding to the row vectors $a_i^T$, for $i = 1,\ldots,n$. We assumed that $\{a_1^T(x), a_2^T(x),\ldots,a_n^T(x)\}$ satisfies the strong Hörmander condition (or bracket-generating condition):
$$\operatorname{Span}\Big\{a_1^T(x),\ldots,a_n^T(x),\ [a_{i_1}^T,\cdots,[a_{i_{k-1}}^T,a_{i_k}^T]\cdots](x),\ 1\le i_1,\ldots,i_k\le n,\ k\ge 2\Big\} = \mathbb{R}^{n+m},$$
where $[\cdot,\cdot]$ represents the Lie bracket between two vector fields. The strong Hörmander condition means that the Lie algebra generated by the vector fields $\{a_1^T(x),\ldots,a_n^T(x)\}$ is of full rank at every point $x\in\mathbb{R}^{n+m}$ (see, e.g., [29] (Section 7.4)). This condition ensures the existence of a smooth probability density function of SDE (2); see the original proofs in [30,31]. For the simplicity of presentation, we assumed the probability density function is strictly positive. Indeed, the positivity of the density follows from the Hörmander condition [32]; for the more technical conditions to show the positivity by using Malliavin calculus, we refer to [33,34] (Theorem 1.4 with H = 1/2). Denote $X_t\sim\rho(t,x)$, where $\rho=\rho(t,x)$ is the probability density function of SDE (2). The density function $\rho$ satisfies the Fokker–Planck equation of SDE (2):
$$\partial_t\rho(t,x) = -\nabla_x\cdot\big(\rho(t,x)\tilde b(x)\big) + \sum_{i=1}^{n+m}\sum_{j=1}^{n+m}\frac{\partial^2}{\partial x_i\partial x_j}\Big(\big(a(x)a(x)^T\big)_{ij}\,\rho(t,x)\Big), \tag{6}$$
with a smooth initial condition:
$$\rho_0(x) = \rho(0,x), \quad \int_{\mathbb{R}^{n+m}}\rho_0(x)\,dx = 1, \quad \rho_0(x) > 0.$$
In this paper, we assumed that SDE (2) has a unique invariant symmetric measure $\mu$, where $d\mu = \pi(x)\,dx$ with $\pi\in C^\infty(\mathbb{R}^{n+m})$. Here, $\pi$ is a stationary solution of Fokker–Planck Equation (6):
$$-\nabla_x\cdot\big(\pi(x)\tilde b(x)\big) + \sum_{i=1}^{n+m}\sum_{j=1}^{n+m}\frac{\partial^2}{\partial x_i\partial x_j}\Big(\big(a(x)a(x)^T\big)_{ij}\,\pi(x)\Big) = 0.$$
We studied a particular class of the vector field b for a given invariant distribution π .
Assumption (Gradient flow formulation): Suppose that b, a, and π satisfy the relation:
$$b = a\otimes\nabla a + aa^T\nabla\log\pi, \tag{7}$$
where $a\otimes\nabla a\in\mathbb{R}^{n+m}$ represents, for $\hat k = 1,\ldots,n+m$,
$$(a\otimes\nabla a)_{\hat k} = \sum_{k=1}^n\sum_{k'=1}^{n+m} a_{\hat k k}\frac{\partial}{\partial x_{k'}}a_{k'k}.$$
In the Itô formulation, $\tilde b$, a, and $\pi$ satisfy
$$\tilde b_l = (aa^T\nabla\log\pi)_l + (a\otimes\nabla a)_l + \Big(\sum_{i=1}^n\nabla_{a_i}a_i\Big)_l,$$
for $l = 1,\ldots,n+m$. In this case, we can reformulate Equation (6) as
$$\partial_t\rho(t,x) = \nabla\cdot\Big(\rho(t,x)\,a(x)a(x)^T\nabla\log\frac{\rho(t,x)}{\pi(x)}\Big). \tag{9}$$
We leave the derivation of Formula (9) in Appendix A. If $\rho(t,x) = \pi(x)$, then $\nabla\log\frac{\rho(t,x)}{\pi(x)} = 0$, and $\pi$ is an invariant density function for SDE (2). In Section 4, we demonstrate that Fokker–Planck Equation (6), or its equivalent Formulation (9), forms a “horizontal” gradient flow in the sub-Riemannian density manifold. We designed a Lyapunov functional to study the convergence behavior of this “horizontal” gradient flow (9).
Remark 1.
Formula (9) can be written as
$$\partial_t\big(\log\rho(t,x) - \log\pi(x)\big)\,\rho(t,x) = \nabla\cdot\Big(\rho(t,x)\,a(x)a(x)^T\nabla\log\frac{\rho(t,x)}{\pi(x)}\Big).$$
It has a weak formulation that
$$\int_{\mathbb{R}^{n+m}}\Big(\partial_t\log\frac{\rho(t,x)}{\pi(x)},\,\phi(x)\Big)\rho(t,x)\,dx = -\int_{\mathbb{R}^{n+m}}\Big\langle\nabla\phi(x),\,a(x)a(x)^T\nabla\log\frac{\rho(t,x)}{\pi(x)}\Big\rangle\rho(t,x)\,dx,$$
where $\phi\in C^\infty(\mathbb{R}^{n+m})$ is a smooth test function.
Remark 2
(Non-gradient flow drift). In fact, the proposed method is not limited to the gradient flow assumption of the drift vector field b in (7). See the details in [35].

2.2. Main Result

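We now briefly sketch the main results. Denote a sub-elliptic operator $L: C^\infty(\mathbb{R}^{n+m})\to C^\infty(\mathbb{R}^{n+m})$ as follows:
$$Lf = \nabla\cdot(aa^T\nabla f) - \langle a\otimes\nabla a,\nabla f\rangle_{\mathbb{R}^{n+m}} + \langle b,\nabla f\rangle_{\mathbb{R}^{n+m}},$$
where $f\in C^\infty(\mathbb{R}^{n+m})$.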
Definition 1
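(Generalized Gamma z calculus). Consider a smooth matrix function $z:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times m}$. Denote Gamma one bilinear forms $\Gamma_1,\Gamma_1^z: C^\infty(\mathbb{R}^{n+m})\times C^\infty(\mathbb{R}^{n+m})\to C^\infty(\mathbb{R}^{n+m})$ as
$$\Gamma_1(f,g) = \langle a^T\nabla f, a^T\nabla g\rangle_{\mathbb{R}^n}, \quad \Gamma_1^z(f,g) = \langle z^T\nabla f, z^T\nabla g\rangle_{\mathbb{R}^m}.$$
Define Gamma two bilinear forms $\Gamma_2,\Gamma_2^{z,\pi}: C^\infty(\mathbb{R}^{n+m})\times C^\infty(\mathbb{R}^{n+m})\to C^\infty(\mathbb{R}^{n+m})$ as
$$\Gamma_2(f,g) = \frac{1}{2}\Big[L\Gamma_1(f,g) - \Gamma_1(Lf,g) - \Gamma_1(f,Lg)\Big],$$
and
$$\Gamma_2^{z,\pi}(f,g) = \frac{1}{2}\Big[L\Gamma_1^z(f,g) - \Gamma_1^z(Lf,g) - \Gamma_1^z(f,Lg)\Big] \tag{10}$$
$$\qquad\qquad + \operatorname{div}_z^\pi\big(\Gamma_{1,\nabla(aa^T)}(f,g)\big) - \operatorname{div}_a^\pi\big(\Gamma_{1,\nabla(zz^T)}(f,g)\big). \tag{11}$$
Here, $\operatorname{div}_a^\pi$, $\operatorname{div}_z^\pi$ are divergence operators defined by
$$\operatorname{div}_a^\pi(F) = \frac{1}{\pi}\nabla\cdot(\pi aa^T F), \quad \operatorname{div}_z^\pi(F) = \frac{1}{\pi}\nabla\cdot(\pi zz^T F),$$
for any smooth vector field $F\in\mathbb{R}^{n+m}$, and $\Gamma_{1,\nabla(aa^T)}$, $\Gamma_{1,\nabla(zz^T)}$ are vector Gamma one bilinear forms defined as
$$\Gamma_{1,\nabla(aa^T)}(f,g) = \langle\nabla f,\nabla(aa^T)\nabla g\rangle = \Big(\Big\langle\nabla f,\frac{\partial}{\partial x_{\hat k}}(aa^T)\nabla g\Big\rangle\Big)_{\hat k=1}^{n+m}, \quad \Gamma_{1,\nabla(zz^T)}(f,g) = \langle\nabla f,\nabla(zz^T)\nabla g\rangle = \Big(\Big\langle\nabla f,\frac{\partial}{\partial x_{\hat k}}(zz^T)\nabla g\Big\rangle\Big)_{\hat k=1}^{n+m},$$
with
$$\operatorname{div}_z^\pi\big(\Gamma_{1,\nabla(aa^T)}(f,g)\big) = \frac{\nabla\cdot\big(zz^T\pi\langle\nabla f,\nabla(aa^T)\nabla g\rangle\big)}{\pi}, \quad \operatorname{div}_a^\pi\big(\Gamma_{1,\nabla(zz^T)}(f,g)\big) = \frac{\nabla\cdot\big(aa^T\pi\langle\nabla f,\nabla(zz^T)\nabla g\rangle\big)}{\pi}.$$
We next demonstrate that the summation of $\Gamma_2$ and $\Gamma_2^{z,\pi}$ can induce the following decomposition and bilinear forms. They are natural extensions of the classical Bakry–Émery calculus in the Riemannian manifold, i.e., non-degenerate matrix function a.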
Notation 1.
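For matrix function $a:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times n}$, we define matrix Q as
$$Q = \begin{pmatrix} a_{11}^T a_{11}^T & \cdots & a_{1(n+m)}^T a_{1(n+m)}^T \\ \cdots & a_{i\hat i}^T a_{k\hat k}^T & \cdots \\ a_{n1}^T a_{n1}^T & \cdots & a_{n(n+m)}^T a_{n(n+m)}^T \end{pmatrix} \in\mathbb{R}^{n^2\times(n+m)^2},$$
with $Q_{ik\hat i\hat k} = a_{i\hat i}^T a_{k\hat k}^T$. More precisely, for each row (respectively, column) of Q, the row (respectively, column) indices of $Q_{ik\hat i\hat k}$ follow $\sum_{i=1}^n\sum_{k=1}^n$ (respectively, $\sum_{\hat i=1}^{n+m}\sum_{\hat k=1}^{n+m}$). For matrix function $z:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times m}$, we define matrix P as
$$P = \begin{pmatrix} z_{11}^T a_{11}^T & \cdots & z_{1(n+m)}^T a_{1(n+m)}^T \\ \cdots & z_{i\hat i}^T a_{k\hat k}^T & \cdots \\ z_{m\hat 1}^T a_{n\hat 1}^T & \cdots & z_{m(n+m)}^T a_{n(n+m)}^T \end{pmatrix} \in\mathbb{R}^{(nm)\times(n+m)^2},$$
with $P_{ik\hat i\hat k} = z_{k\hat k}^T a_{i\hat i}^T$. For smooth function $f\in C^\infty(\mathbb{R}^{n+m})$, for any $\hat i,\hat k,\hat j = 1,\ldots,n+m$ and $i,k = 1,\ldots,n$ (or $1,\ldots,m$), we define vector $C\in\mathbb{R}^{(n+m)^2\times 1}$ with components
$$C_{\hat i\hat k} = \sum_{i,k=1}^n\sum_{i'=1}^{n+m}\Big[\Big\langle a_{i\hat i}^T a_{ii'}^T\Big(\frac{\partial a_{k\hat k}^T}{\partial x_{i'}}\Big),(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n} - \Big\langle a_{ki'}^T a_{i\hat k}^T\frac{\partial a_{i\hat i}^T}{\partial x_{i'}},(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n}\Big],$$
where we denote $(a^T\nabla)_k f = \sum_{k'=1}^{n+m} a_{kk'}^T\frac{\partial f}{\partial x_{k'}}$. We define vector $D\in\mathbb{R}^{n^2\times 1}$ with components
$$D_{ik} = \sum_{\hat i,\hat k=1}^{n+m} a_{i\hat i}^T\frac{\partial a_{k\hat k}^T}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}}, \quad\text{and}\quad D^TD = \sum_{i,k} D_{ik}D_{ik}.$$
We define vector $F\in\mathbb{R}^{(n+m)^2\times 1}$ with components
$$F_{\hat i\hat k} = \sum_{i=1}^n\sum_{k=1}^m\sum_{i'=1}^{n+m}\Big[\Big\langle a_{i\hat i}^T a_{ii'}^T\Big(\frac{\partial z_{k\hat k}^T}{\partial x_{i'}}\Big),(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m} - \Big\langle z_{ki'}^T a_{i\hat k}^T\frac{\partial a_{i\hat i}^T}{\partial x_{i'}},(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m}\Big].$$
We define vector $E\in\mathbb{R}^{(n\times m)\times 1}$ with components
$$E_{ik} = \sum_{\hat i,\hat k=1}^{n+m} a_{i\hat i}^T\frac{\partial z_{k\hat k}^T}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}}, \quad\text{and}\quad E^TE = \sum_{i,k}E_{ik}E_{ik}.$$
We define vector $G\in\mathbb{R}^{(n+m)^2\times 1}$ with components
$$G_{\hat i\hat j} = \sum_{i=1}^n\sum_{j=1}^m\sum_{j',\hat j,i',\hat i=1}^{n+m}\Big[\Big(z_{j\hat j}^T z_{jj'}^T\frac{\partial a_{i\hat i}^T}{\partial x_{j'}}a_{ii'}^T\frac{\partial f}{\partial x_{i'}} + z_{j\hat j}^T z_{jj'}^T\frac{\partial a_{ii'}^T}{\partial x_{j'}}\frac{\partial f}{\partial x_{i'}}a_{i\hat i}^T\Big) - \Big(a_{i\hat i}^T a_{ii'}^T\frac{\partial z_{j\hat j}^T}{\partial x_{i'}}z_{jj'}^T\frac{\partial f}{\partial x_{j'}} + a_{i\hat i}^T a_{ii'}^T\frac{\partial z_{jj'}^T}{\partial x_{i'}}\frac{\partial f}{\partial x_{j'}}z_{j\hat j}^T\Big)\Big].$$
We define X as the vectorization of the Hessian matrix of function f:
$$X^T = \Big(\frac{\partial^2 f}{\partial x_1\partial x_1},\ \cdots,\ \frac{\partial^2 f}{\partial x_{\hat i}\partial x_{\hat k}},\ \cdots,\ \frac{\partial^2 f}{\partial x_{n+m}\partial x_{n+m}}\Big)\in\mathbb{R}^{1\times(n+m)^2}.$$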
Assumption 1.
Assume that there exist vectors $\Lambda_1,\Lambda_2\in\mathbb{R}^{(n+m)^2\times 1}$ such that
$$(Q^TQ\Lambda_1 + P^TP\Lambda_2)^TX = (F + C + G + Q^TD + P^TE)^TX.$$
Definition 2
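(Hessian matrix). For smooth function $f\in C^\infty(\mathbb{R}^{n+m})$, define a matrix function $\mathfrak{R}:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times(n+m)}$ as
$$\mathfrak{R}(f,f) = -\Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE + (\mathfrak{R}_{ab} + \mathfrak{R}_{zb} + \mathfrak{R}_\pi)(f,f),$$
where we define the following bilinear forms:
$$\mathfrak{R}_{ab}(f,f) = \mathfrak{R}_a(f,f) - \sum_{i=1}^n\sum_{\hat i,\hat k=1}^{n+m}\Big\langle a_{i\hat i}^T\frac{\partial b_{\hat k}}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}} - b_{\hat k}\frac{\partial a_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}},\ (a^T\nabla f)_i\Big\rangle_{\mathbb{R}^n},$$
$$\mathfrak{R}_a(f,f) = \sum_{i,k=1}^n\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{ii'}^T\Big(\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\frac{\partial a_{k\hat k}^T}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}}\Big),(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n} + \sum_{i,k=1}^n\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{ii'}^T a_{i\hat i}^T\Big(\frac{\partial}{\partial x_{i'}}\frac{\partial a_{k\hat k}^T}{\partial x_{\hat i}}\Big)\frac{\partial f}{\partial x_{\hat k}},(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n}$$
$$\qquad - \sum_{i,k=1}^n\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{k\hat k}^T\frac{\partial a_{ii'}^T}{\partial x_{\hat k}}\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\frac{\partial f}{\partial x_{\hat i}},(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n} - \sum_{i,k=1}^n\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{k\hat k}^T a_{ii'}^T\Big(\frac{\partial}{\partial x_{\hat k}}\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\Big)\frac{\partial f}{\partial x_{\hat i}},(a^T\nabla)_k f\Big\rangle_{\mathbb{R}^n},$$
$$\mathfrak{R}_{zb}(f,f) = \mathfrak{R}_z(f,f) - \sum_{i=1}^m\sum_{\hat i,\hat k=1}^{n+m}\Big\langle z_{i\hat i}^T\frac{\partial b_{\hat k}}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}} - b_{\hat k}\frac{\partial z_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}},\ (z^T\nabla f)_i\Big\rangle_{\mathbb{R}^m},$$
$$\mathfrak{R}_z(f,f) = \sum_{i=1}^n\sum_{k=1}^m\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{ii'}^T\Big(\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\frac{\partial z_{k\hat k}^T}{\partial x_{\hat i}}\frac{\partial f}{\partial x_{\hat k}}\Big),(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m} + \sum_{i=1}^n\sum_{k=1}^m\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle a_{ii'}^T a_{i\hat i}^T\Big(\frac{\partial}{\partial x_{i'}}\frac{\partial z_{k\hat k}^T}{\partial x_{\hat i}}\Big)\frac{\partial f}{\partial x_{\hat k}},(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m}$$
$$\qquad - \sum_{i=1}^n\sum_{k=1}^m\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle z_{k\hat k}^T\frac{\partial a_{ii'}^T}{\partial x_{\hat k}}\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\frac{\partial f}{\partial x_{\hat i}},(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m} - \sum_{i=1}^n\sum_{k=1}^m\sum_{i',\hat i,\hat k=1}^{n+m}\Big\langle z_{k\hat k}^T a_{ii'}^T\Big(\frac{\partial}{\partial x_{\hat k}}\frac{\partial a_{i\hat i}^T}{\partial x_{i'}}\Big)\frac{\partial f}{\partial x_{\hat i}},(z^T\nabla)_k f\Big\rangle_{\mathbb{R}^m},$$
and
$$\mathfrak{R}_\pi(f,f) = 2\sum_{k=1}^m\sum_{i=1}^n\sum_{k',\hat k,\hat i,i'=1}^{n+m}\frac{\partial z_{kk'}^T}{\partial x_{k'}}z_{k\hat k}^T\frac{\partial a_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}}a_{ii'}^T\frac{\partial f}{\partial x_{i'}}$$
$$\quad + 2\sum_{k=1}^m\sum_{i=1}^n\sum_{k',\hat k,\hat i,i'=1}^{n+m}\Big[z_{kk'}^T\frac{\partial z_{k\hat k}^T}{\partial x_{k'}}\frac{\partial a_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}}a_{ii'}^T\frac{\partial f}{\partial x_{i'}} + z_{kk'}^T z_{k\hat k}^T\frac{\partial^2 a_{i\hat i}^T}{\partial x_{k'}\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}}a_{ii'}^T\frac{\partial f}{\partial x_{i'}} + z_{kk'}^T z_{k\hat k}^T\frac{\partial a_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}}\frac{\partial a_{ii'}^T}{\partial x_{k'}}\frac{\partial f}{\partial x_{i'}}\Big]$$
$$\quad + 2\sum_{k=1}^m\sum_{i=1}^n\sum_{\hat k,\hat i,i'=1}^{n+m}(z^T\nabla\log\pi)_k\, z_{k\hat k}^T\frac{\partial a_{i\hat i}^T}{\partial x_{\hat k}}\frac{\partial f}{\partial x_{\hat i}}a_{ii'}^T\frac{\partial f}{\partial x_{i'}}$$
$$\quad - 2\sum_{j=1}^m\sum_{l=1}^n\sum_{l',\hat l,\hat j,j'=1}^{n+m}\frac{\partial a_{ll'}^T}{\partial x_{l'}}a_{l\hat l}^T\frac{\partial z_{j\hat j}^T}{\partial x_{\hat l}}\frac{\partial f}{\partial x_{\hat j}}z_{jj'}^T\frac{\partial f}{\partial x_{j'}}$$
$$\quad - 2\sum_{j=1}^m\sum_{l=1}^n\sum_{l',\hat l,\hat j,j'=1}^{n+m}\Big[a_{ll'}^T\frac{\partial a_{l\hat l}^T}{\partial x_{l'}}\frac{\partial z_{j\hat j}^T}{\partial x_{\hat l}}\frac{\partial f}{\partial x_{\hat j}}z_{jj'}^T\frac{\partial f}{\partial x_{j'}} + a_{ll'}^T a_{l\hat l}^T\frac{\partial^2 z_{j\hat j}^T}{\partial x_{l'}\partial x_{\hat l}}\frac{\partial f}{\partial x_{\hat j}}z_{jj'}^T\frac{\partial f}{\partial x_{j'}} + a_{ll'}^T a_{l\hat l}^T\frac{\partial z_{j\hat j}^T}{\partial x_{\hat l}}\frac{\partial f}{\partial x_{\hat j}}\frac{\partial z_{jj'}^T}{\partial x_{l'}}\frac{\partial f}{\partial x_{j'}}\Big]$$
$$\quad - 2\sum_{j=1}^m\sum_{l=1}^n\sum_{\hat l,\hat j,j'=1}^{n+m}(a^T\nabla\log\pi)_l\, a_{l\hat l}^T\frac{\partial z_{j\hat j}^T}{\partial x_{\hat l}}\frac{\partial f}{\partial x_{\hat j}}z_{jj'}^T\frac{\partial f}{\partial x_{j'}}.$$
Here, we also denote $\mathfrak{R} = \mathfrak{R}(x)\in\mathbb{R}^{(n+m)\times(n+m)}$, such that $(\nabla f)^T\mathfrak{R}(x)\nabla f = \mathfrak{R}(f,f)$.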
The main theorem is presented below, and its proof is postponed to Theorem 3 in Section 5.
Theorem 1
(Generalized z Bochner’s formula). If Assumption 1 is satisfied, then the following decomposition holds:
$$\Gamma_2(f,f) + \Gamma_2^{z,\pi}(f,f) = |\mathrm{Hess}_{a,z}f|^2 + \mathfrak{R}(f,f),$$
where we define
$$|\mathrm{Hess}_{a,z}f|^2 = [X+\Lambda_1]^TQ^TQ[X+\Lambda_1] + [X+\Lambda_2]^TP^TP[X+\Lambda_2],$$
$$\mathfrak{R}(f,f) = -\Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE + \mathfrak{R}_{ab}(f,f) + \mathfrak{R}_{zb}(f,f) + \mathfrak{R}_\pi(f,f).$$
We are now ready to prove the convergence property of the degenerate drift–diffusion process (1) and related functional inequalities. Denote the Kullback–Leibler divergence as
$$\mathrm{D}_{\mathrm{KL}}(\rho\|\pi) := \int_{\mathbb{R}^{n+m}}\rho(x)\log\frac{\rho(x)}{\pi(x)}\,dx.$$
Denote the a, z-relative Fisher information functional as
$$\mathrm{I}_{a,z}(\rho) := \int_{\mathbb{R}^{n+m}}\Big(\nabla\log\frac{\rho}{\pi},\,aa^T\nabla\log\frac{\rho}{\pi}\Big)\rho\,dx + \int_{\mathbb{R}^{n+m}}\Big(\nabla\log\frac{\rho}{\pi},\,zz^T\nabla\log\frac{\rho}{\pi}\Big)\rho\,dx.$$
Theorem 2
(Exponential convergence in the $L^1$ distance). Suppose there exists a constant $\kappa > 0$ such that
$$\mathfrak{R}\succeq\kappa(aa^T + zz^T).$$
Let $\rho_0$ be a smooth initial distribution and $\rho = \rho(t,x)$ be the probability density function of (1). Then, $\rho$ converges to the invariant measure $\pi$ in the sense of
$$\mathrm{I}_{a,z}(\rho)\le e^{-2\kappa t}\,\mathrm{I}_{a,z}(\rho_0).$$
In addition,
$$\int_{\mathbb{R}^{n+m}}|\rho(t,x) - \pi(x)|\,dx \le \sqrt{2\,\mathrm{D}_{\mathrm{KL}}(\rho_0\|\pi)}\;e^{-\kappa t}.$$
The proof of Theorem 2 is postponed to Proposition (14).
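An $L^1$ estimate of this shape is typically obtained by combining decay of the relative entropy with Pinsker's inequality, $\int|\rho-\pi|\,dx\le\sqrt{2\,\mathrm{D}_{\mathrm{KL}}(\rho\|\pi)}$. The sketch below only illustrates that inequality on a discrete toy example and evaluates the stated bound; it is not the proof in the paper. The helper names and the two distributions are hypothetical.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Discrete Kullback-Leibler divergence sum_i p_i log(p_i / q_i); needs q_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Discrete analogue of the L^1 distance between two densities."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def l1_bound(kl0: float, kappa: float, t: float) -> float:
    """Right-hand side sqrt(2 * KL(rho_0 || pi)) * exp(-kappa * t) of the L^1 estimate."""
    return math.sqrt(2.0 * kl0) * math.exp(-kappa * t)


# Hypothetical three-point distributions standing in for rho_0 and pi.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
kl = kl_divergence(p, q)
l1 = l1_distance(p, q)
pinsker_ok = l1 <= math.sqrt(2.0 * kl)
print(kl, l1, pinsker_ok, l1_bound(kl, kappa=1.0, t=2.0))
```

The rate `kappa` is an input here; in the theorem it comes from the curvature condition and is not estimated by this snippet.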
Remark 3
(Functional inequalities). Suppose $\mathfrak{R}\succeq\kappa(aa^T + zz^T)$ with $\kappa > 0$, then the z-log-Sobolev inequalities hold:
$$\int_{\mathbb{R}^{n+m}}\rho\log\frac{\rho}{\pi}\,dx \le \frac{1}{2\kappa}\,\mathrm{I}_{a,z}(\rho),$$
for any smooth density function $\rho$.
Remark 4.
In the literature [2], the $\Gamma_2^z$ operator is defined by (10), i.e., $\Gamma_2^z(f,f) = \frac{1}{2}L\Gamma_1^z(f,f) - \Gamma_1^z(Lf,f)$. In fact, this definition is under the assumption of $\Gamma_1(\Gamma_1^z(f,f),f) = \Gamma_1^z(\Gamma_1(f,f),f)$. This assumption holds true only for the special choice of a and z. In the generalized Gamma z calculus, we introduce a new term (11), which removes the assumption $\Gamma_1(\Gamma_1^z(f,f),f) = \Gamma_1^z(\Gamma_1(f,f),f)$. In fact, in the paper, we show that (11) is exactly the new bilinear form behind the assumption in [2] by considering the weak form.
Remark 5.
Following [35] (Assumption 1), we know that, for any $i\in\{1,\ldots,n\}$ and $k\in\{1,\ldots,m\}$, if
$$z_k^T\nabla a_i^T\in\operatorname{Span}\{a_1^T,\ldots,a_n^T\},$$
there exist vectors $\hat\Lambda_1$ and $\hat\Lambda_2$, such that the Hessian operator associated with the generator of the SDE and the metric $(aa^T)$ could be represented as
$$|\mathrm{Hess}f|^2 = [QX+\hat\Lambda_1]^T[QX+\hat\Lambda_1] + [PX+\hat\Lambda_2]^T[PX+\hat\Lambda_2].$$
Furthermore, we have the following relation:
$$[QX+\hat\Lambda_1]^T[QX+\hat\Lambda_1] + [PX+\hat\Lambda_2]^T[PX+\hat\Lambda_2] - \hat\Lambda_1^T\hat\Lambda_1 - \hat\Lambda_2^T\hat\Lambda_2 = [X+\Lambda_1]^TQ^TQ[X+\Lambda_1] + [X+\Lambda_2]^TP^TP[X+\Lambda_2] - \Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2,$$
if there exist $\Lambda_1$ and $\Lambda_2$ as in Assumption 1 such that
$$\hat\Lambda_1^T = \Lambda_1^TQ^T \quad\text{and}\quad \hat\Lambda_2^T = \Lambda_2^TP^T.$$
Assumption 1 is true if Conditions 20 and 21 hold. See the detailed connections in [35] (Remark 11).

3. Examples

In this section, we consider the following degenerate drift–diffusion process:
$$dX_t = -a(X_t)a(X_t)^T\nabla V(X_t)\,dt + \sqrt{2}\,a(X_t)\circ dB_t, \tag{22}$$
where $a:\mathbb{R}^{n+m}\to\mathbb{R}^{(n+m)\times n}$ is a matrix-valued function, for $n,m\in\mathbb{Z}_+$, and $V\in C^\infty(\mathbb{R}^{n+m})$ is a smooth potential function. We denote the invariant measure of SDE (22) as $\pi$. We further assumed that
$$-aa^T\nabla V = a\otimes\nabla a + aa^T\nabla\log\pi.$$
The above assumption holds for the three examples below.
Remark 6.
For $V = 0$, the invariant measure $\pi$ in the above assumption exists if $\{a_1,\ldots,a_n\}$ forms left-invariant structures on unimodular Lie groups. In this case, the sub-Laplacian is the sum of squares of horizontal vector fields and the invariant measure is also symmetric. Stratonovich SDE (22) defines the horizontal Brownian motion on sub-Riemannian structure $(\mathbb{R}^{n+m},\tau,(aa^T)|_\tau)$, and $\pi$ is the volume form associated with the horizontal Laplacian. In general, if the Lie group structure is not unimodular, the drift $b\neq 0$. See the related studies about the diffusion process on general manifolds in [36,37,38,39,40,41,42,43]. See the related studies on log-Sobolev inequality in [44,45].
Remark 7.
It is also worth mentioning that many sub-Riemannian manifolds are non-compact. Hence, there may not exist a positive constant $\kappa$ for both classical $\Gamma_1$ and $\Gamma_1^z$ directions in the non-compact domain. The non-compactness of the domain brings additional difficulties. To prove the associated inequalities in this case, we need to extend the result derived in [46,47]. This is a direction for future work.
Remark 8.
It is known that the Heisenberg group is an example of Lie groups in quantum mechanics [48]. In future work, we shall investigate the general convergence analysis of SDEs in Lie groups and their connections with quantum SDEs.

3.1. Heisenberg Group

In this subsection, we apply our general theory to the well-known example in sub-Riemannian geometry, which is the Heisenberg group. A related LSI for the horizontal Wiener measure was studied in [46]. Recall briefly that the Heisenberg group $\mathbb{H}^1$ admits left-invariant vector fields: $X = \frac{\partial}{\partial x} - \frac{1}{2}y\frac{\partial}{\partial z}$, $Y = \frac{\partial}{\partial y} + \frac{1}{2}x\frac{\partial}{\partial z}$, $Z = \frac{\partial}{\partial z}$. Here, $\{X, Y, Z\}$ forms an orthonormal basis for the tangent bundle of $\mathbb{H}^1$. In this case, $\pi = e^{-V}$. In particular, X and Y generate the horizontal distribution $\tau$. To fit into our general theory from the previous section, we take matrices a and z as below:
$$a^T = \begin{pmatrix} 1 & 0 & -y/2 \\ 0 & 1 & x/2 \end{pmatrix}, \quad z^T = (0, 0, 1). \tag{23}$$
In particular, we have
$$a^T\nabla f = \big((a^T\nabla)_1 f,\ (a^T\nabla)_2 f\big)^T, \quad (a^T\nabla)_1 f = \Big(\frac{\partial f}{\partial x} - \frac{y}{2}\frac{\partial f}{\partial z}\Big), \quad (a^T\nabla)_2 f = \Big(\frac{\partial f}{\partial y} + \frac{x}{2}\frac{\partial f}{\partial z}\Big).$$
We have the following proposition for the Heisenberg group following Theorem 1.
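As a quick sanity check of the bracket-generating condition for these fields, one can evaluate the Lie bracket $[X,Y]$ numerically and confirm that X, Y, and $[X,Y]$ span $\mathbb{R}^3$ at a sample point. This is only an illustrative sketch with finite differences; the helper names and the sample point are hypothetical.

```python
from typing import Callable, List

Vector = List[float]
Field = Callable[[Vector], Vector]


def lie_bracket(X: Field, Y: Field, p: Vector, h: float = 1e-5) -> Vector:
    """[X, Y]^l = sum_k (X^k dY^l/dx_k - Y^k dX^l/dx_k), via central differences."""
    dim = len(p)
    Xp, Yp = X(p), Y(p)
    out = [0.0] * dim
    for k in range(dim):
        pp = list(p)
        pm = list(p)
        pp[k] += h
        pm[k] -= h
        dY = [(u - v) / (2.0 * h) for u, v in zip(Y(pp), Y(pm))]
        dX = [(u - v) / (2.0 * h) for u, v in zip(X(pp), X(pm))]
        for l in range(dim):
            out[l] += Xp[k] * dY[l] - Yp[k] * dX[l]
    return out


def det3(r0: Vector, r1: Vector, r2: Vector) -> float:
    """Determinant of the 3x3 matrix with rows r0, r1, r2."""
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))


def X_field(p: Vector) -> Vector:
    return [1.0, 0.0, -p[1] / 2.0]  # X = d/dx - (y/2) d/dz


def Y_field(p: Vector) -> Vector:
    return [0.0, 1.0, p[0] / 2.0]   # Y = d/dy + (x/2) d/dz


point = [0.7, -0.4, 1.3]  # arbitrary sample point
bracket = lie_bracket(X_field, Y_field, point)
determinant = det3(X_field(point), Y_field(point), bracket)
print(bracket, determinant)
```

The bracket comes out as $Z = \partial/\partial z$ and the determinant is 1, so one bracket suffices to reach full rank at this point.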
Proposition 1.
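For any smooth function $f\in C^\infty(\mathbb{H}^1)$, one has
$$\Gamma_2(f,f) + \Gamma_2^{z,\pi}(f,f) = |\mathrm{Hess}_{a,z}f|^2 + \mathfrak{R}(f,f),$$
where
$$\Lambda_1^T = (0,0,0,0,0,0,0,0,0); \quad \Lambda_2^T = \big(0,0,0,0,0,0,(a^T\nabla)_2 f,\,-(a^T\nabla)_1 f,\,0\big);$$
$$\mathfrak{R}_{ab}(f,f) - \Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE = -\Gamma_1(f,f) + \frac{1}{2}\Gamma_1^z(f,f) - (a^T\nabla)_1V\,\partial_z f\,(a^T\nabla)_2 f + (a^T\nabla)_2V\,\partial_z f\,(a^T\nabla)_1 f$$
$$\quad + \Big[\frac{\partial^2V}{\partial x\partial x} + \frac{y^2}{4}\frac{\partial^2V}{\partial z\partial z} - y\frac{\partial^2V}{\partial x\partial z}\Big]|(a^T\nabla)_1 f|^2 + \Big[\frac{\partial^2V}{\partial y\partial y} + \frac{x^2}{4}\frac{\partial^2V}{\partial z\partial z} + x\frac{\partial^2V}{\partial y\partial z}\Big]|(a^T\nabla)_2 f|^2$$
$$\quad + 2\Big[\frac{\partial^2V}{\partial x\partial y} + \frac{x}{2}\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial y\partial z} - \frac{xy}{4}\frac{\partial^2V}{\partial z\partial z}\Big](a^T\nabla)_1 f\,(a^T\nabla)_2 f;$$
$$\mathfrak{R}_{zb}(f,f) = \Big[\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial z\partial z}\Big](a^T\nabla)_1 f\,(z^T\nabla)_1 f + \Big[\frac{\partial^2V}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2V}{\partial z\partial z}\Big](z^T\nabla)_1 f\,(a^T\nabla)_2 f; \quad \mathfrak{R}_\pi(f,f) = 0.$$
The proof of Proposition 1 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 1–3. The following convergence result follows directly from Theorem 2.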
Proposition 2.
If there exists $\kappa > 0$ as shown in Theorem 2, the exponential dissipation result in the $L^1$ distance holds:
$$\int|\rho(t,x) - \pi(x)|\,dx = O(e^{-\kappa t}).$$
We next formulate the curvature tensor into a matrix format. Denote
$$U = \big((a^T\nabla)_1 f,\ (a^T\nabla)_2 f,\ (z^T\nabla)_1 f\big)^T_{3\times 1}, \tag{24}$$
and denote $I_{3\times 3}$ as the identity matrix. With a little abuse of notation, there exists a symmetric matrix $\mathfrak{R}$ such that we can represent the tensor as below.
$$\mathfrak{R}(f,f) = U^T\cdot\mathfrak{R}\cdot U, \tag{25}$$
which implies that
$$\mathfrak{R}\succeq\kappa(aa^T + zz^T) \iff \mathfrak{R}(f,f)\ge\kappa\big(\Gamma_1(f,f) + \Gamma_1^z(f,f)\big).$$
In other words, we need to estimate the smallest eigenvalue of matrix $\mathfrak{R}$. We next present the formulation of matrix $\mathfrak{R}$ for the Heisenberg group as follows.
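To experiment with the process behind this dissipation result, one can simulate SDE (22) on the Heisenberg group. The following Euler–Maruyama sketch is an illustration under stated assumptions, not the authors' method: the quadratic potential $V = (x^2+y^2+z^2)/2$ is a hypothetical choice, and the scheme is applied to the Itô form, which has the same drift as the Stratonovich form here because $\sum_i\nabla_{a_i}a_i = 0$ for the Heisenberg matrix a.

```python
import math
import random
from typing import List

Vector = List[float]


def grad_V(p: Vector) -> Vector:
    # Hypothetical potential V(x, y, z) = (x^2 + y^2 + z^2) / 2, so grad V = (x, y, z).
    return [p[0], p[1], p[2]]


def drift(p: Vector) -> Vector:
    """Drift -a a^T grad V for a^T = [[1, 0, -y/2], [0, 1, x/2]]."""
    x, y, _ = p
    gx, gy, gz = grad_V(p)
    g1 = gx - 0.5 * y * gz  # (a^T grad V)_1
    g2 = gy + 0.5 * x * gz  # (a^T grad V)_2
    # a (g1, g2)^T = g1 * a_1 + g2 * a_2 with a_1 = (1, 0, -y/2), a_2 = (0, 1, x/2)
    return [-g1, -g2, -(-0.5 * y * g1 + 0.5 * x * g2)]


def euler_maruyama(p0: Vector, dt: float, n_steps: int, seed: int = 0) -> Vector:
    """Simulate one path with two-dimensional Brownian increments scaled by sqrt(2 dt)."""
    rng = random.Random(seed)
    p = list(p0)
    s = math.sqrt(2.0 * dt)
    for _ in range(n_steps):
        x, y, _ = p
        w1, w2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        noise = [s * w1, s * w2, s * (-0.5 * y * w1 + 0.5 * x * w2)]
        d = drift(p)
        p = [pi + dt * di + ni for pi, di, ni in zip(p, d, noise)]
    return p


final_state = euler_maruyama([1.0, -1.0, 0.5], dt=1e-3, n_steps=1000)
print(final_state)
```

Running many independent paths and forming a histogram would give an empirical view of the density, though the step size and sample count needed for a reliable rate estimate are not addressed by this sketch.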
Corollary 1.
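The matrix $\mathfrak{R}$ associated with the Heisenberg group has the following form:
$$\mathfrak{R}_{11} = \frac{\partial^2V}{\partial x\partial x} + \frac{y^2}{4}\frac{\partial^2V}{\partial z\partial z} - y\frac{\partial^2V}{\partial x\partial z} - 1; \quad \mathfrak{R}_{22} = \frac{\partial^2V}{\partial y\partial y} + \frac{x^2}{4}\frac{\partial^2V}{\partial z\partial z} + x\frac{\partial^2V}{\partial y\partial z} - 1; \quad \mathfrak{R}_{33} = \frac{1}{2};$$
$$\mathfrak{R}_{12} = \mathfrak{R}_{21} = \frac{\partial^2V}{\partial x\partial y} + \frac{x}{2}\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial y\partial z} - \frac{xy}{4}\frac{\partial^2V}{\partial z\partial z};$$
$$\mathfrak{R}_{13} = \mathfrak{R}_{31} = \frac{1}{2}(a^T\nabla)_2V + \frac{1}{2}\Big[\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial z\partial z}\Big]; \quad \mathfrak{R}_{23} = \mathfrak{R}_{32} = -\frac{1}{2}(a^T\nabla)_1V + \frac{1}{2}\Big[\frac{\partial^2V}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2V}{\partial z\partial z}\Big].$$
Proof. 
The explicit form of matrix $\mathfrak{R}$ follows from the definition in Theorem 1 and the notation in (24) and (25). We have
$$\mathfrak{R}(f,f) = -\Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE + \mathfrak{R}_{ab}(f,f) + \mathfrak{R}_{zb}(f,f) + \mathfrak{R}_\pi(f,f) = U^T\cdot\mathfrak{R}\cdot U.$$
Plugging the explicit representation from Proposition 1 into the above formula and applying matrix symmetrization for the off-diagonal terms, we obtain the desired matrix $\mathfrak{R}$. □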
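Since the convergence rate is tied to the smallest eigenvalue of this 3 × 3 matrix, a pointwise numerical check can be useful. The sketch below assembles the matrix of Corollary 1 at a single point and computes its smallest eigenvalue with the standard trigonometric formula for symmetric 3 × 3 matrices. The function names, the quadratic potential $V = \frac{c}{2}(x^2+y^2+z^2)$, and the sign conventions (entries of the form "second derivatives of V minus 1" on the first two diagonal entries, and a minus sign in front of $(a^T\nabla)_1V$) are assumptions of this illustration.

```python
import math
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def heisenberg_R(x: float, y: float, dV: Sequence[float], d2V: Dict[str, float]) -> Matrix:
    """Matrix of Corollary 1 at (x, y); dV = (Vx, Vy, Vz), d2V keys: xx, yy, zz, xy, xz, yz."""
    Vx, Vy, Vz = dV
    a1V = Vx - 0.5 * y * Vz
    a2V = Vy + 0.5 * x * Vz
    r11 = d2V["xx"] + 0.25 * y * y * d2V["zz"] - y * d2V["xz"] - 1.0
    r22 = d2V["yy"] + 0.25 * x * x * d2V["zz"] + x * d2V["yz"] - 1.0
    r33 = 0.5
    r12 = d2V["xy"] + 0.5 * x * d2V["xz"] - 0.5 * y * d2V["yz"] - 0.25 * x * y * d2V["zz"]
    r13 = 0.5 * a2V + 0.5 * (d2V["xz"] - 0.5 * y * d2V["zz"])
    r23 = -0.5 * a1V + 0.5 * (d2V["yz"] + 0.5 * x * d2V["zz"])
    return [[r11, r12, r13], [r12, r22, r23], [r13, r23, r33]]


def det3(A: Matrix) -> float:
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def min_eigenvalue_sym3(A: Matrix) -> float:
    """Smallest eigenvalue of a symmetric 3x3 matrix (trigonometric closed form)."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return min(A[0][0], A[1][1], A[2][2])
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(A[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    r = max(-1.0, min(1.0, det3(B) / 2.0))
    phi = math.acos(r) / 3.0
    return q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)


def quadratic_example(x: float, y: float, z: float, c: float) -> float:
    # Hypothetical potential V = c/2 (x^2 + y^2 + z^2): grad V = c (x, y, z), Hess V = c I.
    d2V = {"xx": c, "yy": c, "zz": c, "xy": 0.0, "xz": 0.0, "yz": 0.0}
    return min_eigenvalue_sym3(heisenberg_R(x, y, (c * x, c * y, c * z), d2V))


lam_origin = quadratic_example(0.0, 0.0, 0.0, c=3.0)
lam_shifted = quadratic_example(1.0, 0.0, 0.0, c=3.0)
print(lam_origin, lam_shifted)
```

This evaluates the eigenvalue at individual points only. A global rate would require a positive lower bound over the whole domain, which may fail on non-compact domains as noted in Remark 7.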
Next, we present the three key lemmas.
Lemma 1.
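For the Heisenberg group, we have
$$Q = \begin{pmatrix} 1 & 0 & -\frac{y}{2} & 0 & 0 & 0 & -\frac{y}{2} & 0 & \frac{y^2}{4} \\ 0 & 1 & \frac{x}{2} & 0 & 0 & 0 & 0 & -\frac{y}{2} & -\frac{xy}{4} \\ 0 & 0 & 0 & 1 & 0 & -\frac{y}{2} & \frac{x}{2} & 0 & -\frac{xy}{4} \\ 0 & 0 & 0 & 0 & 1 & \frac{x}{2} & 0 & \frac{x}{2} & \frac{x^2}{4} \end{pmatrix}; \quad P = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -\frac{y}{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \frac{x}{2} \end{pmatrix};$$
$$D^T = \Big(0,\ \frac{1}{2}\partial_z f,\ -\frac{1}{2}\partial_z f,\ 0\Big); \quad E^T = (0, 0); \quad F^T = G^T = (0,0,0,0,0,0,0,0,0);$$
$$C^T = \Big(0,\ 0,\ \frac{x}{4}\partial_z f + \frac{1}{2}\partial_y f,\ 0,\ 0,\ \frac{y}{4}\partial_z f - \frac{1}{2}\partial_x f,\ \frac{x}{4}\partial_z f + \frac{1}{2}\partial_y f,\ \frac{y}{4}\partial_z f - \frac{1}{2}\partial_x f,\ -\frac{y}{2}\partial_y f - \frac{x}{2}\partial_x f\Big).$$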
Proof. 
The proof of this lemma follows from routine computations. Plugging matrices a and z from (23) into Notation 1, we obtain the desired vectors and matrices. We skip the detailed computation here. □
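The routine computation can be cross-checked numerically. The sketch below builds Q and P as Kronecker products of the rows of $a^T$ and $z^T$, takes D and C from Lemma 1 and the vector $\Lambda_2 = (0,\ldots,0,(a^T\nabla)_2 f, -(a^T\nabla)_1 f, 0)$, and tests the identity of Assumption 1 with $\Lambda_1 = 0$, $E = 0$, $F = G = 0$ for a random symmetric Hessian. The function names, the sample point, and the random test data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def kron(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Kronecker product of two row vectors, flattened with the first index outermost."""
    return [ui * vj for ui in u for vj in v]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def assumption_sides(x: float, y: float, grad_f: Sequence[float],
                     hess_f: Sequence[Sequence[float]]) -> Tuple[float, float, List[List[float]]]:
    fx, fy, fz = grad_f
    a_rows = [[1.0, 0.0, -0.5 * y], [0.0, 1.0, 0.5 * x]]  # rows of a^T
    z_row = [0.0, 0.0, 1.0]                                # single row of z^T
    Q = [kron(a_rows[i], a_rows[k]) for i in range(2) for k in range(2)]
    P = [kron(z_row, a_rows[k]) for k in range(2)]
    a1f = fx - 0.5 * y * fz
    a2f = fy + 0.5 * x * fz
    D = [0.0, 0.5 * fz, -0.5 * fz, 0.0]
    c13 = 0.25 * x * fz + 0.5 * fy
    c23 = 0.25 * y * fz - 0.5 * fx
    C = [0.0, 0.0, c13, 0.0, 0.0, c23, c13, c23, -0.5 * y * fy - 0.5 * x * fx]
    lam2 = [0.0] * 6 + [a2f, -a1f, 0.0]
    X = [hess_f[i][k] for i in range(3) for k in range(3)]  # vectorized Hessian
    QX = [dot(row, X) for row in Q]
    PX = [dot(row, X) for row in P]
    P_lam2 = [dot(row, lam2) for row in P]
    lhs = dot(P_lam2, PX)           # (P^T P Lambda_2)^T X, with Lambda_1 = 0
    rhs = dot(C, X) + dot(D, QX)    # (C + Q^T D)^T X, with E = 0 and F = G = 0
    return lhs, rhs, Q


rng = random.Random(1)
x0, y0 = 0.8, -0.6
grad = [rng.uniform(-1.0, 1.0) for _ in range(3)]
H = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    for k in range(i, 3):
        H[i][k] = H[k][i] = rng.uniform(-1.0, 1.0)  # symmetric test Hessian
lhs, rhs, Q = assumption_sides(x0, y0, grad, H)
print(lhs, rhs)
```

The two sides agree up to rounding, and the first row of `Q` reproduces the first row of the matrix Q displayed in Lemma 1.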
Lemma 2.
On $\mathbb{H}^1$, vectors F and G are zero vectors, and we have
$$[QX+D]^T[QX+D] + [PX+E]^T[PX+E] + 2C^TX = |\mathrm{Hess}_{a,z}f|^2 - \Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE.$$
In particular, we have
$$|\mathrm{Hess}_{a,z}f|^2 = [X+\Lambda_1]^TQ^TQ[X+\Lambda_1] + [X+\Lambda_2]^TP^TP[X+\Lambda_2]; \quad \Lambda_1^T = (0,0,0,0,0,0,0,0,0); \quad \Lambda_2^T = \big(0,0,0,0,0,0,(a^T\nabla)_2 f,\,-(a^T\nabla)_1 f,\,0\big);$$
$$-\Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE = -\Gamma_1(f,f) + \frac{1}{2}\Gamma_1^z(f,f).$$
Lemma 3.
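By routine computations, we obtain
$$\mathfrak{R}_{ab}(f,f) = -(a^T\nabla)_1V\,\partial_z f\,(a^T\nabla)_2 f + (a^T\nabla)_2V\,\partial_z f\,(a^T\nabla)_1 f + \Big[\frac{\partial^2V}{\partial x\partial x} + \frac{y^2}{4}\frac{\partial^2V}{\partial z\partial z} - y\frac{\partial^2V}{\partial x\partial z}\Big]|(a^T\nabla)_1 f|^2$$
$$\quad + \Big[\frac{\partial^2V}{\partial y\partial y} + \frac{x^2}{4}\frac{\partial^2V}{\partial z\partial z} + x\frac{\partial^2V}{\partial y\partial z}\Big]|(a^T\nabla)_2 f|^2 + 2\Big[\frac{\partial^2V}{\partial x\partial y} + \frac{x}{2}\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial y\partial z} - \frac{xy}{4}\frac{\partial^2V}{\partial z\partial z}\Big](a^T\nabla)_1 f\,(a^T\nabla)_2 f;$$
$$\mathfrak{R}_{zb}(f,f) = \Big[\frac{\partial^2V}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2V}{\partial z\partial z}\Big](a^T\nabla)_1 f\,(z^T\nabla)_1 f + \Big[\frac{\partial^2V}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2V}{\partial z\partial z}\Big](z^T\nabla)_1 f\,(a^T\nabla)_2 f; \quad \mathfrak{R}_\pi(f,f) = 0.$$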
Proof of Lemma 2.
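We first have
$$2C^TX = \sum_{\hat i,\hat k=1}^3 2C^T_{\hat i\hat k}X_{\hat i\hat k} = 2\Big[\frac{\partial^2 f}{\partial x\partial z}\frac{(a^T\nabla)_2 f}{2} - \frac{\partial^2 f}{\partial y\partial z}\frac{(a^T\nabla)_1 f}{2} + \frac{\partial^2 f}{\partial z\partial x}\frac{(a^T\nabla)_2 f}{2} - \frac{\partial^2 f}{\partial z\partial y}\frac{(a^T\nabla)_1 f}{2} - \frac{\partial^2 f}{\partial z\partial z}\Big(\frac{y}{2}\partial_y f + \frac{x}{2}\partial_x f\Big)\Big]$$
$$= 2\frac{\partial^2 f}{\partial x\partial z}(a^T\nabla)_2 f - 2\frac{\partial^2 f}{\partial y\partial z}(a^T\nabla)_1 f - 2\frac{\partial^2 f}{\partial z\partial z}\Big(\frac{y}{2}\partial_y f + \frac{x}{2}\partial_x f\Big) = 2(a^T\nabla)_2 f\Big[\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial z\partial z}\Big] - 2(a^T\nabla)_1 f\Big[\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial z\partial z}\Big].$$
By direct computations, we have
$$[QX+D]^T[QX+D] + [PX+E]^T[PX+E] + 2C^TX = \Big[\frac{\partial^2 f}{\partial x\partial x} - y\frac{\partial^2 f}{\partial x\partial z} + \frac{y^2}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + \Big[\frac{\partial^2 f}{\partial x\partial y} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z} + \frac{1}{2}\partial_z f\Big]^2$$
$$\quad + \Big[\frac{\partial^2 f}{\partial x\partial y} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z} - \frac{1}{2}\partial_z f\Big]^2 + \Big[\frac{\partial^2 f}{\partial y\partial y} + x\frac{\partial^2 f}{\partial y\partial z} + \frac{x^2}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + \Big[\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + \Big[\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial z\partial z}\Big]^2$$
$$\quad + 2(a^T\nabla)_2 f\Big[\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial z\partial z}\Big] - 2(a^T\nabla)_1 f\Big[\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial z\partial z}\Big].$$
Completing the squares for the cross terms of the type “$\nabla f\,\nabla^2 f$” and using the reformulation below:
$$\Big[\frac{\partial^2 f}{\partial x\partial y} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z} + \frac{1}{2}\partial_z f\Big]^2 + \Big[\frac{\partial^2 f}{\partial x\partial y} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z} - \frac{1}{2}\partial_z f\Big]^2 = 2\Big[\frac{\partial^2 f}{\partial x\partial y} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + \frac{1}{2}|\partial_z f|^2,$$
we have
$$[QX+D]^T[QX+D] + [PX+E]^T[PX+E] + 2C^TX = \Big[\frac{\partial^2 f}{\partial x\partial x} - y\frac{\partial^2 f}{\partial x\partial z} + \frac{y^2}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + 2\Big[\frac{\partial^2 f}{\partial x\partial y} - \frac{y}{2}\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial x\partial z} - \frac{xy}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2 + \Big[\frac{\partial^2 f}{\partial y\partial y} + x\frac{\partial^2 f}{\partial y\partial z} + \frac{x^2}{4}\frac{\partial^2 f}{\partial z\partial z}\Big]^2$$
$$\quad + \Big[\frac{\partial^2 f}{\partial x\partial z} - \frac{y}{2}\frac{\partial^2 f}{\partial z\partial z} + (a^T\nabla)_2 f\Big]^2 + \Big[\frac{\partial^2 f}{\partial y\partial z} + \frac{x}{2}\frac{\partial^2 f}{\partial z\partial z} - (a^T\nabla)_1 f\Big]^2 - |(a^T\nabla)_2 f|^2 - |(a^T\nabla)_1 f|^2 + \frac{1}{2}|(z^T\nabla)_1 f|^2.$$
The sum-of-squares terms give $|\mathrm{Hess}_{a,z}f|^2$, and hence $\Lambda_1$ and $\Lambda_2$. The remainders generate $-\Lambda_1^TQ^TQ\Lambda_1 - \Lambda_2^TP^TP\Lambda_2 + D^TD + E^TE$, which equals $-\Gamma_1(f,f) + \frac{1}{2}\Gamma_1^z(f,f)$. □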
We are now left to compute the tensors.
Proof of Lemma 3.
By direct computation, we have
R a ( f , f ) = i , k = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R 2 + i , k = 2 2 i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a k k ^ T x i ^ ) ( f x k ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( a T ) k f R 2 , = I 1 + I 2 + I 3 + I 4 .
For the four terms above, we have
I 1 = i = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a 1 k ^ T x i ^ f x k ^ ) ( a T ) 1 f + i = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a 2 k ^ T x i ^ f x k ^ ) ( a T ) 2 f = 0
I 2 = i = 2 n i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a 1 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 1 f + i = 2 n i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a 2 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 2 f = 0 I 3 = i = 1 2 i , i ^ , k ^ = 1 3 a 1 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 1 f i = 1 2 i , i ^ , k ^ = 1 3 a 2 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 2 f = 0 I 4 = i = 1 2 i , i ^ , k ^ = 1 3 a 1 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 1 f i = 1 2 i , i ^ , k ^ = 1 3 a 2 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 2 f = 0 .
Similar computation applies to the tensor terms R π and R z b . Since z is a constant matrix, we obtain
R z b ( f , f ) = i ^ , k ^ = 1 3 ( z 1 i ^ T b k ^ x i ^ f x k ^ b k ^ z 1 i ^ T x k ^ f x i ^ ) , ( z T f ) 1 R , R π = 0 .
We now compute the tensor terms involving the drift b. For the drift term in tensor R a b , taking b = a a T V , which means b = ( a k ^ k a k k T V x k ) k ^ = 1 , 2 , 3 in local coordinates,
R b a = i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( a T ) i f + i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( a T ) i f + i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( a T ) i f i , k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T a i i ^ T x k ^ V x k f x i ^ ( a T ) i f = J 1 + J 2 + J 3 + J 4 .
We now derive the explicit formulas for the above four terms.
J 1 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 2 f + i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 2 f = 1 2 ( a T ) 1 V z f ( a T ) 2 f + 1 2 ( a T ) 2 V z f ( a T ) 1 f ;
J 2 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 2 f i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 2 f = 1 2 V z ( a T ) 1 f ( a T ) 2 f + 1 2 V z ( a T ) 1 f ( a T ) 2 f = 0 ; J 3 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k ^ T a 1 k T 2 V x i ^ x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k ^ T a 1 k T 2 V x i ^ x k f x k ^ ( a T ) 2 f i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k ^ T a 2 k T 2 V x i ^ x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k ^ T a 2 k T 2 V x i ^ x k f x k ^ ( a T ) 2 f = i ^ , k = 1 3 a 1 i ^ T a 1 k T 2 V x i ^ x k | ( a T ) 1 f | 2 + a 2 i ^ T a 1 k T 2 V x i ^ x k ( a T ) 1 f ( a T ) 2 f i ^ , k = 1 3 a 1 i ^ T a 2 k T 2 V x i ^ x k ( a T ) 2 f ( a T ) 1 f + a 2 i ^ T a 2 k T 2 V x i ^ x k | ( a T ) 2 f | 2 = 2 V x x + y 2 4 2 V z z y 2 V x z | ( a T ) 1 f | 2 + 2 V y y + x 2 4 2 V z z + x 2 V y z | ( a T ) 2 f | 2 + 2 2 V x y + x 2 2 V x z y 2 2 V y z x y 4 2 V z z ( a T ) 1 f ( a T ) 2 f ; J 4 = i ^ , k ^ , k = 1 3 a 1 k ^ T a 1 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 1 k ^ T a 1 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f i ^ , k ^ , k = 1 3 a 2 k ^ T a 2 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 2 k ^ T a 2 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f = 1 2 ( a T ) 1 V z f ( a T ) 2 f + 1 2 ( a T ) 2 V z f ( a T ) 1 f .
Summing up the above formulas, we obtain $\mathfrak{R}_{ab}$. We now compute the drift term of the tensor $\mathfrak{R}_{zb}$. Taking $b = -a a^{T}\nabla V$, we have
R b z ( f , f ) = i ^ , k ^ = 1 3 z 1 i ^ T b k ^ x i ^ f x k ^ ( z T f ) 1 b k ^ z i i ^ T x k ^ f x i ^ ( z T f ) 1 = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f + k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f + k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = J 1 z + J 2 z + J 3 z + J 4 z .
We further compute as below by taking advantage of the constant matrix z:
J 1 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f = 0 ; J 2 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f = 0 ; J 4 z = k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = 0 J 3 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f = 2 V x z y 2 2 V z z ( a T ) 1 f ( z T ) 1 f + 2 V y z + x 2 2 V z z ( z T ) 1 f ( a T ) 2 f .
The proof is thus completed. □

3.2. Displacement Group

In this section, we derive the generalized curvature dimension bound for the displacement group, which is one example of three-dimensional solvable Lie groups. We adapted the general setting from [49] below. Denote g as the three-dimensional solvable Lie algebra, and denote H g as the horizontal subspace satisfying Hörmander’s condition, then for a given inner product · , · on H, there exists a canonical basis { X , Y , Z } for ( g , H , · , · ) , such that { X , Y } forms an orthonormal basis for H and satisfies the following Lie-bracket-generating condition for parameters α and β 0 :
$$[X, Y] = Z, \qquad [X, Z] = \alpha Y + \beta Z, \qquad [Y, Z] = 0.$$
When the parameters α = 0 and β 0 , the Lie algebra g has a faithful representation. In particular, it was shown in [49] that the elements of g , in local coordinates ( θ , x , y ) , correspond to the following left-invariant differential operators:
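$$X = \frac{\partial}{\partial \theta}, \qquad Y = e^{\beta\theta}\frac{\partial}{\partial x} + \frac{\partial}{\partial y}, \qquad R = -\beta\frac{\partial}{\partial y},$$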
with the following relation:
$$[X, Y] = \beta Y + R, \qquad [X, R] = 0, \qquad [Y, R] = 0.$$
In terms of local coordinates ( θ , x , y ) , we have
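$$X = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 \\ e^{\beta\theta} \\ 1 \end{pmatrix}, \qquad R = \begin{pmatrix} 0 \\ 0 \\ -\beta \end{pmatrix}.$$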
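The bracket relations above can be checked numerically. The following is a minimal sketch, assuming the coordinate expressions $X = (1,0,0)^{T}$, $Y = (0, e^{\beta\theta}, 1)^{T}$, $R = (0,0,-\beta)^{T}$ in $(\theta, x, y)$; the value of `BETA`, the test point `p`, and the helper names are hypothetical choices made only for illustration.

```python
import math

BETA = 0.7  # hypothetical value of the parameter beta (illustration only)


def X(p):
    return (1.0, 0.0, 0.0)


def Y(p):
    theta, _x, _y = p
    return (0.0, math.exp(BETA * theta), 1.0)


def R(p):
    return (0.0, 0.0, -BETA)


def directional_derivative(field, p, v, h=1e-5):
    """Central-difference derivative of the vector field `field` at p along v."""
    plus = field(tuple(pi + h * vi for pi, vi in zip(p, v)))
    minus = field(tuple(pi - h * vi for pi, vi in zip(p, v)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


def lie_bracket(U, W, p):
    """[U, W](p) = (U . grad) W - (W . grad) U, computed component-wise."""
    dW = directional_derivative(W, p, U(p))
    dU = directional_derivative(U, p, W(p))
    return tuple(a - b for a, b in zip(dW, dU))


p = (0.3, -1.2, 0.5)  # arbitrary point (theta, x, y)
bracket_XY = lie_bracket(X, Y, p)
# Expected value of [X, Y] according to the relation [X, Y] = beta * Y + R.
expected_XY = tuple(BETA * yi + ri for yi, ri in zip(Y(p), R(p)))
print(bracket_XY, expected_XY)
print(lie_bracket(X, R, p), lie_bracket(Y, R, p))
```

Finite differences are used here only to keep the sketch dependency-free; a symbolic computation would give the same relations exactly.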
The corresponding Lie group of this special Lie algebra g is called the displacement group, denoted as G . We chose { X , Y } as the horizontal orthonormal basis for subalgebra H. To fit into the general framework from the previous section, we take
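$$a = (X, Y) = \begin{pmatrix} 1 & 0 \\ 0 & e^{\beta\theta} \\ 0 & 1 \end{pmatrix}, \qquad a^{T} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & e^{\beta\theta} & 1 \end{pmatrix}, \qquad z^{T} = \begin{pmatrix} 0 & 0 & -g(\theta, x, y) \end{pmatrix},$$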
with $g(\theta, x, y) \neq 0$. Our focus here is to derive the curvature tensor in terms of $\pi = \frac{1}{Z} e^{-V}$. We then use $(a a^{T})|_{H}$ as the horizontal metric on $H$. Thus, the sub-Riemannian structure is given by $(\mathbb{G}, H, (a a^{T})|_{H})$. By direct computation, one can show that, for a general smooth function $f$, $\Gamma_1(f, \Gamma_1^{z}(f, f)) \neq \Gamma_1^{z}(f, \Gamma_1(f, f))$. Hence, the classical Gamma $z$ calculus proposed in [2] cannot be extended to this case to derive the zLSI. Thus, we need to compute the vector $G$ and the tensor term $\mathfrak{R}_{\pi}$. Following Theorem 1, we have the following $z$-Bochner's formula for $\mathbb{G}$.
Proposition 3.
For any smooth function $f \in C^{\infty}(\mathbb{G})$, one has
$$\Gamma_2(f, f) + \Gamma_2^{z,\pi}(f, f) = \|\operatorname{Hess}_{a,z} f\|^{2} + \mathfrak{R}(f, f),$$
where
Λ 1 T = ( 0 , β x f , β y f 2 , β x f , 0 , 0 , β y f 2 , 0 , β θ f ) ; Λ 2 T = ( 0 , 0 , 0 , 0 , 0 , 0 , λ 6 , 0 , λ 9 ) ; λ 6 = θ g y f g β ( a T ) 2 f g 2 θ g y f g ; λ 9 = ( a T ) 2 g y f g + β θ f g 2 ( a T ) 2 g y f g ;
and
R a b ( f , f ) Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E ) = Γ 1 ( log g , log g ) Γ 1 z ( f , f ) β 2 ( 1 + 1 g 2 ) Γ 1 ( f , f ) + β 2 2 g 2 Γ 1 z ( f , f ) + β 2 e β θ f x ( a T ) 2 f + β e β θ ( a T ) 2 V f x ( a T ) 1 f + β e β θ V x ( a T ) 2 f ( a T ) 1 f + 2 V θ θ | ( a T ) 1 f | 2 + 2 ( e β θ 2 V θ x + 2 V θ y ) ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 2 i ^ T a 2 k T 2 V x i ^ x k ) | ( a T ) 2 f | 2 β e β θ ( a T ) 1 V f x ( a T ) 2 f ; R z b ( f , f ) = i = 1 2 i , i ^ = 1 3 a i i T a i i ^ T 2 z 13 T x i x i ^ y f ( z T ) 1 f k = 1 2 ( a T ) k z 13 T ( a T ) k V y f ( z T ) 1 f g 2 V θ y ( a T ) 1 f ( z T ) 1 f g ( e β θ 2 V x y + 2 V y y ) ( a T ) 2 f ( z T ) 1 f ; R π ( f , f ) = 2 l = 1 2 l , l ^ = 1 3 a l l T a l l ^ T 2 z 13 T x l x l ^ y f ( z T ) 1 f 2 Γ 1 ( log π , log g ) | ( z T ) 1 f | 2 2 Γ 1 ( log g , log g ) | ( z T ) 1 f | 2 .
In particular, we have
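$$\sum_{\hat{i}, k' = 1}^{3} a^{T}_{2\hat{i}}\, a^{T}_{2k'}\, \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k'}}\, |(a^{T}\nabla)_2 f|^{2} = \left[ e^{2\beta\theta} \frac{\partial^{2} V}{\partial x \partial x} + 2 e^{\beta\theta} \frac{\partial^{2} V}{\partial x \partial y} + \frac{\partial^{2} V}{\partial y \partial y} \right] |(a^{T}\nabla)_2 f|^{2};$$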
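$$\sum_{i=1}^{2} \sum_{i', \hat{i} = 1}^{3} a^{T}_{i i'}\, a^{T}_{i \hat{i}}\, \frac{\partial^{2} z^{T}_{13}}{\partial x_{i'} \partial x_{\hat{i}}}\, \partial_y f\, (z^{T}\nabla)_1 f = \left[ \frac{\partial^{2} g}{\partial \theta \partial \theta} + e^{2\beta\theta} \frac{\partial^{2} g}{\partial x \partial x} + \frac{\partial^{2} g}{\partial y \partial y} + 2 e^{\beta\theta} \frac{\partial^{2} g}{\partial x \partial y} \right] \frac{|(z^{T}\nabla)_1 f|^{2}}{g}.$$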
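Both explicit expressions above are contractions of a Hessian with the horizontal metric $a a^{T}$. The following minimal sketch builds $a a^{T}$ for the displacement group and compares the contraction with the closed form; the Hessian entries, `beta`, and `theta` are hypothetical sample values, not data from the paper.

```python
import math


def horizontal_metric(beta, theta):
    """Return the 3x3 matrix a a^T for a = (X, Y) in coordinates (theta, x, y)."""
    e = math.exp(beta * theta)
    a_T = [[1.0, 0.0, 0.0],
           [0.0, e, 1.0]]
    return [[sum(a_T[i][r] * a_T[i][c] for i in range(2)) for c in range(3)]
            for r in range(3)]


def contract(metric, hessian):
    """Full contraction sum_{r,c} metric[r][c] * hessian[r][c]."""
    return sum(metric[r][c] * hessian[r][c] for r in range(3) for c in range(3))


beta, theta = 0.7, 0.3  # hypothetical sample values
# Hypothetical symmetric Hessian of a scalar function, ordered as (theta, x, y).
H = [[1.0, 0.2, -0.4],
     [0.2, 2.0, 0.5],
     [-0.4, 0.5, 3.0]]

lhs = contract(horizontal_metric(beta, theta), H)
e = math.exp(beta * theta)
# Closed form: H_theta_theta + e^{2 beta theta} H_xx + 2 e^{beta theta} H_xy + H_yy.
rhs = H[0][0] + e ** 2 * H[1][1] + 2 * e * H[1][2] + H[2][2]
print(lhs, rhs)
```

Keeping only the second row of $a^{T}$ in the same contraction gives the coefficient of $|(a^{T}\nabla)_2 f|^{2}$ in the first expression.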
The proof of Proposition 3 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 4–6 below. The following convergence result follows directly from Theorem 2.
Proposition 4.
If there exists $\kappa > 0$ as shown in Theorem 2, then the exponential dissipation result in the $L^{1}$ distance holds:
$$\int |\rho(t, x) - \pi(x)|\, dx = O(e^{-\kappa t}).$$
Similarly, we formulate the curvature tensor as a matrix $\mathfrak{R}$. Using the fact that $e^{\beta\theta} \frac{\partial f}{\partial x} = (a^{T}\nabla)_2 f + \frac{1}{g} (z^{T}\nabla)_1 f$, we have the following representation.
Corollary 2.
The matrix R associated with G has the following representation:
R 11 = 2 V θ θ β 2 ( 1 + 1 g 2 ) ; R 22 = e 2 β θ 2 V x x + 2 e β θ 2 V x y + 2 V y y β 2 g 2 β ( a T ) 1 V ; R 33 = β 2 2 g 2 Γ 1 ( log g , log g ) 2 Γ 1 ( log π , log g ) Γ 1 ( log g , V ) 1 g 2 g θ θ + e 2 β θ 2 g x x + 2 g y y + 2 e β θ 2 g x y ; R 12 = R 21 = 1 2 β e β θ V x + 2 ( e β θ 2 V θ x + 2 V θ y ) + β ( a T ) 2 V ; R 13 = R 31 = 1 2 β g ( a T ) 2 V g 2 V θ y ; R 23 = R 32 = 1 2 β g ( a T ) 1 β 2 g 1 2 g ( e β θ 2 V x y + 2 V y y ) .
Proof. 
The derivation for the explicit form of matrix R follows from a similar equivalent representation as shown in the proof of Corollary 1 and the explicit bilinear terms derived in Proposition 3. □
Remark 9.
By taking g ( θ , x , y ) = β as a constant, Proposition 3 reduces to a simple version; in particular, the tensors reduce to be
R a b ( f , f ) Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E ) = ( 1 + β 2 ) Γ 1 ( f , f ) + 1 2 Γ 1 z ( f , f ) + β 2 e β θ f x ( a T ) 2 f + β e β θ ( a T ) 2 V f x ( a T ) 1 f + β e β θ V x ( a T ) 2 f ( a T ) 1 f + 2 V θ θ | ( a T ) 1 f | 2 + 2 ( e β θ 2 V θ x + 2 V θ y ) ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 2 i ^ T a 2 k T 2 V x i ^ x k ) | ( a T ) 2 f | 2 β e β θ ( a T ) 1 V f x ( a T ) 2 f ;
R z b ( f , f ) = β 2 V θ y ( a T ) 1 f ( z T ) 1 f β ( e β θ 2 V x y + 2 V y y ) ( a T ) 2 f ( z T ) 1 f ; R π ( f , f ) = 0 .
Next, we present the following three key lemmas.
Lemma 4.
For displacement group G , we have
Q = 1 0 0 0 0 0 0 0 0 0 e β θ 1 0 0 0 0 0 0 0 0 0 e β θ 0 0 1 0 0 0 0 0 0 e 2 β θ e β θ 0 e β θ 1 ; P = 0 0 0 0 0 0 g ( θ , x , y ) 0 0 0 0 0 0 0 0 0 g ( θ , x , y ) e β θ g ( θ , x , y ) ; D T = ( 0 , β e β θ x f , 0 , 0 ) , E T = ( y f θ g , y f y g e β θ y f x g ) ; C T = ( 0 , β e β θ y f + β e 2 β θ x f , 0 , 0 , β e 2 β θ θ f , β e β θ θ f , 0 , 0 , 0 ) .
F = 0 0 g θ g y f 0 0 e β θ g y f y g + e 2 β θ g y f x g 0 0 g y f y g + e β θ g y f x g , G = 0 0 2 g y f θ g 0 0 2 e β θ g y f y g 2 e 2 β θ g y f x g 0 0 2 g y f y g 2 e β θ g y f x g .
Proof. 
The proof follows by plugging matrices a and z from (26) into Notation 1. □
Lemma 5.
On displacement group G , we have
[ Q X + D ] T [ Q X + D ] + [ P X + E ] T [ P X + E ] + 2 [ C T + F T + G T ] X = Hess a , z f 2 Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E .
In particular, we have
Hess a , z f 2 = [ X + Λ 1 ] T Q T Q [ X + Λ 1 ] + [ X + Λ 2 ] T P T P [ X + Λ 2 ] ; Λ 1 T = ( 0 , β x f , β y f 2 , β x f , 0 , 0 , β y f 2 , 0 , β θ f ) ; Λ 2 T = ( 0 , 0 , 0 , 0 , 0 , 0 , λ 6 , 0 , λ 9 ) ; λ 6 = θ g y f g β ( a T ) 2 f g 2 θ g y f g ;
λ 9 = ( a T ) 2 g y f g + β θ f g 2 ( a T ) 2 g y f g ; Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E = Γ 1 ( log g , log g ) Γ 1 z ( f , f ) β 2 ( 1 + 1 g 2 ) Γ 1 ( f , f ) + β 2 2 g 2 Γ 1 z ( f , f ) .
Lemma 6.
By routine computations, we obtain
R a b ( f , f ) = β 2 e β θ f x ( a T ) 2 f + β e β θ ( a T ) 2 V f x ( a T ) 1 f + β e β θ V x ( a T ) 2 f ( a T ) 1 f + 2 V θ θ | ( a T ) 1 f | 2 + 2 ( e β θ 2 V θ x + 2 V θ y ) ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 2 i ^ T a 2 k T 2 V x i ^ x k | ( a T ) 2 f | 2 β e β θ ( a T ) 1 V f x ( a T ) 2 f ; R z b ( f , f ) = i = 1 2 i , i ^ = 1 3 a i i T a i i ^ T 2 z 1 k ^ T x i x i ^ y f ( z T ) 1 f k = 1 2 ( a T ) k z 13 T ( a T ) k V y f ( z T ) 1 f g 2 V θ y ( a T ) 1 f ( z T ) 1 f g ( e β θ 2 V x y + 2 V y y ) ( a T ) 2 f ( z T ) 1 f ; R π ( f , f ) = 2 l = 1 2 l , l ^ = 1 3 a l l T a l l ^ T 2 z 13 T x l x l ^ y f ( z T ) 1 f 2 l = 1 2 l , l ^ = 1 3 a l l T a l l ^ T z 13 T x l ^ z 13 T x l | y f | 2 2 l = 1 2 l ^ = 1 3 ( a T ) l log π a l l ^ T z 13 T x l ^ y f ( z T ) 1 f .
Proof of Lemma 5.
According to Lemma 4, and observing that $G = -2F$ and $(a^{T}\nabla)_2 f = e^{\beta\theta} \partial_x f + \partial_y f$, we first have
2 C T X = 2 [ β e β θ y f + β e 2 β θ x f ] 2 f θ x + 2 [ β e 2 β θ θ f ] 2 f x x + 2 [ β e β θ θ f ] 2 f x y ; 2 [ F T + G T ] X = 2 g θ g y f 2 f θ y + e β θ g ( a T ) 2 g y f 2 f x y + g ( a T ) 2 g y f 2 f y y .
By direct computations, we have
[ Q X + D ] T [ Q X + D ] + [ P X + E ] T [ P X + E ] + 2 C T X + 2 F T X + 2 G T X = 2 f θ θ 2 + e 2 β θ 2 f x x + 2 e β θ 2 f x y + 2 f y y 2 + e β θ 2 f θ x + 2 f θ y + β e β θ f x 2 + e β θ 2 f θ x + 2 f θ y 2 + g 2 f θ y y f θ g 2 + g e β θ 2 f x y g 2 f y y ( a T ) 2 g y f 2 + 2 [ β e β θ y f + β e 2 β θ x f ] 2 f θ x + 2 [ β e 2 β θ θ f ] 2 f x x + 2 [ β e β θ θ f ] 2 f x y 2 g θ g y f 2 f θ y + e β θ g ( a T ) 2 g y f 2 f x y + g ( a T ) 2 g y f 2 f y y = 2 f θ θ 2 + e 2 β θ 2 f x x + 2 e β θ 2 f x y + 2 f y y 2 + e β θ 2 f θ x + 2 f θ y + β e β θ f x 2
+ e β θ 2 f θ x + 2 f θ y 2 + g 2 f θ y y f θ g 2 + g e β θ 2 f x y g 2 f y y ( a T ) 2 g y f 2 + 2 β ( a T ) 2 f e β θ 2 f θ x + 2 f θ y 2 β ( a T ) 2 f 2 f θ y 2 g θ g y f 2 f θ y 2 β θ f 2 e β θ 2 f x y + e 2 β θ 2 f x x + 2 f y y + 2 β θ f e β θ 2 f x y + 2 f y y 2 g ( a T ) 2 g y f e β θ 2 f x y + 2 f y y .
Completing the squares for the above terms, we have
[ Q X + D ] T [ Q X + D ] + [ P X + E ] T [ P X + E ] + 2 C T X + 2 F T X + 2 G T X = 2 f θ θ 2 + e 2 β θ 2 f x x + 2 e β θ 2 f x y + 2 f y y β θ f 2 β 2 | θ f | 2 + e β θ 2 f θ x + 2 f θ y + β e β θ f x 2 + e β θ 2 f θ x + 2 f θ y + β ( a T ) 2 f 2 β 2 | ( a T ) 2 f | 2 + g 2 f θ y + θ g y f β ( a T ) 2 f g θ g y f 2 β ( a T ) 2 f g + θ g y f 2 + g e β θ 2 f x y + g 2 f y y + ( a T ) 2 g y f + β θ f g ( a T ) 2 g y f 2 β θ f g ( a T ) 2 g y f 2 + 2 β ( a T ) 2 f g + θ g y f θ g y f 2 y f ( a T ) 2 g × β θ f g ( a T ) 2 g y f .
The first-order terms generate Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E , and the sum of squares terms generate vectors Λ 1 and Λ 2 . We further formulate the above two terms as below:
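$$\left[ e^{\beta\theta} \frac{\partial^{2} f}{\partial \theta \partial x} + \frac{\partial^{2} f}{\partial \theta \partial y} + \beta e^{\beta\theta} \frac{\partial f}{\partial x} \right]^{2} + \left[ e^{\beta\theta} \frac{\partial^{2} f}{\partial \theta \partial x} + \frac{\partial^{2} f}{\partial \theta \partial y} + \beta (a^{T}\nabla)_2 f \right]^{2} = 2 \left[ e^{\beta\theta} \frac{\partial^{2} f}{\partial \theta \partial x} + \frac{\partial^{2} f}{\partial \theta \partial y} + \beta e^{\beta\theta} \frac{\partial f}{\partial x} + \frac{\beta}{2} \partial_y f \right]^{2} + \frac{\beta^{2}}{2} |\partial_y f|^{2}.$$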
Adding β 2 2 | y f | 2 into the term Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E again, we further expand as below:
Λ 1 T Q T Q Λ 1 Λ 2 T P T P Λ 2 + D T D + E T E = β 2 [ | θ f | 2 + | ( a T ) 2 f | 2 ] β θ f g ( a T ) 2 g y f 2 β ( a T ) 2 f g + θ g y f 2 + 2 [ β ( a T ) 2 f g + θ g y f ] θ g y f 2 y f ( a T ) 2 g × β θ f g ( a T ) 2 g y f + β 2 2 | y f | 2
= β 2 Γ 1 ( f , f ) β 2 g 2 | ( a T ) 1 f | 2 | ( a T ) 2 ( log g ) | 2 | ( z T ) 1 f | 2 2 β g ( a T ) 2 log g ( a T ) 1 f ( z T ) 1 f β 2 g 2 | ( a T ) 2 f | 2 | ( a T ) 1 log g | 2 | ( z T ) 1 f | 2 + 2 β g ( a T ) 1 log g ( a T ) 2 f ( z T ) 1 f 2 β g ( a T ) 1 log g ( a T ) 2 f ( z T ) 1 f + 2 | ( a T ) 1 log g | 2 | ( z T ) 1 f | 2 + 2 β g ( a T ) 2 log g ( a T ) 1 f ( z T ) 1 f + 2 | ( a T ) 2 log g | 2 | ( z T ) 1 f | 2 + β 2 2 g 2 Γ 1 z ( f , f ) .
By grouping the bilinear terms of f , we obtain
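$$-\Lambda_1^{T} Q^{T} Q \Lambda_1 - \Lambda_2^{T} P^{T} P \Lambda_2 + D^{T} D + E^{T} E = \Gamma_1(\log g, \log g)\, \Gamma_1^{z}(f, f) - \beta^{2} \left( 1 + \frac{1}{g^{2}} \right) \Gamma_1(f, f) + \frac{\beta^{2}}{2 g^{2}}\, \Gamma_1^{z}(f, f).$$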
We are now left to compute the three tensor terms.
Proof of Lemma 6.
For displacement group G , we have n = 2 and m = 1 . Recall Theorem 1; we denote R a b ( f , f ) = R a ( f , f ) + R b ( f , f ) , where R b ( f , f ) represents the tensor term involving drift b. We thus have
R a ( f , f ) = i , k = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R 2 + i , k = 2 n i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a k k ^ T x i ^ ) ( f x k ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( a T ) k f R 2 , = I 1 + I 2 + I 3 + I 4 .
By direct computations, we have
I 1 = i = 1 2 i , i ^ , k ^ = 1 3 [ a i i T ( a i i ^ T x i a 1 k ^ T x i ^ f x k ^ ) ( a T ) 1 f + a i i T ( a i i ^ T x i a 2 k ^ T x i ^ f x k ^ ) ( a T ) 2 f ] = 0 ;
I 2 = i = 2 n i , i ^ , k ^ = 1 3 [ a i i T a i i ^ T ( x i a 1 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 1 f + a i i T a i i ^ T ( x i a 2 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 2 f ] = a 11 T a 11 T 2 θ θ a 22 T f x ( a T ) 2 f = β 2 e β θ f x ( a T ) 2 f ; I 3 = i = 1 2 i , i ^ , k ^ = 1 3 [ a 1 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 1 f + a 2 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 2 f ] = 0 ; I 4 = i = 1 2 i , i ^ , k ^ = 1 3 [ a 1 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 1 f + a 2 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 2 f ] = 0 .
For the drift term in tensor R a b , taking b = a a T V , we obtain
R b a = i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( a T ) i f + i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( a T ) i f + i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( a T ) i f i , k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T a i i ^ T x k ^ V x k f x i ^ ( a T ) i f = J 1 + J 2 + J 3 + J 4 .
Plugging into the matrix a T , we obtain
J 1 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 2 f + i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 2 f = β e β θ ( a T ) 2 V f x ( a T ) 1 f ;
J 2 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 2 f + i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 2 f = β e β θ V x ( a T ) 2 f ( a T ) 1 f ;
J 3 = i ^ , k = 1 3 a 1 i ^ T a 1 k T 2 V x i ^ x k | ( a T ) 1 f | 2 + a 2 i ^ T a 1 k T 2 V x i ^ x k ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 1 i ^ T a 2 k T 2 V x i ^ x k ( a T ) 2 f ( a T ) 1 f + a 2 i ^ T a 2 k T 2 V x i ^ x k | ( a T ) 2 f | 2 = 2 V θ θ | ( a T ) 1 f | 2 + 2 ( e β θ 2 V θ x + 2 V θ y ) ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 2 i ^ T a 2 k T p a 2 V x i ^ x k ) | ( a T ) 2 f | 2 ; J 4 = i ^ , k ^ , k = 1 3 a 1 k ^ T a 1 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 1 k ^ T a 1 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f i ^ , k ^ , k = 1 3 a 2 k ^ T a 2 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 2 k ^ T a 2 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f = β e β θ ( a T ) 1 V f x ( a T ) 2 f .
Combining the above computations, we obtain the tensor R a b . Now, we turn to the second tensor R z b , which has the following form:
R z b ( f , f ) = i = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i z 1 k ^ T x i ^ f x k ^ ) , ( z T ) 1 f R + i = 1 2 i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i z 1 k ^ T x i ^ ) ( f x k ^ ) , ( z T ) 1 f R i = 1 2 i , i ^ , k ^ = 1 3 z 1 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( z T ) 1 f R i = 1 2 i , i ^ , k ^ = 1 3 z 1 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( z T ) 1 f R i = 1 2 i ^ , k ^ = 1 3 ( z 1 i ^ T b k ^ x i ^ f x k ^ b k ^ z 1 i ^ T x k ^ f x i ^ ) , ( z T f ) 1 R , = I 1 z + I 2 z + I 3 z + I 4 z + R b z ( f , f ) .
where we denote further that
R b z ( f , f ) = i ^ , k ^ = 1 3 ( z 1 i ^ T b k ^ x i ^ f x k ^ b k ^ z i i ^ T x k ^ f x i ^ ) ( z T f ) 1 .
By taking b = a a T V , we further obtain that
R b z ( f , f ) = i ^ , k ^ = 1 3 z 1 i ^ T b k ^ x i ^ f x k ^ ( z T f ) 1 b k ^ z i i ^ T x k ^ f x i ^ ( z T f ) 1 = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f + k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f
+ k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = J 1 z + J 2 z + J 3 z + J 4 z .
By direct computations, it is not hard to observe that
I 1 z = i = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i z 1 k ^ T x i ^ f x k ^ ) , ( z T ) 1 f R = 0 ; I 2 z = i = 1 2 i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i z 1 k ^ T x i ^ ) ( f x k ^ ) , ( z T ) 1 f R = i = 1 2 i , i ^ = 1 3 a i i T a i i ^ T 2 z 1 k ^ T x i x i ^ y f ( z T ) 1 f ; I 3 z = i = 1 2 i , i ^ , k ^ = 1 3 z 1 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( z T ) 1 f R = 0 ; I 4 z = i = 1 2 i , i ^ , k ^ = 1 3 z 1 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( z T ) 1 f R = 0 ,
and
J 1 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f = 0 ; J 2 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f = 0 ; J 4 z = k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = k = 1 2 ( a T ) k z 13 T ( a T ) k V y f ( z T ) 1 f ; J 3 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f = i ^ , k ^ , k = 1 3 z 1 i ^ T a 1 k ^ T a 1 k T 2 V x i ^ x k f x k ^ ( z T ) 1 f + z 1 i ^ T a 2 k ^ T a 2 k T 2 V x i ^ x k f x k ^ ( z T ) 1 f = i ^ , k = 1 3 z 1 i ^ T a 1 k T 2 V x i ^ x k ( a T ) 1 f ( z T ) 1 f + z 1 i ^ T a 2 k T 2 V x i ^ x k ( a T ) 2 f ( z T ) 1 f = g 2 V θ y ( a T ) 1 f ( z T ) 1 f g ( e β θ 2 V x y + 2 V y y ) ( a T ) 2 f ( z T ) 1 f .
Now, we are left to compute the term R π . Recall that
R π ( f , f ) = 2 k = 1 1 i = 1 2 k , k ^ , i ^ , i = 1 3 x k z k k T z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i + 2 k = 1 1 i = 1 2 k , k ^ , i ^ , i = 1 3 z k k T x k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i + z k k T z k k ^ T 2 x k x k ^ a i i ^ T f x i ^ a i i T f x i + z k k T z k k ^ T x k ^ a i i ^ T f x i ^ x k a i i T f x i . + 2 k = 1 1 i = 1 2 k ^ , i ^ , i = 1 3 ( z T log π ) k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i 2 j = 1 1 l = 1 2 l , l ^ , j ^ , j = 1 3 x l a l l T a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j 2 j = 1 1 l = 1 2 l , l ^ , j ^ , j = 1 3 a l l T x l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j + a l l T a l l ^ T 2 x l x l ^ z j j ^ T f x j ^ z j j T f x j + a l l T a l l ^ T x l ^ z j j ^ T f x j ^ x l z j j T f x j 2 j = 1 1 l = 1 2 l ^ , j ^ , j = 1 3 ( a T log π ) l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j = i = 1 10 K i .
By direct computation, we obtain
K 1 = 0 , K 2 = 0 , K 3 = 0 , K 4 = 0 , K 5 = 0 , K 6 = 0 , K 7 = 0 ; K 8 = 2 l = 1 2 l , l ^ = 1 3 a l l T a l l ^ T 2 z 13 T x l x l ^ y f ( z T ) 1 f ; K 9 = 2 l = 1 2 l , l ^ = 1 3 a l l T a l l ^ T z 13 T x l ^ z 13 T x l | y f | 2 = 2 Γ 1 ( log g , log g ) | ( z T ) 1 f | 2 ; K 10 = 2 l = 1 2 l ^ = 1 3 ( a T ) l log π a l l ^ T z 13 T x l ^ y f ( z T ) 1 f = 2 Γ 1 ( log π , log g ) | ( z T ) 1 f | 2 .

3.3. Martinet Flat Sub-Riemannian Structure

In this part, we apply our result to the Martinet flat sub-Riemannian structure, which satisfies the bracket-generating condition and has a non-equiregular sub-Riemannian structure (see [37]). The sub-Riemannian structure is defined on $\mathbb{R}^{3}$ through the kernel of the one-form $\eta := dz - \frac{1}{2} y^{2}\, dx$. A global orthonormal basis for the horizontal distribution $H$ admits the following differential operator representation in local coordinates $(x, y, z)$:
$$X = \frac{\partial}{\partial x} + \frac{y^{2}}{2} \frac{\partial}{\partial z}, \qquad Y = \frac{\partial}{\partial y}.$$
The commutation relations give
$$[X, Y] = -y Z, \qquad [Y, [X, Y]] = -Z, \qquad \text{where } Z = \frac{\partial}{\partial z}.$$
To apply it in our framework, we take
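$$a = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \frac{y^{2}}{2} & 0 \end{pmatrix}, \qquad a^{T} = \begin{pmatrix} 1 & 0 & \frac{y^{2}}{2} \\ 0 & 1 & 0 \end{pmatrix}, \qquad z^{T} = (0, 0, 1), \qquad a a^{T} = \begin{pmatrix} 1 & 0 & \frac{y^{2}}{2} \\ 0 & 1 & 0 \\ \frac{y^{2}}{2} & 0 & \frac{y^{4}}{4} \end{pmatrix}.$$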
Thus, the sub-Riemannian structure has the form ( M , H , ( a a T ) | H ) .
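The non-equiregular behaviour can be seen numerically: on the surface $\{y = 0\}$ the first bracket $[X, Y]$ vanishes, and the direction $Z$ is recovered only by the second bracket $[Y, [X, Y]]$. The sketch below assumes the coordinate fields $X = (1, 0, y^{2}/2)^{T}$ and $Y = (0, 1, 0)^{T}$; the helper `bracket`, the step size, and the sample points are hypothetical.

```python
def X(p):
    _x, y, _z = p
    return (1.0, 0.0, 0.5 * y * y)


def Y(p):
    return (0.0, 1.0, 0.0)


def bracket(U, W, h=1e-4):
    """Return the vector field [U, W] = (U . grad) W - (W . grad) U."""
    def field(p):
        def along(F, v):
            # Central difference of F at p in the direction v.
            plus = F(tuple(pi + h * vi for pi, vi in zip(p, v)))
            minus = F(tuple(pi - h * vi for pi, vi in zip(p, v)))
            return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))
        dW = along(W, U(p))
        dU = along(U, W(p))
        return tuple(a - b for a, b in zip(dW, dU))
    return field


XY = bracket(X, Y)      # expected to equal -y * Z
YXY = bracket(Y, XY)    # expected to equal -Z

on_surface = (0.4, 0.0, -0.2)  # a point with y = 0
generic = (0.4, 1.5, -0.2)     # a point with y != 0

print(XY(on_surface), XY(generic))
print(YXY(on_surface))
```

Since the vector fields are polynomial of degree two in $y$, central differences reproduce the brackets up to rounding error.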
Proposition 5.
In this setting,
π = e y 2 2 V ,
then
a a T log π = a a + a a T V .
Proof. 
The proof follows from the observation that
a a = 0 y 0 T , a a T log e y 2 2 = 0 y 0 T .
Similar to the previous displacement group case, we have the following identity.
Proposition 6.
For any smooth function $f \in C^{\infty}(\mathbb{M})$, one has
$$\Gamma_2(f, f) + \Gamma_2^{z,\pi}(f, f) = \|\operatorname{Hess}_{a,z} f\|^{2} + \mathfrak{R}(f, f),$$
where
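$$\Lambda_1^{T} = \left( 0, \tfrac{y}{2} \partial_z f, 0, \tfrac{y}{2} \partial_z f, 0, 0, 0, 0, 0 \right); \qquad \Lambda_2^{T} = \left( 0, 0, 0, 0, 0, 0, -y\, \partial_y f, \tfrac{y^{3}}{2} \partial_z f + y\, \partial_x f, 0 \right);$$
$$\begin{aligned}
&\mathfrak{R}_{ab}(f, f) - \Lambda_1^{T} Q^{T} Q \Lambda_1 - \Lambda_2^{T} P^{T} P \Lambda_2 + D^{T} D + E^{T} E \\
&\quad = \frac{y^{2}}{2} \Gamma_1^{z}(f, f) - y^{2}\, \Gamma_1(f, f) + \frac{\partial f}{\partial z} (a^{T}\nabla)_1 f + y\, (a^{T}\nabla)_1 V\, \frac{\partial f}{\partial z} (a^{T}\nabla)_2 f + y\, \frac{\partial V}{\partial z} (a^{T}\nabla)_1 f\, (a^{T}\nabla)_2 f \\
&\qquad + \sum_{\hat{i}, k' = 1}^{3} a^{T}_{1\hat{i}}\, a^{T}_{1k'}\, \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k'}}\, |(a^{T}\nabla)_1 f|^{2} + 2 \left( \frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z} \right) (a^{T}\nabla)_1 f\, (a^{T}\nabla)_2 f \\
&\qquad + \frac{\partial^{2} V}{\partial y \partial y}\, |(a^{T}\nabla)_2 f|^{2} - y\, \frac{\partial V}{\partial y} \frac{\partial f}{\partial z} (a^{T}\nabla)_1 f; \\
&\mathfrak{R}_{zb}(f, f) = \left( \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z} \right) (a^{T}\nabla)_1 f\, (z^{T}\nabla)_1 f + \frac{\partial^{2} V}{\partial y \partial z} (a^{T}\nabla)_2 f\, (z^{T}\nabla)_1 f; \\
&\mathfrak{R}_{\pi}(f, f) = 0.
\end{aligned}$$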
In particular, we have
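$$\sum_{\hat{i}, k' = 1}^{3} a^{T}_{1\hat{i}}\, a^{T}_{1k'}\, \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k'}}\, |(a^{T}\nabla)_1 f|^{2} = \left[ \frac{\partial^{2} V}{\partial x \partial x} + y^{2} \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{4}}{4} \frac{\partial^{2} V}{\partial z \partial z} \right] |(a^{T}\nabla)_1 f|^{2}.$$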
The proof of Proposition 6 follows from the proof of Theorem 1 (i.e., Theorem 3) and Lemmas 7–9 below. The following convergence results are a direct consequence of Theorem 2.
Proposition 7.
If there exists $\kappa > 0$ as shown in Theorem 2, then the exponential dissipation result in the $L^{1}$ distance holds:
$$\int |\rho(t, x) - \pi(x)|\, dx = O(e^{-\kappa t}).$$
Similarly, we summarize the sub-Riemannian Ricci tensor in terms of R as follows.
Corollary 3.
The matrix R associated with the Martinet sub-Riemannian structure has the following form:
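$$\begin{aligned}
\mathfrak{R}_{11} &= \frac{\partial^{2} V}{\partial x \partial x} + y^{2} \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{4}}{4} \frac{\partial^{2} V}{\partial z \partial z} - y^{2}; \qquad \mathfrak{R}_{22} = \frac{\partial^{2} V}{\partial y \partial y} - y^{2}; \qquad \mathfrak{R}_{33} = \frac{y^{2}}{2}; \\
\mathfrak{R}_{12} &= \mathfrak{R}_{21} = \frac{y}{2} \frac{\partial V}{\partial z} + \left( \frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z} \right); \\
\mathfrak{R}_{13} &= \mathfrak{R}_{31} = \frac{1}{2} - \frac{y}{2} \frac{\partial V}{\partial y} + \frac{1}{2} \left( \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z} \right); \\
\mathfrak{R}_{23} &= \mathfrak{R}_{32} = \frac{1}{2}\, y\, (a^{T}\nabla)_1 V + \frac{1}{2} \frac{\partial^{2} V}{\partial y \partial z}.
\end{aligned}$$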
Proof. 
The proof follows from the similar equivalent matrix formulation as shown in the proof of Corollary 1 and the explicit bilinear forms in Proposition 6. □
Next, we prove the following three key lemmas.
Lemma 7.
For Martinet sub-Riemannian structure ( M , H , ( a a T ) | H ) , we have
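$$Q = \begin{pmatrix}
1 & 0 & \frac{y^{2}}{2} & 0 & 0 & 0 & \frac{y^{2}}{2} & 0 & \frac{y^{4}}{4} \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & \frac{y^{2}}{2} & 0 \\
0 & 0 & 0 & 1 & 0 & \frac{y^{2}}{2} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0
\end{pmatrix}; \qquad
P = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & \frac{y^{2}}{2} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix};$$
$$C^{T} = \left( 0, 0, 0, 0, 0, \tfrac{y^{3}}{2} \partial_z f + y\, \partial_x f, -y\, \partial_y f, 0, -\tfrac{y^{3}}{2} \partial_y f \right); \qquad D^{T} = (0, 0, y\, \partial_z f, 0), \qquad E^{T} = (0, 0);$$
$$F^{T} = G^{T} = (0, 0, 0, 0, 0, 0, 0, 0, 0).$$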
Proof. 
Plugging matrices a and z from (27) into Notation 1, we complete the proof. □
Lemma 8.
For the Martinet sub-Riemannian structure, F and G are zero vectors, and we have
$$[QX + D]^{T} [QX + D] + [PX + E]^{T} [PX + E] + 2 C^{T} X = \|\operatorname{Hess}_{a,z} f\|^{2} - \Lambda_1^{T} Q^{T} Q \Lambda_1 - \Lambda_2^{T} P^{T} P \Lambda_2 + D^{T} D + E^{T} E.$$
In particular, we have
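$$\begin{aligned}
\|\operatorname{Hess}_{a,z} f\|^{2} &= [X + \Lambda_1]^{T} Q^{T} Q [X + \Lambda_1] + [X + \Lambda_2]^{T} P^{T} P [X + \Lambda_2]; \\
\Lambda_1^{T} &= \left( 0, \tfrac{y}{2} \partial_z f, 0, \tfrac{y}{2} \partial_z f, 0, 0, 0, 0, 0 \right); \\
\Lambda_2^{T} &= \left( 0, 0, 0, 0, 0, 0, -y\, \partial_y f, \tfrac{y^{3}}{2} \partial_z f + y\, \partial_x f, 0 \right); \\
-\Lambda_1^{T} Q^{T} Q \Lambda_1 - \Lambda_2^{T} P^{T} P \Lambda_2 + D^{T} D + E^{T} E &= \frac{y^{2}}{2}\, \Gamma_1^{z}(f, f) - y^{2}\, \Gamma_1(f, f).
\end{aligned}$$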
Lemma 9.
By routine computations, we obtain
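$$\begin{aligned}
\mathfrak{R}_{ab}(f, f) &= \frac{\partial f}{\partial z} (a^{T}\nabla)_1 f + y\, (a^{T}\nabla)_1 V\, \frac{\partial f}{\partial z} (a^{T}\nabla)_2 f + y\, \frac{\partial V}{\partial z} (a^{T}\nabla)_1 f\, (a^{T}\nabla)_2 f \\
&\quad + \sum_{\hat{i}, k' = 1}^{3} a^{T}_{1\hat{i}}\, a^{T}_{1k'}\, \frac{\partial^{2} V}{\partial x_{\hat{i}} \partial x_{k'}}\, |(a^{T}\nabla)_1 f|^{2} + 2 \left( \frac{\partial^{2} V}{\partial x \partial y} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial y \partial z} \right) (a^{T}\nabla)_1 f\, (a^{T}\nabla)_2 f \\
&\quad + \frac{\partial^{2} V}{\partial y \partial y}\, |(a^{T}\nabla)_2 f|^{2} - y\, \frac{\partial V}{\partial y} \frac{\partial f}{\partial z} (a^{T}\nabla)_1 f; \\
\mathfrak{R}_{zb}(f, f) &= \left( \frac{\partial^{2} V}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} V}{\partial z \partial z} \right) (a^{T}\nabla)_1 f\, (z^{T}\nabla)_1 f + \frac{\partial^{2} V}{\partial y \partial z} (a^{T}\nabla)_2 f\, (z^{T}\nabla)_1 f; \\
\mathfrak{R}_{\pi}(f, f) &= 0.
\end{aligned}$$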
Proof of Lemma 8.
Since F and G are zero vectors, we have
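$$2 C^{T} X = 2 \left[ \frac{\partial^{2} f}{\partial y \partial z} \left( \frac{y^{3}}{2} \partial_z f + y\, \partial_x f \right) - \frac{\partial^{2} f}{\partial x \partial z} \left( y\, \partial_y f \right) - \frac{\partial^{2} f}{\partial z \partial z} \left( \frac{y^{3}}{2} \partial_y f \right) \right].$$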
By routine computation, we observe that
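$$\begin{aligned}
&[QX + D]^{T} [QX + D] + [PX + E]^{T} [PX + E] + 2 C^{T} X \\
&= \left[ \frac{\partial^{2} f}{\partial x \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{4}}{4} \frac{\partial^{2} f}{\partial z \partial z} \right]^{2} + \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y\, \partial_z f \right]^{2} + \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} \right]^{2} \\
&\quad + \left[ \frac{\partial^{2} f}{\partial y \partial y} \right]^{2} + \left[ \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial z} \right]^{2} + \left[ \frac{\partial^{2} f}{\partial z \partial y} \right]^{2} \\
&\quad + 2 \frac{\partial^{2} f}{\partial y \partial z} \left( \frac{y^{3}}{2} \partial_z f + y\, \partial_x f \right) - 2 \frac{\partial^{2} f}{\partial x \partial z} \left( y\, \partial_y f \right) - 2 \frac{\partial^{2} f}{\partial z \partial z} \left( \frac{y^{3}}{2} \partial_y f \right) \\
&= \left[ \frac{\partial^{2} f}{\partial x \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial x \partial z} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{4}}{4} \frac{\partial^{2} f}{\partial z \partial z} \right]^{2} + \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y\, \partial_z f \right]^{2} + \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} \right]^{2} \\
&\quad + \left[ \frac{\partial^{2} f}{\partial y \partial y} \right]^{2} + \left[ \frac{\partial^{2} f}{\partial z \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial z} - y\, \partial_y f \right]^{2} + \left[ \frac{\partial^{2} f}{\partial z \partial y} + \left( \frac{y^{3}}{2} \partial_z f + y\, \partial_x f \right) \right]^{2} \\
&\quad - y^{2} |\partial_y f|^{2} - \left( \frac{y^{3}}{2} \partial_z f + y\, \partial_x f \right)^{2} \\
&= \|\operatorname{Hess}_{a,z} f\|^{2} + \frac{y^{2}}{2}\, \Gamma_1^{z}(f, f) - y^{2}\, \Gamma_1(f, f),
\end{aligned}$$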
where we use the fact
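$$\left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + y\, \partial_z f \right]^{2} + \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} \right]^{2} = 2 \left[ \frac{\partial^{2} f}{\partial y \partial x} + \frac{y^{2}}{2} \frac{\partial^{2} f}{\partial z \partial y} + \frac{1}{2}\, y\, \partial_z f \right]^{2} + \frac{y^{2}}{2} |\partial_z f|^{2}.$$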
The proof is thus completed. □
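The identity of Lemma 8 is purely algebraic in the first and second derivatives of $f$, so it can be sanity-checked with random numbers. The following is a minimal sketch under the assumption that $X$ stacks the second derivatives in the order $(f_{xx}, f_{xy}, f_{xz}, f_{yx}, f_{yy}, f_{yz}, f_{zx}, f_{zy}, f_{zz})$ and that $Q$, $P$, $C$, $D$, $E$, $\Lambda_1$, $\Lambda_2$ carry the signs as we read them from Lemmas 7 and 8; the function name `lemma8_sides` and the sampling ranges are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def lemma8_sides(seed):
    """Return (lhs, rhs) of the Lemma 8 identity for randomly sampled derivatives."""
    rng = random.Random(seed)
    y = rng.uniform(-2.0, 2.0)
    fx, fy, fz = (rng.uniform(-1.0, 1.0) for _ in range(3))
    # Random symmetric Hessian of f, flattened row by row into the vector X.
    h = [[0.0] * 3 for _ in range(3)]
    for r in range(3):
        for c in range(r, 3):
            h[r][c] = h[c][r] = rng.uniform(-1.0, 1.0)
    X = [h[r][c] for r in range(3) for c in range(3)]

    s = 0.5 * y * y                      # y^2 / 2
    w = 0.5 * y ** 3 * fz + y * fx       # y^3/2 * f_z + y * f_x
    Q = [[1, 0, s, 0, 0, 0, s, 0, s * s],
         [0, 1, 0, 0, 0, 0, 0, s, 0],
         [0, 0, 0, 1, 0, s, 0, 0, 0],
         [0, 0, 0, 0, 1, 0, 0, 0, 0]]
    P = [[0, 0, 0, 0, 0, 0, 1, 0, s],
         [0, 0, 0, 0, 0, 0, 0, 1, 0]]
    C = [0, 0, 0, 0, 0, w, -y * fy, 0, -0.5 * y ** 3 * fy]
    D = [0, 0, y * fz, 0]
    E = [0, 0]
    lam1 = [0, 0.5 * y * fz, 0, 0.5 * y * fz, 0, 0, 0, 0, 0]
    lam2 = [0, 0, 0, 0, 0, 0, -y * fy, w, 0]

    qx_d = [q + d for q, d in zip(matvec(Q, X), D)]
    px_e = [p + e for p, e in zip(matvec(P, X), E)]
    lhs = dot(qx_d, qx_d) + dot(px_e, px_e) + 2.0 * dot(C, X)

    q_shift = matvec(Q, [x + l for x, l in zip(X, lam1)])
    p_shift = matvec(P, [x + l for x, l in zip(X, lam2)])
    hess_norm_sq = dot(q_shift, q_shift) + dot(p_shift, p_shift)
    gamma1 = (fx + s * fz) ** 2 + fy ** 2    # Gamma_1(f, f)
    gamma1_z = fz ** 2                       # Gamma_1^z(f, f)
    rhs = hess_norm_sq + s * gamma1_z - y * y * gamma1
    return lhs, rhs


for seed in range(3):
    print(lemma8_sides(seed))
```

Agreement of the two sides for several seeds does not replace the proof, but it is a quick way to catch a misplaced sign in $C$, $\Lambda_1$, or $\Lambda_2$.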
We are now left to compute the three tensor terms.
Proof of Lemma 9.
Similar to the proof of Lemma 6, we have
R a ( f , f ) = i , k = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R 2 + i , k = 2 n i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a k k ^ T x i ^ ) ( f x k ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( a T ) k f R 2 i , k = 1 2 i , i ^ , k ^ = 1 3 a k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( a T ) k f R 2 , = I 1 + I 2 + I 3 + I 4 .
By direct computations, we have
I 1 = i = 1 2 i , i ^ , k ^ = 1 3 a i i T ( a i i ^ T x i a 1 k ^ T x i ^ f x k ^ ) ( a T ) 1 f + a i i T ( a i i ^ T x i a 2 k ^ T x i ^ f x k ^ ) ( a T ) 2 f = 0 ; I 2 = i = 2 n i , i ^ , k ^ = 1 3 a i i T a i i ^ T ( x i a 1 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 1 f + a i i T a i i ^ T ( x i a 2 k ^ T x i ^ ) ( f x k ^ ) ( a T ) 2 f = a 22 T a 22 T 2 y y a 13 T f z ( a T ) 1 f = f z ( a T ) 1 f ; I 3 = i = 1 2 i , i ^ , k ^ = 1 3 a 1 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 1 f + a 2 k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) ( a T ) 2 f = 0 ; I 4 = i = 1 2 i , i ^ , k ^ = 1 3 a 1 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 1 f + a 2 k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ ( a T ) 2 f = 0 .
For the drift term, we take b = a a T V
R b a = i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( a T ) i f + a i i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( a T ) i f + i , k = 1 2 i ^ , k ^ , k = 1 3 a i i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( a T ) i f i , k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T a i i ^ T x k ^ V x k f x i ^ ( a T ) i f = J 1 + J 2 + J 3 + J 4 .
Plugging into the matrices of a T , we obtain
J 1 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k ^ T x i ^ a 1 k T V x k f x k ^ ( a T ) 2 f + i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k ^ T x i ^ a 2 k T V x k f x k ^ ( a T ) 2 f = a 22 T a 13 T y ( a T ) 1 V f z ( a T ) 2 f = y ( a T ) 1 V f z ( a T ) 2 f ;
J 2 = i ^ , k ^ , k = 1 3 a 1 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 1 k T x i ^ a 1 k ^ T V x k f x k ^ ( a T ) 2 f + i ^ , k ^ , k = 1 3 a 1 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 1 f + a 2 i ^ T a 2 k T x i ^ a 2 k ^ T V x k f x k ^ ( a T ) 2 f = y V z ( a T ) 1 f ( a T ) 2 f ;
J 3 = i ^ , k = 1 3 a 1 i ^ T a 1 k T 2 V x i ^ x k | ( a T ) 1 f | 2 + a 2 i ^ T a 1 k T 2 V x i ^ x k ( a T ) 1 f ( a T ) 2 f + i ^ , k = 1 3 a 1 i ^ T a 2 k T 2 V x i ^ x k ( a T ) 2 f ( a T ) 1 f + a 2 i ^ T a 2 k T 2 V x i ^ x k | ( a T ) 2 f | 2 = i ^ , k = 1 3 a 1 i ^ T a 1 k T 2 V x i ^ x k | ( a T ) 1 f | 2 + 2 ( 2 V x y + y 2 2 2 V y z ) ( a T ) 1 f ( a T ) 2 f + 2 V y y | ( a T ) 2 f | 2 ;
J 4 = i ^ , k ^ , k = 1 3 a 1 k ^ T a 1 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 1 k ^ T a 1 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f
i ^ , k ^ , k = 1 3 a 2 k ^ T a 2 k T a 1 i ^ T x k ^ V x k f x i ^ ( a T ) 1 f + a 2 k ^ T a 2 k T a 2 i ^ T x k ^ V x k f x i ^ ( a T ) 2 f = y V y f z ( a T ) 1 f .
Combining the above computations, we obtain the tensor $\mathfrak{R}_{ab}$. Now, we turn to the second tensor $\mathfrak{R}_{zb}$. Since $z^{T} = (0, 0, 1)$ is constant, only the drift term of the tensor $\mathfrak{R}_{zb}$ remains, where we denote
R b z ( f , f ) = i ^ , k ^ = 1 3 ( z 1 i ^ T b k ^ x i ^ f x k ^ b k ^ z 1 i ^ T x k ^ f x i ^ ) ( z T ) 1 f .
By taking b = a a T V , we further obtain that
R b z ( f , f ) = i ^ , k ^ = 1 3 z 1 i ^ T b k ^ x i ^ f x k ^ ( z T f ) 1 b k ^ z 1 i ^ T x k ^ f x i ^ ( z T f ) 1 = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f + k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f + k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = J 1 z + J 2 z + J 3 z + J 4 z .
By direct computations, it is not hard to observe that
J 1 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T x i ^ a k k T V x k f x k ^ ( z T ) 1 f = 0 ; J 2 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k T x i ^ a k k ^ T V x k f x k ^ ( z T ) 1 f = 0 ; J 4 z = k = 1 2 i ^ , k ^ , k = 1 3 a k k ^ T a k k T z 1 i ^ T x k ^ V x k f x i ^ ( z T ) 1 f = 0 .
The only non-zero term has the following form:
J 3 z = k = 1 2 i ^ , k ^ , k = 1 3 z 1 i ^ T a k k ^ T a k k T 2 V x i ^ x k f x k ^ ( z T ) 1 f = i ^ , k ^ , k = 1 3 z 1 i ^ T a 1 k ^ T a 1 k T 2 V x i ^ x k f x k ^ ( z T ) 1 f + z 1 i ^ T a 2 k ^ T a 2 k T 2 V x i ^ x k f x k ^ ( z T ) 1 f = i ^ , k = 1 3 z 1 i ^ T a 1 k T 2 V x i ^ x k ( a T ) 1 f ( z T ) 1 f + z 1 i ^ T a 2 k T 2 V x i ^ x k ( a T ) 2 f ( z T ) 1 f = ( 2 V x z + y 2 2 2 V z z ) ( a T ) 1 f ( z T ) 1 f + 2 V y z ( a T ) 2 f ( z T ) 1 f .
Since matrix z T is a constant matrix and matrix a T contains only variable y, it is easy to observe that
R π ( f , f ) = 0 .

4. Lyapunov Analysis in Sub-Riemannian Density Manifold

In this section, we illustrate the motivation of this paper, which is to design a matrix condition, whose smallest eigenvalue characterizes the convergence rate of the degenerate SDE.
The outline of this section is as follows. Consider a density space over the sub-Riemannian manifold. The finite-dimensional sub-Riemannian structure induces an infinite-dimensional sub-Riemannian structure on the density space. We name it the sub-Riemannian density manifold (SDM). We provide the geometric calculations in the SDM. We study the Fokker–Planck equation as the sub-Riemannian gradient flow in the SDM. We derive the equivalence relation between the second-order calculus of the relative entropy in the SDM and the generalized Gamma $z$ calculus.

4.1. Sub-Riemannian Density Manifold

Given a finite-dimensional sub-Riemannian manifold ( R n + m , τ , g τ ) with g τ = ( a a T ) , consider the probability density space:
$$\mathcal{P}(\mathbb{R}^{n+m}) = \left\{ \rho(x) \in C^{\infty}(\mathbb{R}^{n+m}) : \int \rho(x)\, dx = 1, \ \rho(x) \geq 0 \right\}.$$
Consider the tangent space at ρ P ( R n + m ) :
$$T_{\rho} \mathcal{P}(\mathbb{R}^{n+m}) = \left\{ \sigma(x) \in C^{\infty}(\mathbb{R}^{n+m}) : \int \sigma(x)\, dx = 0 \right\}.$$
We introduce the sub-Riemannian structure in probability density space P ( R n + m ) .
Definition 3
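(sub-Riemannian Wasserstein metric tensor). The $L^{2}$ sub-Riemannian Wasserstein metric $g_{\rho}^{W_a} : T_{\rho} \mathcal{P}(\mathbb{R}^{n+m}) \times T_{\rho} \mathcal{P}(\mathbb{R}^{n+m}) \to \mathbb{R}$ is defined by
$$g_{\rho}^{W_a}(\sigma_1, \sigma_2) = \int \left( \sigma_1(x), (-\Delta_{\rho}^{a})^{\dagger} \sigma_2(x) \right) dx.$$
Here, $\sigma_1, \sigma_2 \in T_{\rho} \mathcal{P}(\mathbb{R}^{n+m})$, $(\cdot, \cdot)$ is the metric on $\mathbb{R}^{n+m}$, and $(-\Delta_{\rho}^{a})^{\dagger} : T_{\rho} \mathcal{P}(\mathbb{R}^{n+m}) \to T_{\rho} \mathcal{P}(\mathbb{R}^{n+m})$ is the pseudo-inverse of the sub-elliptic operator $-\Delta_{\rho}^{a}$, where
$$\Delta_{\rho}^{a} = \nabla \cdot (\rho\, a a^{T} \nabla).$$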
For some special choices of a as studied in [19] or a a T forming a positive definite matrix, then Δ ρ a is an elliptic operator. In this case, ( P ( R n + m ) , g W a ) still forms a Riemannian density manifold. In general, given a sub-Riemannian manifold ( R n + m , ( a a T ) ) , Δ ρ a is only a sub-elliptic operator. Thus, ( P ( R n + m ) , g W a ) forms an infinite-dimensional sub-Riemannian manifold.
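To make the degenerate case concrete, the following minimal Python sketch checks numerically that $a a^T$ is rank-deficient for one specific choice of $a$. The choice is an assumption made here for illustration: the standard Heisenberg-type horizontal vector fields $X = (1, 0, -y/2)$ and $Y = (0, 1, x/2)$ on $\mathbb{R}^3$ (so $n = 2$, $m = 1$). The function names are hypothetical.

```python
def heisenberg_a(x, y):
    """Return a(x, y) as a 3x2 matrix (list of rows).

    Assumed example: columns are X = (1, 0, -y/2) and Y = (0, 1, x/2).
    """
    return [[1.0, 0.0],
            [0.0, 1.0],
            [-y / 2.0, x / 2.0]]


def gram(a):
    """Return the square matrix a a^T."""
    n_rows, n_cols = len(a), len(a[0])
    return [[sum(a[i][k] * a[j][k] for k in range(n_cols))
             for j in range(n_rows)]
            for i in range(n_rows)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


g = gram(heisenberg_a(1.5, -2.0))
# a a^T is 3x3 but has rank 2: its determinant vanishes (up to rounding),
# while the upper-left 2x2 block is the identity.
print(det3(g), g[0][0] * g[1][1] - g[0][1] * g[1][0])
```

In this example the operator $\Delta_\rho^a$ is sub-elliptic but not elliptic, which is the situation the sub-Riemannian density manifold is designed to handle.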
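We next present the sub-Riemannian calculus in $(\mathcal{P}(\mathbb{R}^{n+m}), g^{W_a})$, including both geodesics and the Hessian operator in the tangent bundle. Consider an identification map:
$$ V : C^{\infty}(\mathbb{R}^{n+m}) \to T_\rho \mathcal{P}(\mathbb{R}^{n+m}), \qquad V_\Phi = -\Delta_\rho^a \Phi = -\nabla \cdot (\rho\, a a^T \nabla \Phi). $$
Here, $\Phi \in T_\rho^{*} \mathcal{P}(\mathbb{R}^{n+m}) = C^{\infty}(\mathbb{R}^{n+m}) / \sim$. This $T_\rho^{*} \mathcal{P}(\mathbb{R}^{n+m})$ is the cotangent space in the SDM, and $\sim$ represents a constant shift relation. Thus,
$$ g_\rho^{W_a}(V_{\Phi_1}, V_{\Phi_2}) = \int \Gamma_1(\Phi_1, \Phi_2)\, \rho(x)\,dx. $$
In other words,
$$ \begin{aligned} g_\rho^{W_a}(V_{\Phi_1}, V_{\Phi_2}) &= \int V_{\Phi_1} (-\Delta_\rho^a)^{\dagger} V_{\Phi_2}\,dx = \int \Phi_1 (-\Delta_\rho^a)(-\Delta_\rho^a)^{\dagger}(-\Delta_\rho^a) \Phi_2\,dx \\ &= \int (\Phi_1, -\Delta_\rho^a \Phi_2)\,dx = \int \Phi_1 \big(-\nabla \cdot (\rho\, a a^T \nabla \Phi_2)\big)\,dx = \int (a^T \nabla \Phi_1, a^T \nabla \Phi_2)\, \rho\,dx, \end{aligned} $$
where the second equality holds by $(-\Delta_\rho^a)(-\Delta_\rho^a)^{\dagger}(-\Delta_\rho^a) = -\Delta_\rho^a$ and the last equality holds by integration by parts.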
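The integration-by-parts step in the last equality has an exact discrete analogue, which can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-dimensional periodic grid, a scalar coefficient `w` standing in for $a a^T$ at cell midpoints, and hypothetical function names.

```python
import math


def metric_two_ways(phi1, phi2, rho, w, h):
    """Evaluate both sides of the identity on a periodic 1D grid.

    phi1, phi2, rho hold node values; w holds a(x)^2 at midpoints i + 1/2.
    Returns (lhs, rhs), where
      lhs approximates  int phi1 * ( -d/dx( rho a^2 dphi2/dx ) ) dx,
      rhs approximates  int a^2 * phi1' * phi2' * rho dx.
    """
    n = len(rho)
    # Flux rho * a^2 * dphi2/dx at the midpoint between nodes i and i + 1.
    flux = []
    for i in range(n):
        j = (i + 1) % n
        rho_mid = 0.5 * (rho[i] + rho[j])
        flux.append(rho_mid * w[i] * (phi2[j] - phi2[i]) / h)
    # flux[i - 1] wraps around for i = 0, which encodes periodicity.
    lhs = sum(phi1[i] * (-(flux[i] - flux[i - 1]) / h) * h for i in range(n))
    rhs = sum(flux[i] * (phi1[(i + 1) % n] - phi1[i]) / h * h for i in range(n))
    return lhs, rhs


n = 50
h = 2.0 * math.pi / n
xs = [i * h for i in range(n)]
phi1 = [math.sin(x) for x in xs]
phi2 = [math.cos(2.0 * x) for x in xs]
rho = [1.0 + 0.3 * math.cos(x) for x in xs]
w = [1.0 + 0.5 * math.sin(x + 0.5 * h) for x in xs]
lhs, rhs = metric_two_ways(phi1, phi2, rho, w, h)
print(abs(lhs - rhs))
```

The two sums agree up to floating-point rounding because the discrete flux form satisfies summation by parts exactly, mirroring the continuous identity.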
We next derive several basic geometric calculations in the SDM.
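Proposition 8
(Geodesics in the SDM). The sub-Riemannian geodesic flow in the cotangent bundle satisfies
$$ \partial_t \rho_t + \nabla \cdot (\rho_t\, a a^T \nabla \Phi_t) = 0, \qquad \partial_t \Phi_t + \frac{1}{2} (a^T \nabla \Phi_t, a^T \nabla \Phi_t) = 0. \tag{29} $$
Proof. 
We consider the Lagrangian formulation of geodesics in density space. Here, the geometric action functional to be minimized is
$$ \mathcal{L}(\rho_t, \partial_t \rho_t) = \int_0^1 \int \frac{1}{2} \big( \partial_t \rho_t, (-\Delta_{\rho_t}^a)^{\dagger} \partial_t \rho_t \big)\,dx\,dt, $$
where $\rho_t = \rho(t, x)$ is a density path with fixed boundary points $\rho_0$, $\rho_1$. Then, the Euler–Lagrange equation in density space reads
$$ \partial_t\, \delta_{\partial_t \rho_t} \mathcal{L}(\rho_t, \partial_t \rho_t) = \delta_{\rho_t} \mathcal{L}(\rho_t, \partial_t \rho_t), $$
where $\delta_{\partial_t \rho_t}$ is the $L^2$ first variation with respect to $\partial_t \rho_t$ and $\delta_{\rho_t}$ is the $L^2$ first variation with respect to $\rho_t$. Here,
$$ \partial_t \big( (-\Delta_{\rho_t}^a)^{\dagger} \partial_t \rho_t \big) = \delta_\rho \int \frac{1}{2} \big( \partial_t \rho_t, (-\Delta_{\rho_t}^a)^{\dagger} \partial_t \rho_t \big)\,dx = -\frac{1}{2} \big( a^T \nabla (-\Delta_{\rho_t}^a)^{\dagger} \partial_t \rho_t,\ a^T \nabla (-\Delta_{\rho_t}^a)^{\dagger} \partial_t \rho_t \big), $$
where the last equality uses the following fact:
$$ \partial_t (-\Delta_{\rho_t}^a)^{\dagger} = -(-\Delta_{\rho_t}^a)^{\dagger} \cdot (-\Delta_{\partial_t \rho_t}^a) \cdot (-\Delta_{\rho_t}^a)^{\dagger}. $$
Denote $\partial_t \rho_t = -\Delta_{\rho_t}^a \Phi_t$; then the Euler–Lagrange Equation (30) yields the sub-Riemannian geodesic flow (29). In other words,
$$ \partial_t \Phi + \frac{1}{2} (a^T \nabla \Phi, a^T \nabla \Phi) = 0. $$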
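Proposition 9
(Gradient and Hessian operators in the SDM). Given a functional $\mathcal{F} : \mathcal{P}(\mathbb{R}^{n+m}) \to \mathbb{R}$, the gradient operator of $\mathcal{F}$ in $(\mathcal{P}, g^{W_a})$ satisfies
$$ \mathrm{grad}_{W_a} \mathcal{F}(\rho) = -\nabla \cdot (\rho\, a a^T \nabla \delta \mathcal{F}(\rho)). $$
The Hessian operator of $\mathcal{F}$ in $(\mathcal{P}, g^{W_a})$ satisfies
$$ \mathrm{Hess}_{W_a} \mathcal{F}(V_\Phi, V_\Phi) = \int\!\!\int (a(y)^T \nabla_y)(a(x)^T \nabla_x)\, \delta^2 \mathcal{F}(\rho)(x, y) \big( a(x)^T \nabla_x \Phi(x),\ a(y)^T \nabla_y \Phi(y) \big) \rho(x) \rho(y)\,dx\,dy + \int \mathrm{Hess}_a \delta \mathcal{F}(\rho)(\Phi, \Phi)\, \rho\,dx, $$
where
$$ \mathrm{Hess}_a \delta \mathcal{F}(\rho)(\Phi, \Phi) = \frac{1}{2} \Big[ 2 \Gamma_1(\Gamma_1(\delta \mathcal{F}, \Phi), \Phi) - \Gamma_1(\Gamma_1(\Phi, \Phi), \delta \mathcal{F}) \Big]. $$
Proof. 
We first derive the sub-Riemannian gradient operator. We recall the identification map $V_\Phi = -\Delta_\rho^a \Phi = -\nabla \cdot (\rho\, a a^T \nabla \Phi)$. Hence, the gradient operator in the SDM satisfies
$$ \mathrm{grad}_{W_a} \mathcal{F}(\rho) = \big( (-\Delta_\rho^a)^{\dagger} \big)^{\dagger} \frac{\delta}{\delta \rho(x)} \mathcal{F}(\rho) = -\Delta_\rho^a \frac{\delta}{\delta \rho(x)} \mathcal{F}(\rho) = -\nabla \cdot \Big( \rho\, a a^T \nabla \frac{\delta}{\delta \rho(x)} \mathcal{F}(\rho) \Big). $$
The Hessian operator in the SDM satisfies
$$ \mathrm{Hess}_{W_a} \mathcal{F}(\rho)(V_\Phi, V_\Phi) = \frac{d^2}{dt^2} \mathcal{F}(\rho_t) \Big|_{t=0}, $$
where $(\rho_t, \Phi_t)$ satisfies the geodesic Equation (29) with $\rho_0 = \rho$, $\Phi_0 = \Phi$. Notice the fact that
$$ \frac{d}{dt} \mathcal{F}(\rho_t) \Big|_{t=0} = \int \partial_t \rho_t\, \delta \mathcal{F}(\rho)\,dx \Big|_{t=0} = \int \big( -\nabla \cdot (\rho\, a a^T \nabla \Phi) \big) \delta \mathcal{F}(\rho)\,dx = \int (a^T \nabla \delta \mathcal{F}(\rho), a^T \nabla \Phi)\, \rho\,dx. $$
In addition,
$$ \begin{aligned} \frac{d^2}{dt^2} \mathcal{F}(\rho_t) \Big|_{t=0} &= \frac{d}{dt} \int (a^T \nabla \delta \mathcal{F}(\rho_t), a^T \nabla \Phi_t)\, \rho_t\,dx \Big|_{t=0} \\ &= \Big[ \int\!\!\int \delta^2 \mathcal{F}(\rho)(x, y)\, \partial_t \rho_t(x)\, \partial_t \rho_t(y)\,dx\,dy + \int (a^T \nabla \delta \mathcal{F}(\rho_t), a^T \nabla \partial_t \Phi_t)\, \rho_t\,dx + \int (a^T \nabla \delta \mathcal{F}(\rho_t), a^T \nabla \Phi_t)\, \partial_t \rho_t\,dx \Big] \Big|_{t=0} \\ &= \int\!\!\int \delta^2 \mathcal{F}(\rho)(x, y)\, \nabla \cdot (\rho\, a a^T \nabla \Phi)(x)\, \nabla \cdot (\rho\, a a^T \nabla \Phi)(y)\,dx\,dy - \frac{1}{2} \int (a^T \nabla \delta \mathcal{F}(\rho), a^T \nabla \Gamma_1(\Phi, \Phi))\, \rho\,dx + \int \Gamma_1(\Phi, \delta \mathcal{F}(\rho)) \big( -\nabla \cdot (\rho\, a a^T \nabla \Phi) \big)\,dx \\ &= \int\!\!\int (a(y)^T \nabla_y)(a(x)^T \nabla_x)\, \delta^2 \mathcal{F}(\rho)(x, y) \big( a(x)^T \nabla_x \Phi(x),\ a(y)^T \nabla_y \Phi(y) \big) \rho(x) \rho(y)\,dx\,dy + \int \frac{1}{2} \Big[ 2 \Gamma_1(\Gamma_1(\delta \mathcal{F}, \Phi), \Phi) - \Gamma_1(\Gamma_1(\Phi, \Phi), \delta \mathcal{F}) \Big] \rho\,dx, \end{aligned} $$
where the last equality holds by the integration by parts formula. □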
We next show the equivalence relation between the Hessian of the relative entropy in the SDM and the classical Gamma two operator. We first demonstrate the relation among $L^{*}$, $\Delta_a$, and the gradient operator of the entropy. In particular, we show that the Fokker–Planck equation is a sub-Riemannian gradient flow in the SDM. Denote the KL divergence as
$$ D(\rho) = D_{\mathrm{KL}}(\rho \,\|\, \pi) = \int \rho(x) \log \frac{\rho(x)}{\pi(x)}\,dx. $$
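Proposition 10
(Gradient flow). The negative gradient operator in $(\mathcal{P}, g^{W_a})$ forms
$$ -\mathrm{grad}_{W_a} D(\rho) = L^{*} \rho = \nabla \cdot \Big( \rho\, a a^T \nabla \log \frac{\rho}{\pi} \Big). $$
In addition, the sub-Riemannian gradient flow of $D(\rho)$ in $(\mathcal{P}, g^{W_a})$ forms the Fokker–Planck equation:
$$ \partial_t \rho = \nabla \cdot \Big( \rho\, a a^T \nabla \log \frac{\rho}{\pi} \Big). $$
Proof. 
We first derive the sub-Riemannian gradient operator of the entropy and relative entropy. Notice that
$$ \frac{\delta}{\delta \rho(x)} D(\rho) = \log \rho(x) + 1 - \log \pi(x). $$
Thus,
$$ \mathrm{grad}_{W_a} D(\rho) = -\nabla \cdot (\rho\, a a^T \nabla \log \rho) + \nabla \cdot (\rho\, a a^T \nabla \log \pi), $$
where $\rho \nabla \log \rho = \rho \frac{\nabla \rho}{\rho} = \nabla \rho$. Following the gradient flow formulation:
$$ \frac{\partial \rho_t}{\partial t} = -\mathrm{grad}_{W_a} D(\rho_t) = L^{*} \rho_t, $$
we finish the derivation of (33). □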
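The gradient-flow structure implies that the KL divergence decreases along the Fokker–Planck equation. The following Python sketch illustrates this in a deliberately simple setting that is assumed here and is not taken from the paper: a one-dimensional periodic domain, a scalar coefficient `w` standing in for $a a^T$, an explicit Euler step, and hypothetical choices of $\pi$ and $\rho_0$.

```python
import math


def simulate_kl(n=64, dt=1e-3, steps=500):
    """Explicit Euler for d(rho)/dt = d/dx( rho * a^2 * d/dx log(rho/pi) ).

    Returns the list of discrete KL values (one per time level) and the
    final total mass. All densities live on a periodic grid over [0, 2*pi).
    """
    h = 2.0 * math.pi / n
    xs = [i * h for i in range(n)]
    # Assumed target density pi ~ exp(-cos x) and initial density rho ~ exp(sin x).
    pi_raw = [math.exp(-math.cos(x)) for x in xs]
    rho_raw = [math.exp(math.sin(x)) for x in xs]
    pi = [p / (sum(pi_raw) * h) for p in pi_raw]
    rho = [r / (sum(rho_raw) * h) for r in rho_raw]
    # Assumed positive coefficient a(x)^2, sampled at midpoints i + 1/2.
    w = [1.0 + 0.5 * math.cos(x + 0.5 * h) for x in xs]
    kls = []
    for _ in range(steps + 1):
        kls.append(sum(r * math.log(r / p) for r, p in zip(rho, pi)) * h)
        phi = [math.log(r / p) for r, p in zip(rho, pi)]
        # Midpoint flux rho * a^2 * d(phi)/dx in conservative form.
        flux = [0.5 * (rho[i] + rho[(i + 1) % n]) * w[i]
                * (phi[(i + 1) % n] - phi[i]) / h for i in range(n)]
        # flux[i - 1] wraps around at i = 0 (periodic boundary).
        rho = [rho[i] + dt * (flux[i] - flux[i - 1]) / h for i in range(n)]
    return kls, sum(rho) * h


kls, mass = simulate_kl()
print(kls[0], kls[-1], mass)
```

The conservative flux form keeps the total mass fixed, and for a sufficiently small time step the recorded KL values are non-increasing, consistent with the gradient-flow interpretation. A degenerate $a$ in higher dimensions would need a genuinely multi-dimensional discretization, which this sketch does not attempt.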
We next demonstrate that the Hessian of the relative entropy (KL divergence) is equivalent to the classical Bakry–Émery calculus.
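Proposition 11
(Hessian of entropy and Bakry–Émery calculus). Given $\Phi \in C^{\infty}(\mathbb{R}^{n+m})$, then
$$ \mathrm{Hess}_{W_a} D(\rho)(V_\Phi, V_\Phi) = \int \Gamma_2(\Phi, \Phi)\, \rho(x)\,dx. $$
Proof. 
We first derive the Hessian of $D(\rho)$ in the SDM. Notice the fact that $\delta^2 D(\rho)(x, y) = \frac{1}{\rho} \delta_{x=y}$. For simplicity, we denote $\delta^2 D(\rho) = \frac{1}{\rho(x)}$. By using (31), we have
$$ \mathrm{Hess}_{W_a} D(\rho)(V_\Phi, V_\Phi) = \underbrace{\int \delta^2 D(\rho)(x) \big( \nabla \cdot (\rho\, a a^T \nabla \Phi) \big)^2\,dx}_{(a)} \underbrace{- \frac{1}{2} \int (a^T \nabla \delta D(\rho), a^T \nabla \Gamma_1(\Phi, \Phi))\, \rho\,dx}_{(b)} + \underbrace{\int \Gamma_1(\Phi, \delta D(\rho)) \big( -\nabla \cdot (\rho\, a a^T \nabla \Phi) \big)\,dx}_{(c)}. \tag{34} $$
We next rewrite (34) into the iterative Gamma calculus. We first show that
$$ \begin{aligned} (a) + (c) &= \int \Big\{ \delta^2 D(\rho)\, \nabla \cdot (\rho\, a a^T \nabla \Phi) - \Gamma_1(\Phi, \delta D(\rho)) \Big\} \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx \\ &= \int \Big\{ \frac{1}{\rho} \nabla \cdot (\rho\, a a^T \nabla \Phi) - \Big( a a^T \nabla \log \frac{\rho}{\pi}, \nabla \Phi \Big) \Big\} \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx \\ &= \int \Big\{ \frac{1}{\rho} (\nabla \rho, a a^T \nabla \Phi) + \nabla \cdot (a a^T \nabla \Phi) - \Big( a a^T \nabla \log \frac{\rho}{\pi}, \nabla \Phi \Big) \Big\} \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx \\ &= \int \Big\{ (\nabla \log \rho, a a^T \nabla \Phi) + \nabla \cdot (a a^T \nabla \Phi) - (\nabla \log \rho, a a^T \nabla \Phi) + (a a^T \nabla \log \pi, \nabla \Phi) \Big\} \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx \\ &= \int \Big\{ \nabla \cdot (a a^T \nabla \Phi) + (a a^T \nabla \log \pi, \nabla \Phi) \Big\} \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx \\ &= \int L \Phi\, \nabla \cdot (\rho\, a a^T \nabla \Phi)\,dx = -\int \Gamma_1(L \Phi, \Phi)\, \rho\,dx, \end{aligned} $$
where the fourth equality uses the fact that $\frac{\nabla \rho}{\rho} = \nabla \log \rho$, while the last equality follows from integration by parts.
We secondly show that
$$ (b) = -\frac{1}{2} \int (a^T \nabla \delta D(\rho), a^T \nabla \Gamma_1(\Phi, \Phi))\, \rho\,dx = \frac{1}{2} \int \Gamma_1(\Phi, \Phi)\, \nabla \cdot (\rho\, a a^T \nabla \delta D(\rho))\,dx = \frac{1}{2} \int \Gamma_1(\Phi, \Phi)\, L^{*} \rho\,dx = \frac{1}{2} \int L \Gamma_1(\Phi, \Phi)\, \rho\,dx, $$
where the second equality applies the fact that $L^{*} \rho = \nabla \cdot (\rho\, a a^T \nabla \delta D)$, while the last equality uses the dual relation between the Kolmogorov operators $L$ and $L^{*}$ in $L^2(\rho)$, i.e.,
$$ \int f(x)\, L^{*} \rho(t, x)\,dx = \int L f(x)\, \rho(t, x)\,dx, \quad \text{for any } f \in C_c^{\infty}(\mathbb{R}^{n+m}). $$
Combining the equalities for $(a)$, $(b)$, and $(c)$, we prove the result. □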
Remark 10.
We remark that the above formulations in terms of a a T hold for both Riemannian and sub-Riemannian density manifolds. Here, the major difference is whether matrix function a is full rank or degenerate. In this sense, all formulas derived in this subsection recover the classical Bakry–Émery calculus. However, the classical Hessian operator of the entropy is not enough to study the convergence behavior of degenerate diffusion processes. Briefly, we use a modified Lyapunov functional and derive a tensor for the gradient flow in the SDM. It provides the convergence rate of the degenerate diffusion process.

4.2. Gamma z Calculus via Second-Order Calculus of Relative Entropy in SDM

In this subsection, we introduce the motivation of our new Gamma z calculus from the SDM viewpoint. Consider the SDM gradient flow (33):
$$ \partial_t \rho_t = \Delta_{\rho_t}^a \delta D(\rho_t). $$
When a is a degenerate matrix, the classical relative Fisher information I a may not be the Lyapunov functional. In other words, along the gradient flow, it is possible that d d t I a ( ρ t ) 0 .
To handle this issue, we consider a new Lyapunov functional, obtained by adding a new direction $z$ to the relative Fisher information functional. Denote $\Delta_\rho^z = \nabla \cdot (\rho\, z z^T \nabla)$ and $I_z(\rho) = \int \big( \delta D, (-\Delta_\rho^z) \delta D \big)\,dx$. Construct
$$ I_{a,z}(\rho) := I_a(\rho) + I_z(\rho) = \int \big( \delta D, (-\Delta_\rho^a - \Delta_\rho^z) \delta D \big)\,dx. $$
We next prove the following proposition.
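Proposition 12.
$$ \frac{d}{dt} I_{a,z}(\rho_t) = -2 \int \Big( \Gamma_2(\delta D, \delta D) + \tilde{\Gamma}_2^{z}(\delta D, \delta D) \Big) \rho_t\,dx, $$
where
$$ \tilde{\Gamma}_2^{z}(\Phi, \Phi) := \frac{1}{2} L(\Gamma_1^z(\Phi, \Phi)) - \Gamma_1(L_z \Phi, \Phi), $$
with the notation $\Delta_z = \nabla \cdot (z z^T \nabla)$ and $L_z = \nabla \cdot (z z^T \nabla) + (\nabla \log \pi, z z^T \nabla)$.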
Proof. 
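For simplicity of notation, we denote $\rho = \rho_t$. Notice the fact that
$$ \frac{d}{dt} I_{a,z}(\rho) = \frac{d}{dt} I_a(\rho) + \frac{d}{dt} I_z(\rho). $$
From Proposition 11, we have
$$ \frac{d}{dt} I_a(\rho) = -2\, \mathrm{Hess}_{W_a} D(V_{\delta D}, V_{\delta D}) = -2 \int \Gamma_2(\delta D, \delta D)\, \rho\,dx. $$
We only need to show the following claim.
Claim:
$$ \frac{d}{dt} I_z(\rho) = -2 \int \tilde{\Gamma}_2^{z}(\delta D, \delta D)\, \rho\,dx. $$
Proof of Claim.
The proof is similar to the one in Proposition 11. We need to take care of the $z$ direction. Notice that
$$ \begin{aligned} \frac{d}{dt} I_z(\rho) &= 2 \int \big( \delta^2 D (-\Delta_\rho^z \delta D), \partial_t \rho \big)\,dx + \int (\nabla \delta D, z z^T \nabla \delta D)\, \partial_t \rho\,dx \\ &= 2 \int \big( \delta^2 D (-\Delta_\rho^z \delta D), \Delta_\rho^a \delta D \big)\,dx + \int (\nabla \delta D, z z^T \nabla \delta D) (\Delta_\rho^a \delta D)\,dx \\ &= \underbrace{-2 \int \frac{1}{\rho} \nabla \cdot (\rho\, a a^T \nabla \delta D)\, \nabla \cdot (\rho\, z z^T \nabla \delta D)\,dx}_{(I)} + \underbrace{\int (\nabla \delta D, z z^T \nabla \delta D)\, \nabla \cdot (\rho\, a a^T \nabla \delta D)\,dx}_{(II)}. \end{aligned} $$
We next estimate (I) and (II) separately. For (I), we notice the fact that
$$ \begin{aligned} \frac{1}{\rho} \nabla \cdot (\rho\, z z^T \nabla \delta D) &= (\nabla \log \rho, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D) \\ &= \Big( \nabla \log \frac{\rho}{\pi}, z z^T \nabla \delta D \Big) + (\nabla \log \pi, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D) \\ &= (\nabla \delta D, z z^T \nabla \delta D) + (\nabla \log \pi, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D). \end{aligned} $$
Thus,
$$ \begin{aligned} (I) &= -2 \int \frac{1}{\rho} \nabla \cdot (\rho\, a a^T \nabla \delta D)\, \nabla \cdot (\rho\, z z^T \nabla \delta D)\,dx \\ &= -2 \int \nabla \cdot (\rho\, a a^T \nabla \delta D) \Big( (\nabla \delta D, z z^T \nabla \delta D) + (\nabla \log \pi, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D) \Big)\,dx \\ &= -2 \int \Big[ (\nabla \delta D, z z^T \nabla \delta D)\, L^{*} \rho + \nabla \cdot (\rho\, a a^T \nabla \delta D) \Big( (\nabla \log \pi, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D) \Big) \Big]\,dx \\ &= -2 \int L (\nabla \delta D, z z^T \nabla \delta D)\, \rho\,dx + 2 \int \Big( \nabla \big[ (\nabla \log \pi, z z^T \nabla \delta D) + \nabla \cdot (z z^T \nabla \delta D) \big], a a^T \nabla \delta D \Big) \rho\,dx, \end{aligned} $$
where the last equality holds by integration by parts.
For (II), we have
$$ (II) = \int (\nabla \delta D, z z^T \nabla \delta D)\, \nabla \cdot (\rho\, a a^T \nabla \delta D)\,dx = \int (\nabla \delta D, z z^T \nabla \delta D)\, L^{*} \rho\,dx = \int L (\nabla \delta D, z z^T \nabla \delta D)\, \rho\,dx. $$
Combining (I) and (II), we obtain $\frac{d}{dt} I_z(\rho) = -2 \int \tilde{\Gamma}_2^{z}(\delta D, \delta D)\, \rho\,dx$ with
$$ \tilde{\Gamma}_2^{z}(\Phi, \Phi) = \frac{1}{2} L(\Gamma_1^z(\Phi, \Phi)) - \Gamma_1(\Delta_z \Phi, \Phi) - \Gamma_1(\Gamma_1^z(\log \pi, \Phi), \Phi). $$
Using the notation $L_z = \Delta_z + (\nabla \log \pi, z z^T \nabla)$, we finish the proof. □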
We next prove that $\tilde{\Gamma}_2^{z}$ and $\Gamma_2^{z,\pi}$ in Definition 1 agree with each other in the weak form along the gradient flow.
Proposition 13.
Denote $\Phi = \delta D(\rho)$, then
$$ \int \tilde{\Gamma}_2^{z}(\Phi, \Phi)\, \rho\,dx = \int \Gamma_2^{z,\pi}(\Phi, \Phi)\, \rho\,dx. $$
Proof. 
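To prove the proposition, we rewrite $\tilde{\Gamma}_2^{z}$ as follows.
$$ \tilde{\Gamma}_2^{z}(\Phi, \Phi) = \frac{1}{2} L(\Gamma_1^z(\Phi, \Phi)) - \Gamma_1(L_z \Phi, \Phi) = \frac{1}{2} L(\Gamma_1^z(\Phi, \Phi)) - \Gamma_1^z(L \Phi, \Phi) + \Gamma_1^z(L \Phi, \Phi) - \Gamma_1(L_z \Phi, \Phi). $$
Here, we need to prove the following equality.
Claim:
$$ \int \Big[ \Gamma_1^z(L \Phi, \Phi) - \Gamma_1(L_z \Phi, \Phi) \Big] \rho\,dx = \int \rho \Big[ \frac{1}{\pi} \nabla \cdot \big( z z^T \pi \langle \nabla \Phi, \nabla (a a^T) \nabla \Phi \rangle \big) - \frac{1}{\pi} \nabla \cdot \big( a a^T \pi \langle \nabla \Phi, \nabla (z z^T) \nabla \Phi \rangle \big) \Big]\,dx. $$
Proof of Claim.
For simplicity of notation, let
$$ L^{*} \rho = \nabla \cdot \Big( a a^T \pi \nabla \frac{\rho}{\pi} \Big) = \nabla \cdot \Big( \rho\, a a^T \nabla \log \frac{\rho}{\pi} \Big) $$
and
$$ L_z^{*} \rho = \nabla \cdot \Big( z z^T \pi \nabla \frac{\rho}{\pi} \Big) = \nabla \cdot \Big( \rho\, z z^T \nabla \log \frac{\rho}{\pi} \Big). $$
The following property is also used in the proof. For any smooth test function $f$ and $\Phi = \log \frac{\rho}{\pi}$,
$$ \int L_z^{*} \rho\, f\,dx = -\int \Gamma_1^z(f, \Phi)\, \rho\,dx, \qquad \int L^{*} \rho\, f\,dx = -\int \Gamma_1(f, \Phi)\, \rho\,dx. $$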
Notice that Φ = log ρ π , then
Γ 1 z ( L Φ , Φ ) ρ d x = ( ( · ( a a T Φ ) ( A , Φ ) ) , z z T Φ ) ρ d x = ( ( · ( a a T Φ ) ) , z z T Φ ) ρ d x ( ( A , Φ ) , z z T Φ ) ρ d x . ( a 1 ) ( a 2 )
Here,
( a 1 ) = ( · ( a a T Φ ) ) , z z T Φ ρ d x = · ( a a T Φ ) · ( ρ z z T Φ ) d x = · ( a a T log ρ π ) · ( ρ z z T log ρ π ) d x = · ( 1 ρ a a T π ρ π ) · ( ρ z z T log ρ π ) d x = ( 1 ρ , a a T π ρ π ) + 1 ρ · ( a a T π ρ π ) · ( ρ z z T log ρ π ) d x = ( 1 ρ , a a T π ρ π ) L z * ρ d x 1 ρ L * ρ L z * ρ d x = 1 ρ 2 ( ρ , a a T π ρ π ) L z * ρ d x 1 ρ L * ρ L z * ρ d x = ( log ρ , a a T log ρ π ) L z * ρ d x 1 ρ L * ρ L z * ρ d x = ( log ρ π , a a T log ρ π ) L z * ρ d x + ( log π , a a T log ρ π ) L z * ρ d x 1 ρ L * ρ L z * ρ d x = ( ( log ρ π , a a T log ρ π ) , z z T log ρ π ) ρ d x Γ 1 z ( ( log π , a a T log ρ π ) , log ρ π ) ρ d x 1 ρ L * ρ L z * ρ d x = ( log ρ π , ( a a T ) log ρ π , z z T ρ π ) π d x 2 2 log ρ π a a T log ρ π , z z T log ρ π ρ d x Γ 1 z ( ( log π , a a T log ρ π ) , log ρ π ) ρ d x 1 ρ L * ρ L z * ρ d x
= · ( z z T π ( log ρ π , ( a a T ) log ρ π ) 1 π ρ d x 2 2 log ρ π a a T log ρ π , z z T log ρ π ρ d x Γ 1 z ( ( log π , a a T log ρ π ) , log ρ π ) ρ d x 1 ρ L * ρ L z * ρ d x .
Notice the fact that
( a 2 ) = ( ( A , Φ ) , z z T Φ ) ρ d x = ( log π , a a T Φ ) , z z T Φ ρ d x = Γ 1 ( Γ 1 z ( Φ , Φ ) , Φ ) ρ d x .
Hence,
Γ 1 z ( L Φ , Φ ) ρ d x = ( a 1 ) + ( a 2 ) = · ( z z T π ( log ρ π , ( a a T ) log ρ π ) 1 π ρ d x 2 2 log ρ π a a T log ρ π , z z T log ρ π ρ d x 1 ρ L * ρ L z * ρ d x .
Similarly, by switching a and z, we have
Γ 1 ( L z Φ , Φ ) ρ d x = · ( a a T π ( log ρ π , ( z z T ) log ρ π ) 1 π ρ d x 2 2 log ρ π a a T log ρ π , z z T log ρ π ρ d x 1 ρ L * ρ L z * ρ d x .
Combining the above derivation, we finish the proof. □
Remark 11.
From the proof, we can show the following identity: denote $\Phi = \delta D$, then
$$ \int \Big[ \Gamma_1^z(L \Phi, \Phi) - \Gamma_1(L_z \Phi, \Phi) \Big] \rho\,dx = \int \Big[ \Gamma_1(\Gamma_1^z(\Phi, \Phi), \Phi) - \Gamma_1^z(\Gamma_1(\Phi, \Phi), \Phi) \Big] \rho\,dx. $$
Therefore, if the commutative assumption $\Gamma_1(\Gamma_1^z(\Phi, \Phi), \Phi) = \Gamma_1^z(\Gamma_1(\Phi, \Phi), \Phi)$ holds, the above quantity equals zero. In this case,
$$ \int \Gamma_2^{z,\pi}(\Phi, \Phi)\, \rho\,dx = \int \Gamma_2^{z}(\Phi, \Phi)\, \rho\,dx. $$
This means that, under the commutative assumption, the generalized Gamma z calculus agrees with the classical one [2] in the weak sense.
With the generalized Gamma z calculus, we are ready to prove the convergence properties and functional inequalities for degenerate drift–diffusion processes.
Proposition 14.
Suppose $\Gamma_2 + \Gamma_2^{z,\pi} \succeq \kappa (\Gamma_1 + \Gamma_1^z)$ with $\kappa > 0$. Let $\rho_t$ denote the solution of the sub-Riemannian gradient flow (33), then
$$ \frac{d}{dt} \big( I_a(\rho_t) + I_z(\rho_t) \big) \le -2 \kappa \big( I_a(\rho_t) + I_z(\rho_t) \big). $$
In addition, the z-log-Sobolev inequality holds:
$$ \int_{\mathbb{R}^{n+m}} \rho \log \frac{\rho}{\pi}\,dx \le \frac{1}{2 \kappa} I_{a,z}(\rho), $$
for any smooth density function $\rho$.
Finally,
$$ \int_{\mathbb{R}^{n+m}} | \rho(t, x) - \pi(x) |\,dx \le \sqrt{2 D(\rho_0)}\, e^{-\kappa t}. $$
Proof. 
Here, the proof is very similar to the one in the previous section. Again, consider the sub-Riemannian gradient flow in the SDM:
$$ \partial_t \rho_t = -\mathrm{grad}_{W_a} D(\rho_t). $$
We know that the log-Sobolev inequality relates to the ratio of $\frac{d}{dt} D(\rho_t)$ and $\frac{d^2}{dt^2} D(\rho_t)$. If we cannot find a ratio $\kappa > 0$ such that
$$ \frac{d}{dt} I_a(\rho_t) \le -2 \kappa I_a(\rho_t), $$
we construct another Lyapunov functional:
$$ I_{a,z}(\rho) = I_a(\rho) + I_z(\rho). $$
Thus, along the SDM gradient flow (33), we have
$$ \frac{d}{dt} I_{a,z}(\rho_t) = -2 \int \Big[ \Gamma_2(\delta D, \delta D) + \Gamma_2^{z,\pi}(\delta D, \delta D) \Big] \rho_t\,dx. $$
If $\Gamma_2 + \Gamma_2^{z,\pi} \succeq \kappa (\Gamma_1 + \Gamma_1^z)$, then
$$ \frac{d}{dt} I_{a,z}(\rho_t) \le -2 \kappa I_{a,z}(\rho_t). $$
The convergence result follows directly from Gronwall's inequality.
We next prove the z-log-Sobolev inequality. Since
$$ \frac{d}{dt} D(\rho_t) = -I_a(\rho_t) \ge -I_{a,z}(\rho_t), $$
(36) implies that, denoting $\rho_0 = \rho$,
$$ I_{a,z}(\rho) = -\int_0^{\infty} \frac{d}{dt} I_{a,z}(\rho_t)\,dt \ge 2 \kappa \int_0^{\infty} I_{a,z}(\rho_t)\,dt = 2 \kappa \int_0^{\infty} \big( I_a(\rho_t) + I_z(\rho_t) \big)\,dt \ge 2 \kappa \int_0^{\infty} I_a(\rho_t)\,dt = 2 \kappa \int_0^{\infty} \Big( -\frac{d}{dt} D(\rho_t) \Big)\,dt = 2 \kappa D(\rho). $$
Thus, $I_{a,z}(\rho) \ge 2 \kappa D(\rho)$. Hence, we prove all the results by the fact that $\mathfrak{R} \succeq \kappa (\Gamma_1 + \Gamma_1^z)$ implies $\Gamma_2 + \Gamma_2^{z,\pi} \succeq \kappa (\Gamma_1 + \Gamma_1^z)$. In other words, the generalized Gamma z calculus implies the z-log-Sobolev inequality (zLSI):
$$ \mathfrak{R} \succeq \kappa (\Gamma_1 + \Gamma_1^z) \ \Rightarrow\ \frac{d}{dt} I_{a,z}(\rho_t) \le -2 \kappa I_{a,z}(\rho_t) \ \Rightarrow\ \text{zLSI}. $$
We last prove the exponential convergence in the $L^1$ distance. Notice that
$$ D_{\mathrm{KL}}(\rho_t \,\|\, \pi) \le \frac{1}{2 \kappa} I_{a,z}(\rho_t \,\|\, \pi) \le \frac{1}{2 \kappa} e^{-2 \kappa t} I_{a,z}(\rho_0 \,\|\, \pi). $$
We apply an inequality between the KL divergence and the $L^1$ distance. In other words,
$$ \int_{\mathbb{R}^{n+m}} | \rho(t, x) - \pi(x) |\,dx \le \sqrt{2 D_{\mathrm{KL}}(\rho_t \,\|\, \pi)}. $$
This finishes the proof. □
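Two elementary ingredients of this proof can be checked numerically: the Gronwall-type decay bound and the inequality between the $L^1$ distance and the KL divergence (Pinsker's inequality). The Python sketch below is illustrative only. The toy ODE, the discrete distributions, and all function names are assumptions introduced here, not objects from the paper.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence sum_i p_i log(p_i / q_i), assuming q_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def l1_distance(p, q):
    """Discrete L1 distance sum_i |p_i - q_i|."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def euler_decay(i0, kappa, extra, dt, steps):
    """Explicit Euler for the toy ODE I' = -(2*kappa + extra(t)) * I.

    With extra(t) >= 0 this ODE satisfies I' <= -2*kappa*I, which is the
    differential inequality used in the proof.
    """
    values = [i0]
    for k in range(steps):
        rate = 2.0 * kappa + extra(k * dt)
        values.append(values[-1] * (1.0 - dt * rate))
    return values


# Gronwall-type bound: I(t) <= exp(-2*kappa*t) * I(0) at every time level.
kappa, dt = 0.5, 0.01
vals = euler_decay(3.0, kappa, lambda t: 1.0 + math.sin(t) ** 2, dt, 400)
print(all(v <= 3.0 * math.exp(-2.0 * kappa * k * dt) + 1e-12
          for k, v in enumerate(vals)))

# Pinsker's inequality: ||p - q||_1 <= sqrt(2 * KL(p || q)).
p, q = [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]
print(l1_distance(p, q) <= math.sqrt(2.0 * kl_divergence(p, q)))
```

Chaining the two bounds is what turns exponential decay of the Lyapunov functional into exponential convergence in the $L^1$ distance.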
Remark 12.
It is worth mentioning that our derivation of the Gamma z calculus is not a direct Hessian operator of the entropy in the SDM. In fact, it combines both the second-order calculus in the SDM and the property of the L 2 Hessian operator of the entropy. See similar relations in the mean-field Bakry–Émery calculus [50].

5. Generalized Gamma z Calculus

In this section, we introduce the generalized Gamma z calculus. For any smooth functions $f, g : \mathbb{R}^{n+m} \to \mathbb{R}$, the diffusion operator associated with SDE (2) is denoted as
$$ L f = \Delta_a f - A \nabla f + b \nabla f, $$
where we denote $A = a \otimes \nabla a$ and
$$ \Delta_a f = \nabla \cdot (a a^T \nabla f). $$
When $b = 0$, we denote the diffusion operator as
$$ \tilde{L} f = \Delta_a f - a \otimes \nabla a\, \nabla f. $$
We first define the carré du champ operator $\Gamma_1$ associated with the above second-order diffusion operators. It is easy to check that $\Delta_a$, $\tilde{L}$, and $L$ share the same $\Gamma_1$:
$$ \Gamma_1(f, g) = \langle a^T \nabla f, a^T \nabla g \rangle_{\mathbb{R}^n}. $$
Similarly, we introduce the $\Gamma_1^z$ operator in the direction of $z = (z_1, \dots, z_m)$ below:
$$ \Gamma_1^z(f, g) = \langle z^T \nabla f, z^T \nabla g \rangle_{\mathbb{R}^m}. $$
Next, we define the iterative $\Gamma_2$ and $\Gamma_2^z$ for the operator $L$ (and for $\tilde{L}$, respectively) below:
$$ \Gamma_{2,L}(f, f) = \frac{1}{2} L \Gamma_1(f, f) - \Gamma_1(L f, f). $$
$$ \Gamma_{2,L}^{z}(f, f) = \frac{1}{2} L \Gamma_1^z(f, f) - \Gamma_1^z(L f, f). $$
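As a quick sanity check of these definitions, consider the simplest non-degenerate case. This setting is an assumption made here for illustration and is not one of the paper's examples: dimension one, $a \equiv 1$ (so that $A = 0$), and $b = -V'$ for a smooth potential $V$. Then $L f = f'' - V' f'$ and $\Gamma_1(f, f) = (f')^2$, and a direct computation recovers the classical Bakry–Émery expression:

```latex
\begin{aligned}
\Gamma_{2,L}(f,f)
  &= \tfrac{1}{2}\, L\big((f')^2\big) - f'\,(Lf)' \\
  &= \big[(f'')^2 + f' f''' - V' f' f''\big]
     - f'\,\big[f''' - V'' f' - V' f''\big] \\
  &= (f'')^2 + V''\,(f')^2 .
\end{aligned}
```

In this special case, $\Gamma_{2,L} \ge \kappa \Gamma_1$ holds whenever $V'' \ge \kappa$, which is the non-degenerate analogue of the condition $\Gamma_2 + \Gamma_2^{z,\pi} \succeq \kappa (\Gamma_1 + \Gamma_1^z)$.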
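Definition 4.
We define the generalized Gamma z for the operator $L$ below:
$$ \Gamma_{2,L}^{z,\pi}(f, f) = \Gamma_{2,L}^{z}(f, f) + \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) - \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big). $$
For matrices $a \in \mathbb{R}^{(n+m) \times n}$ and $z \in \mathbb{R}^{(n+m) \times m}$, we denote the divergence operator as
$$ \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) = \frac{\nabla \cdot \big( z z^T \pi\, \Gamma_{\nabla (a a^T)}(f, f) \big)}{\pi}, \qquad \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big) = \frac{\nabla \cdot \big( a a^T \pi\, \Gamma_{\nabla (z z^T)}(f, f) \big)}{\pi}, $$
and
$$ \Gamma_{\nabla (a a^T)}(f, f) = \langle \nabla f, \nabla (a a^T) \nabla f \rangle, \quad \text{and} \quad \Gamma_{\nabla (z z^T)}(f, f) = \langle \nabla f, \nabla (z z^T) \nabla f \rangle. $$
Here, we denote $\pi$ as the invariant distribution associated with the operator $L$.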
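Remark 13.
In particular, we have the following local coordinates representation.
$$ \begin{aligned} \langle \nabla f, \nabla (a a^T) \nabla f \rangle &= \Big( \Big\langle \nabla f, \frac{\partial}{\partial x_{\hat{k}}} (a a^T) \nabla f \Big\rangle \Big)_{\hat{k}=1}^{n+m} = \Big( 2 \Big\langle a^T \nabla f, \frac{\partial}{\partial x_{\hat{k}}} a^T \nabla f \Big\rangle \Big)_{\hat{k}=1}^{n+m} \\ &= \Big( 2 \sum_{i=1}^{n} \sum_{\hat{i}, i'=1}^{n+m} \frac{\partial}{\partial x_{\hat{k}}} a^T_{i \hat{i}} \frac{\partial f}{\partial x_{\hat{i}}}\, a^T_{i i'} \frac{\partial f}{\partial x_{i'}} \Big)_{\hat{k}=1}^{n+m}, \\ \langle \nabla f, \nabla (z z^T) \nabla f \rangle &= \Big( 2 \sum_{j=1}^{m} \sum_{\hat{j}, j'=1}^{n+m} \frac{\partial}{\partial x_{\hat{k}}} z^T_{j \hat{j}} \frac{\partial f}{\partial x_{\hat{j}}}\, z^T_{j j'} \frac{\partial f}{\partial x_{j'}} \Big)_{\hat{k}=1}^{n+m}. \end{aligned} $$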
We first present the following key lemmas.
Lemma 10.
$$ \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) - \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big) = \mathfrak{R}_{\pi}(f, f) + 2 G^T X, $$
where $X$, $G$ are defined in Notation 1 and $\mathfrak{R}_{\pi}$ is defined in Definition 2.
Lemma 11.
$$ \Gamma_{2,\tilde{L}}(f, f) = X^T Q^T Q X + 2 D^T Q X + 2 C^T X + D^T D + \mathfrak{R}_a(f, f), $$
where $Q$, $X$, $C$, $D$ are introduced in Notation 1 and $\mathfrak{R}_a$ is defined in Definition 2.
Lemma 12.
$$ \Gamma_{2,\tilde{L}}^{z}(f, f) = X^T P^T P X + 2 E^T P X + 2 F^T X + E^T E + \mathfrak{R}_z(f, f), $$
where $P$, $X$, $F$, $E$ are introduced in Notation 1 and $\mathfrak{R}_z$ is defined in Definition 2.
We then have the following main theorem. In order to distinguish the operators $L$ and $\tilde{L}$, we restate Theorem 1 below and, with some abuse of notation, denote $\Gamma_2(f, f) = \Gamma_{2,L}(f, f)$ and $\Gamma_2^{z,\pi}(f, f) = \Gamma_{2,L}^{z,\pi}(f, f)$.
Theorem 3
(z-Bochner's formula). For a smooth function $f : \mathbb{R}^{n+m} \to \mathbb{R}$, assume that Assumption 1 holds, then
$$ \Gamma_{2,L}(f, f) + \Gamma_{2,L}^{z,\pi}(f, f) = | \mathfrak{Hess}_{a,z} f |^2 + \mathfrak{R}(f, f), $$
where
$$ \begin{aligned} | \mathfrak{Hess}_{a,z} f |^2 &= [X + \Lambda_1]^T Q^T Q [X + \Lambda_1] + [X + \Lambda_2]^T P^T P [X + \Lambda_2], \\ \mathfrak{R}(f, f) &= -\Lambda_1^T Q^T Q \Lambda_1 - \Lambda_2^T P^T P \Lambda_2 + D^T D + E^T E + \mathfrak{R}_{ab}(f, f) + \mathfrak{R}_{zb}(f, f) + \mathfrak{R}_{\pi}(f, f). \end{aligned} $$
All the terms are defined in Notation 1 and Definition 2.
Proof. 
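By Definition 4 and Formulae (39) and (40), we have
$$ \Gamma_{2,L}(f, f) + \Gamma_{2,L}^{z,\pi}(f, f) = \Gamma_{2,L}(f, f) + \Gamma_{2,L}^{z}(f, f) + \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) - \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big). $$
We compute the above terms explicitly in the following four steps.
Step 1:
$$ \begin{aligned} \Gamma_{2,L}(f, f) &= \frac{1}{2} \big( L \Gamma_{1,L}(f, f) - 2 \Gamma_{1,L}(L f, f) \big) \\ &= \frac{1}{2} \Delta_a \Gamma_1(f, f) - \frac{1}{2} A \nabla \Gamma_1(f, f) + \frac{1}{2} b \nabla \Gamma_1(f, f) - \Gamma_1\big( (\Delta_a - A \nabla + b \nabla) f, f \big) \\ &= \Gamma_{2,\tilde{L}}(f, f) + \Big[ \frac{1}{2} b \nabla \Gamma_1(f, f) - \Gamma_1(b \nabla f, f) \Big]. \end{aligned} $$
The term $\Gamma_{2,\tilde{L}}(f, f)$ follows from Lemma 11. We are left with the other two terms:
$$ \frac{1}{2} b \nabla \Gamma_1(f, f) = \frac{1}{2} \sum_{\hat{k}=1}^{n+m} b_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} \big( \langle a^T \nabla f, a^T \nabla f \rangle_{\mathbb{R}^n} \big) = \sum_{\hat{k}, \hat{i}=1}^{n+m} \sum_{i=1}^{n} \Big( b_{\hat{k}} \frac{\partial a^T_{i \hat{i}}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} + b_{\hat{k}}\, a^T_{i \hat{i}} \frac{\partial^2 f}{\partial x_{\hat{k}} \partial x_{\hat{i}}} \Big) (a^T \nabla f)_i, $$
and
$$ \Gamma_1(b \nabla f, f) = \langle a^T \nabla (b \nabla f), a^T \nabla f \rangle_{\mathbb{R}^n} = \sum_{i=1}^{n} \sum_{\hat{k}, \hat{i}=1}^{n+m} \Big( a^T_{i \hat{i}} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} + a^T_{i \hat{i}}\, b_{\hat{k}} \frac{\partial^2 f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} \Big) (a^T \nabla f)_i. $$
Step 2:
$$ \begin{aligned} \Gamma_{2,L}^{z}(f, f) &= \frac{1}{2} \big( L \Gamma_1^z(f, f) - 2 \Gamma_1^z(L f, f) \big) \\ &= \frac{1}{2} \Delta_a \Gamma_1^z(f, f) - \frac{1}{2} A \nabla \Gamma_1^z(f, f) + \frac{1}{2} b \nabla \Gamma_1^z(f, f) - \Gamma_1^z\big( (\Delta_a - A \nabla + b \nabla) f, f \big) \\ &= \Gamma_{2,\tilde{L}}^{z}(f, f) + \Big[ \frac{1}{2} b \nabla \Gamma_1^z(f, f) - \Gamma_1^z(b \nabla f, f) \Big]. \end{aligned} $$
The term $\Gamma_{2,\tilde{L}}^{z}(f, f)$ follows from Lemma 12. We are left to compute the last two terms:
$$ \frac{1}{2} b \nabla \Gamma_1^z(f, f) = \frac{1}{2} \sum_{\hat{k}=1}^{n+m} b_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} \big( \langle z^T \nabla f, z^T \nabla f \rangle_{\mathbb{R}^m} \big) = \sum_{\hat{k}, \hat{i}=1}^{n+m} \sum_{i=1}^{m} \Big( b_{\hat{k}} \frac{\partial z^T_{i \hat{i}}}{\partial x_{\hat{k}}} \frac{\partial f}{\partial x_{\hat{i}}} + b_{\hat{k}}\, z^T_{i \hat{i}} \frac{\partial^2 f}{\partial x_{\hat{k}} \partial x_{\hat{i}}} \Big) (z^T \nabla f)_i, $$
and
$$ \Gamma_1^z(b \nabla f, f) = \langle z^T \nabla (b \nabla f), z^T \nabla f \rangle_{\mathbb{R}^m} = \sum_{i=1}^{m} \sum_{\hat{k}, \hat{i}=1}^{n+m} \Big( z^T_{i \hat{i}} \frac{\partial b_{\hat{k}}}{\partial x_{\hat{i}}} \frac{\partial f}{\partial x_{\hat{k}}} + z^T_{i \hat{i}}\, b_{\hat{k}} \frac{\partial^2 f}{\partial x_{\hat{i}} \partial x_{\hat{k}}} \Big) (z^T \nabla f)_i. $$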
Step 3: Following Lemma 10, which is proven in Section 5.1, we have
$$ \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) - \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big) = \mathfrak{R}_{\pi}(f, f) + 2 G^T X, $$
where $X$, $G$ are defined in Notation 1 and $\mathfrak{R}_{\pi}$ is defined in Definition 2.
Step 4: Combining the above terms $\Gamma_{2,\tilde{L}}(f, f)$ in Lemma 11, $\Gamma_{2,\tilde{L}}^{z}(f, f)$ in Lemma 12, and $\mathfrak{R}_{\pi}(f, f) + 2 G^T X$, we have
$$ \begin{aligned} &\Gamma_{2,\tilde{L}}(f, f) + \Gamma_{2,\tilde{L}}^{z}(f, f) + \mathfrak{R}_{\pi}(f, f) + 2 G^T X \\ &\quad = X^T P^T P X + 2 E^T P X + 2 F^T X + E^T E + X^T Q^T Q X + 2 D^T Q X + 2 C^T X + D^T D + 2 G^T X + \mathfrak{R}_a(f, f) + \mathfrak{R}_z(f, f) + \mathfrak{R}_{\pi}(f, f) \\ &\quad = X^T [P^T P + Q^T Q] X + 2 [G^T + F^T + C^T] X + 2 [E^T P + D^T Q] X + D^T D + E^T E + \mathfrak{R}_a(f, f) + \mathfrak{R}_z(f, f) + \mathfrak{R}_{\pi}(f, f). \end{aligned} $$
Assuming that Assumption 1 is satisfied, we obtain
$$ \begin{aligned} &\Gamma_{2,\tilde{L}}(f, f) + \Gamma_{2,\tilde{L}}^{z}(f, f) + \mathfrak{R}_{\pi}(f, f) + 2 G^T X \\ &\quad = [X + \Lambda_1]^T Q^T Q [X + \Lambda_1] + [X + \Lambda_2]^T P^T P [X + \Lambda_2] + \mathfrak{R}_a(f, f) + \mathfrak{R}_z(f, f) + \mathfrak{R}_{\pi}(f, f) - \Lambda_1^T Q^T Q \Lambda_1 - \Lambda_2^T P^T P \Lambda_2 + D^T D + E^T E. \end{aligned} $$
Adding the drift terms from Step 1 and Step 2, we obtain $\mathfrak{R}_{ab}$ and $\mathfrak{R}_{zb}$, which finishes the proof. □
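The passage from the expanded quadratic form in Step 4 to the two completed squares is a completing-the-square identity. The Python sketch below checks that identity on small integer matrices. It assumes, as a reading of Step 4 and not as a quotation of Assumption 1, that the total linear coefficient can be written as $2 (Q^T Q \Lambda_1 + P^T P \Lambda_2)^T X$. All names and numerical values are hypothetical.

```python
def matvec(m, v):
    """Matrix-vector product m v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose_matvec(m, v):
    """Product m^T v without forming the transpose."""
    return [sum(m[i][j] * v[i] for i in range(len(m))) for j in range(len(m[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sqnorm(m, v):
    """Quadratic form v^T m^T m v = |m v|^2."""
    w = matvec(m, v)
    return dot(w, w)


def both_sides(q, p, lam1, lam2, x):
    """Return (expanded form, completed-square form) of the quadratic in x."""
    # Assumed linear coefficient: Q^T Q lam1 + P^T P lam2.
    lin = [u + v for u, v in zip(transpose_matvec(q, matvec(q, lam1)),
                                 transpose_matvec(p, matvec(p, lam2)))]
    expanded = sqnorm(q, x) + sqnorm(p, x) + 2 * dot(lin, x)
    shifted1 = [a + b for a, b in zip(x, lam1)]
    shifted2 = [a + b for a, b in zip(x, lam2)]
    completed = (sqnorm(q, shifted1) + sqnorm(p, shifted2)
                 - sqnorm(q, lam1) - sqnorm(p, lam2))
    return expanded, completed


q = [[1, 2], [0, 1]]
p = [[2, 0], [1, 1]]
print(both_sides(q, p, [1, -1], [0, 2], [3, -2]))
```

The two returned values coincide, which is why the negative terms $-\Lambda_1^T Q^T Q \Lambda_1 - \Lambda_2^T P^T P \Lambda_2$ appear in $\mathfrak{R}(f, f)$ once the squares are completed.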

5.1. Proof of Lemma 10

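Lemma 13.
$$ \mathrm{div}_z^{\pi}\big( \Gamma_{\nabla (a a^T)} f, f \big) - \mathrm{div}_a^{\pi}\big( \Gamma_{\nabla (z z^T)} f, f \big) = \mathfrak{R}_{\pi}(f, f) + 2 G^T X, $$
where $X$, $G$ are defined in Notation 1 and $\mathfrak{R}_{\pi}$ is defined in Definition 2.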
Proof. 
For the first term in the above lemma, we have
div z π ( Γ ( a a T ) f , f ) = · ( z z T π Γ ( a a T ) ( f , f ) ) π = k = 1 n + m 1 π x k k = 1 m z k k π k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ = k = 1 n + m k = 1 m x k z k k k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ + z k k x k k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^
+ k = 1 n + m k = 1 m x k log π z k k k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ = k = 1 n + m k = 1 m x k z k k T k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ + z k k T x k k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ + k = 1 m ( z T log π ) k k ^ = 1 n + m z k k ^ T Γ ( a a T ) ( f , f ) ) k ^ ,
where Γ ( a a T ) ( f , f ) ) k ^ is defined in (42). Plugging in (42), we further obtain
div z π ( Γ ( a a T ) f , f ) = k = 1 n + m k = 1 m x k z k k T k ^ = 1 n + m z k k ^ T 2 i = 1 n i ^ , i = 1 n + m x k ^ a i i ^ T f x i ^ a i i T f x i + k = 1 n + m k = 1 m z k k T x k k ^ = 1 n + m z k k ^ T 2 i = 1 n i ^ , i = 1 n + m x k ^ a i i ^ T f x i ^ a i i T f x i + k = 1 m ( z T log π ) k k ^ = 1 n + m z k k ^ T 2 i = 1 n i ^ , i = 1 n + m x k ^ a i i ^ T f x i ^ a i i T f x i = 2 k = 1 m i = 1 n k , k ^ , i ^ , i = 1 n + m x k z k k T z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i S 1 z + 2 k = 1 m i = 1 n k , k ^ , i ^ , i = 1 n + m z k k T x k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i S 2 z + 2 k = 1 m i = 1 n k ^ , i ^ , i = 1 n + m ( z T log π ) k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i S 3 z = S 1 z + S 2 z + S 3 z .
By further expanding S 2 z , we obtain
S 2 z = 2 k = 1 m i = 1 n k , k ^ , i ^ , i = 1 n + m z k k T x k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i = 2 k = 1 m i = 1 n k , k ^ , i ^ , i = 1 n + m z k k T x k z k k ^ T x k ^ a i i ^ T f x i ^ a i i T f x i + z k k T z k k ^ T 2 x k x k ^ a i i ^ T f x i ^ a i i T f x i + z k k T z k k ^ T x k ^ a i i ^ T 2 f x k x i ^ a i i T f x i + z k k T z k k ^ T x k ^ a i i ^ T f x i ^ a i i T 2 f x k x i + z k k T z k k ^ T x k ^ a i i ^ T f x i ^ x k a i i T f x i .
Similarly, we obtain
div a π ( Γ ( z z T ) f , f ) = l = 1 n + m l = 1 n x l a l l T l ^ = 1 n + m a l l ^ T Γ ( z z T ) ( f , f ) ) l ^ + a l l T x l l ^ = 1 n + m a l l ^ T Γ ( z z T ) ( f , f ) ) l ^ + l = 1 n ( a T log π ) l l ^ = 1 n + m a l l ^ T Γ ( z z T ) ( f , f ) ) l ^ = 2 j = 1 m l = 1 n l , l ^ , j ^ , j = 1 n + m x l a l l T a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j S 1 a + 2 j = 1 m l = 1 n l , l ^ , j ^ , j = 1 n + m a l l T x l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j S 2 a + 2 j = 1 m l = 1 n l ^ , j ^ , j = 1 n + m ( a T log π ) l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j S 3 a
= S 1 a + S 2 a + S 3 a ,
where we also obtain
S 2 a = 2 j = 1 m l = 1 n l , l ^ , j ^ , j = 1 n + m a l l T x l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j = 2 j = 1 m l = 1 n l , l ^ , j ^ , j = 1 n + m a l l T x l a l l ^ T x l ^ z j j ^ T f x j ^ z j j T f x j + a l l T a l l ^ T 2 x l x l ^ z j j ^ T f x j ^ z j j T f x j + a l l T a l l ^ T x l ^ z j j ^ T 2 f x l x j ^ z j j T f x j + a l l T a l l ^ T x l ^ z j j ^ T f x j ^ z j j T 2 f x l x j + a l l T a l l ^ T x l ^ z j j ^ T f x j ^ x l z j j T f x j .
Combining all the terms above, we have
div z π ( Γ ( a a T ) f , f ) div a π ( Γ ( z z T ) f , f ) = S 1 z + S 2 z + S 3 z ( S 1 a + S 2 a + S 3 a ) .
By direct computations, we separate the above terms into two groups based on "$\nabla f \nabla f$" and "$\nabla^2 f \nabla f$". We denote $\mathfrak{R}_{\pi}(f, f)$ as the sum of all "$\nabla f \nabla f$" terms and denote $2 G^T X$ as the sum of all "$\nabla^2 f \nabla f$" terms. Switching indices for the terms in $2 G^T X$ to match $\frac{\partial^2 f}{\partial x_{\hat{i}} \partial x_{\hat{j}}}$, we obtain the following:
2 G T X = 2 k = 1 m i = 1 n k , k ^ , i ^ , i = 1 n + m z k k T z k k ^ T x k ^ a i i ^ T 2 f x k x i ^ a i i T f x i + z k k T z k k ^ T x k ^ a i i ^ T f x i ^ a i i T 2 f x k x i 2 j = 1 m l = 1 n l , l ^ , j ^ , j = 1 n + m a l l T a l l ^ T x l ^ z j j ^ T 2 f x l x j ^ z j j T f x j + a l l T a l l ^ T x l ^ z j j ^ T f x j ^ z j j T 2 f x l x j = 2 j = 1 m i = 1 n j , j ^ , i ^ , i = 1 n + m z j j T z j j ^ T x j ^ a i i ^ T 2 f x j x i ^ a i i T f x i + z j j T z j j ^ T x j ^ a i i ^ T f x i ^ a i i T 2 f x j x i 2 j = 1 m i = 1 n i , i ^ , j ^ , j = 1 n + m a i i T a i i ^ T x i ^ z j j ^ T 2 f x i x j ^ z j j T f x j + a i i T a i i ^ T x i ^ z j j ^ T f x j ^ z j j T 2 f x i x j = 2 j = 1 m i = 1 n j , j ^ , i ^ , i = 1 n + m z j j ^ T z j j T x j a i i ^ T 2 f x j ^ x i ^ a i i T f x i + z j j ^ T z j j T x j a i i T f x i a i i ^ T 2 f x j ^ x i ^ 2 j = 1 m i = 1 n i , i ^ , j ^ , j = 1 n + m a i i ^ T a i i T x i z j j ^ T 2 f x i ^ x j ^ z j j T f x j + a i i ^ T a i i T x i z j j T f x j z j j ^ T 2 f x i ^ x j ^ = 2 i ^ , j ^ = 1 n + m 2 f x i ^ x j ^ i = 1 n j = 1 m j , j ^ , i , i ^ = 1 n + m z j j ^ T z j j T x j a i i ^ T a i i T f x i + z j j ^ T z j j T x j a i i T f x i a i i ^ T a i i ^ T a i i T x i z j j ^ T z j j T f x j + a i i ^ T a i i T x i z j j T f x j z j j ^ T .
The first equality follows from the quantities we obtained previously, the second equality from switching "$k$" to "$j$" and "$l$" to "$i$", and the third equality from switching between "$i'$" and "$\hat{i}$", and between "$j'$" and "$\hat{j}$". Thus, the proof is completed. □

5.2. Proof of Lemma 11

From now on, we keep the following notation: $a^T \nabla f = \big( \sum_{\hat{i}=1}^{n+m} a^T_{i \hat{i}} \frac{\partial f}{\partial x_{\hat{i}}} \big)_{i=1,\dots,n}$. Furthermore, we fix the notation for $a$, $a^T$ with the relation $a_{\hat{i} i} = a^T_{i \hat{i}}$ for $i = 1, \dots, n$ and $\hat{i} = 1, \dots, n+m$. Here, we denote $a^T_{i \hat{i}} := (a^T)_{i \hat{i}}$. Recall that we define
$$ \Gamma_{2,\tilde{L}}(f, f) = \frac{1}{2} \Big[ \tilde{L} \Gamma_1(f, f) - 2 \Gamma_1(\tilde{L} f, f) \Big]. $$
Next, we are ready to prove the following lemma.
Lemma 14.
$$ \Gamma_{2,\tilde{L}}(f, f) = X^T Q^T Q X + 2 D^T Q X + 2 C^T X + D^T D + \mathfrak{R}_a(f, f), $$
where $Q$, $X$, $C$, $D$ are introduced in Notation 1 and $\mathfrak{R}_a$ is defined in Definition 2.
Proof. 
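We plug the operator $\tilde{L}$ into our definition of $\Gamma_2$:
$$ \begin{aligned} \Gamma_{2,\tilde{L}}(f, f) &= \frac{1}{2} \Delta_a \Gamma_1(f, f) - \frac{1}{2} A \nabla \Gamma_1(f, f) - \Gamma_1\big( (\Delta_a - A \nabla) f, f \big) \\ &= \frac{1}{2} \Delta_a \Gamma_1(f, f) - \Gamma_1(\Delta_a f, f) - \frac{1}{2} A \nabla \Gamma_1(f, f) + \Gamma_1(A \nabla f, f). \end{aligned} $$
Now, we compute the last two terms of the above equation. With $A = a \otimes \nabla a$, we obtain
$$ \begin{aligned} -\frac{1}{2} A \nabla \Gamma_1(f, f) &= -\frac{1}{2} \sum_{\hat{k}=1}^{n+m} A_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} \langle a^T \nabla f, a^T \nabla f \rangle_{\mathbb{R}^n} \\ &= -\sum_{\hat{k}=1}^{n+m} A_{\hat{k}} \Big\langle \Big( \frac{\partial}{\partial x_{\hat{k}}} a^T \Big) \nabla f, a^T \nabla f \Big\rangle_{\mathbb{R}^n} - \sum_{\hat{k}=1}^{n+m} A_{\hat{k}} \Big\langle a^T \nabla \Big( \frac{\partial}{\partial x_{\hat{k}}} f \Big), a^T \nabla f \Big\rangle_{\mathbb{R}^n} = J_1 + J_2, \end{aligned} $$
and
$$ \begin{aligned} \Gamma_1(A \nabla f, f) &= \Big\langle a^T \nabla \Big( \sum_{\hat{k}=1}^{n+m} A_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} f \Big), a^T \nabla f \Big\rangle_{\mathbb{R}^n} \\ &= \Big\langle a^T \Big( \sum_{\hat{k}=1}^{n+m} A_{\hat{k}} \nabla \frac{\partial}{\partial x_{\hat{k}}} f \Big), a^T \nabla f \Big\rangle_{\mathbb{R}^n} + \Big\langle a^T \Big( \sum_{\hat{k}=1}^{n+m} \nabla A_{\hat{k}} \frac{\partial}{\partial x_{\hat{k}}} f \Big), a^T \nabla f \Big\rangle_{\mathbb{R}^n} = J_3 + J_4. \end{aligned} $$
It is easy to see that
$$ J_2 + J_3 = 0. $$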
We now expand J 1 and J 4 into local coordinates:
J 1 = l = 1 n a T f l l , k ^ = 1 n + m k = 1 n k = 1 n + m a k ^ k x k a k k x k ^ a l l x l f ,
and
J 4 = l = 1 n ( a T f ) l l = 1 n + m a l l T ( k ^ = 1 n + m x l ( k = 1 n k = 1 n + m a k ^ k x k a k k ) x k ^ f ) = l = 1 n ( a T f ) l k = 1 n l = 1 n + m k ^ , k = 1 n + m a l l T x l a k ^ k x k a k k x k ^ f + l = 1 n ( a T f ) l k = 1 n l = 1 n + m k ^ , k = 1 n + m a l l T a k ^ k ( x l x k a k k ) x k ^ f .
Applying Lemma 15, which will be proven shortly below, we have
1 2 Δ a Γ 1 ( f , f ) Γ 1 ( Δ a f , f ) = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n + l = 1 n ( a T f ) l i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) B n × n a T f , a T f R n ,
where
B n × n a T f , a T f R n = l = 1 n ( a T f ) l k = 1 n l = 1 n i = 1 n + m k , j = 1 n + m a l j T 2 x i x j a i k ( a k k T x k f ) .
Thus, combining with (47) and (48), we have
Γ 2 , L ˜ ( f , f ) = 1 2 Δ a Γ 1 ( f , f ) Γ 1 ( Δ a f , f ) + J 1 + J 4 = 1 2 ( a T ( a T | a T f | 2 ) ) a T [ ( a T ) ( a T f ) ] , a T f R n .
where the last term follows from Lemma 16 below. The proof is thus completed. □
Lemma 15.
1 2 Δ a Γ 1 ( f , f ) Γ 1 ( Δ a f , f ) = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( [ ( a T ) ( a T f ) ] ) , a T f R n B n × n a T f , a T f R n + B 0 .
Here, the local representations for B n × n and B 0 are given as follows. For l , k = 1 , , n , we denote
B l k = j = 1 n + m a l j T i = 1 n + m 2 x i x j a i k = j = 1 n + m a l j T i = 1 n + m 2 x i x j a k i T , B 0 = l = 1 n ( a T f ) l i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) .
We introduce the following notation (convention) that, for any function F,
( a T ) ( a T F ) = i = 1 n ( a T ) i ( a T F ) i = i = 1 n i ^ , i = 1 n + m ( a i i ^ T x i ^ ) ( a i i T F x i ) .
Proof of Lemma 15.
By our definition above, we have
Δ a Γ 1 ( f , f ) = · ( a a T a T f , a T f R n ) = · ( a F ) = i ^ = 1 n + m x i ^ ( k = 1 n a i ^ k F k ) = i ^ = 1 n + m k = 1 n ( x i ^ a i ^ k F k + a i ^ k x i ^ F k ) = i ^ = 1 n + m k = 1 n ( x i ^ a i ^ k F k ) + a T ( a T ( a T f ) 2 ) ,
where we denote
F = a T a T f , a T f R n = a T l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) 2 = k ^ = 1 n + m a k k ^ T x j ^ l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) 2 k = 1 , , n = ( F 1 , F 2 , , F n ) T .
Therefore, we have
Δ a Γ 1 ( f , f ) = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x j ^ l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) 2 + a T ( a T ( a T f ) 2 ) = k = 1 n i ^ = 1 n + m x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x k ^ ( a T f ) 2 + ( a T ) ( a T ( a T f ) 2 ) = a ( a T ( a T f ) 2 ) + ( a T ) ( a T ( a T f ) 2 ) .
Next, we compute the following quantity.
Γ 1 ( Δ a f , f ) = a T ( · ( a a T f ) ) , a T f R n ,
where we have
· ( a a T f ) = · ( k = 1 n k ^ = 1 n + m a i ^ k a k k ^ T x k ^ f ) = i ^ = 1 n + m x i ^ ( k = 1 n k ^ = 1 n + m a i ^ k a k k ^ T x k ^ f )
= i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) + i ^ , k ^ = 1 n + m k = 1 n a i ^ k x i ^ ( a k k ^ T x j f ) = i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) + ( a T ) ( a T f ) = a ( a T f ) + ( a T ) ( a T f ) .
We continue with our computation as below:
Γ 1 ( Δ a f , f ) = a T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) + ( a T ) ( a T f ) , a T f R n = a T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) , a T f R n + a T ( ( a T ) ( a T f ) ) , a T f R n = a T a · ( a T f ) , a T f R n + a T ( ( a T ) ( a T f ) ) , a T f R n = a T a · ( a T f ) , a T f R n + a · ( a T ( a T f ) ) , a T f R n + a T ( ( a T ) ( a T f ) ) , a T f R n .
From the above, combining (52) and (53), we further obtain
1 2 Δ a Γ 1 ( f , f ) Γ 1 ( Δ a f , f ) = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n
+ 1 2 i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x j ^ l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) 2 a T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) , a T f R n = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n + i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) x k ^ ( l ^ = 1 n + m a l l ^ T x l ^ f ) I l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T x l i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) II .
Recall that we write $a^{T}$ for the transpose of the matrix $a$, so that $a^{T}_{i\hat i} = a_{\hat i i}$:
I = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) x k ^ ( l ^ = 1 n + m a l l ^ T x l ^ f ) = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) ( l ^ = 1 n + m x k ^ a l l ^ T x l ^ f ) + i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) ( l ^ = 1 n + m a l l ^ T x k ^ x l ^ f ) = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l = 1 n + m x k ^ a l l T x l f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l ^ = 1 n + m a l l T x k ^ x l f ) ,
and
II = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T x l i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k x l ( a k k ^ T x k ^ f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x l x k ^ f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) .
Subtracting the above two terms, we have
I II = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l = 1 n + m x k ^ a l l T x l f ) l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) + l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) = l = 1 n ( l ^ = 1 n + m a l l ^ T x l ^ f ) l = 1 n + m a l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) + l = 1 n ( a T f ) l i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) .
Now, we eventually obtain the following step:
1 2 Δ a Γ 1 ( f , f ) Γ 1 ( Δ a f , f ) = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n k = 1 n l = 1 n i = 1 n + m j = 1 n + m a l j T 2 x i x j a i k ( a T f ) k , a T f + l = 1 n ( a T f ) l i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) = 1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n B n × n a T f , a T f R n + l = 1 n ( a T f ) l i ^ , k ^ , l = 1 n + m k = 1 n x i ^ a i ^ k a k k ^ T ( x k ^ a l l T x l f ) a l l T x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) .
Thus, the proof is completed. □
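As an illustration of Lemma 15, the sketch below works out the correction terms for the Heisenberg group with $n = 2$, $m = 1$. The specific matrix $a$ is an assumption (the usual choice for that structure, not taken from this section), and $\circ$ is read as the contraction convention stated with the lemma. Under these assumptions both $B_{n\times n}$ and $B_0$ vanish.

```latex
% Assumed Heisenberg structure on R^3 with coordinates (x_1, x_2, x_3) = (x, y, z).
\[
a^{T} = \begin{pmatrix} 1 & 0 & -y/2 \\ 0 & 1 & x/2 \end{pmatrix},
\qquad
a^{T}\nabla f = \begin{pmatrix} \partial_x f - \tfrac{y}{2}\,\partial_z f \\[2pt] \partial_y f + \tfrac{x}{2}\,\partial_z f \end{pmatrix}
=: \begin{pmatrix} Xf \\ Yf \end{pmatrix}.
\]
% The convention (a^T \nabla) \circ (a^T \nabla F) sums the iterated horizontal derivatives.
\[
(a^{T}\nabla)\circ(a^{T}\nabla F) = X(XF) + Y(YF).
\]
% Every entry of a is affine in (x, y, z), so all second derivatives of a vanish.
\[
B_{lk} = \sum_{j=1}^{3} a^{T}_{lj}\sum_{i=1}^{3}\frac{\partial^{2} a_{ik}}{\partial x_i\,\partial x_j} = 0 .
\]
% Every term of B_0 carries a factor \partial a_{\hat i k} / \partial x_{\hat i}, and here
% \partial_x a_{1k} = \partial_y a_{2k} = \partial_z a_{3k} = 0 for k = 1, 2.
\[
B_0 = 0,
\qquad
\frac{1}{2}\Delta_a\Gamma_1(f,f) - \Gamma_1(\Delta_a f,f)
= \frac{1}{2}\big(a^{T}\nabla\circ(a^{T}\nabla|a^{T}\nabla f|^{2})\big)
- \big\langle a^{T}\nabla\big((a^{T}\nabla)\circ(a^{T}\nabla f)\big),\, a^{T}\nabla f\big\rangle_{\mathbb{R}^{2}} .
\]
```

For a structure whose matrix $a$ has non-affine entries, or columns that are not divergence free, the same bookkeeping would leave non-zero contributions from $B_{n\times n}$ and $B_0$.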
Below, we further investigate the extra term explicitly in the above Lemma 15.
Lemma 16.
1 2 ( a T ( a T | a T f | 2 ) ) a T ( ( a T ) ( a T f ) ) , a T f R n = X T Q T Q X + 2 D T Q X + 2 C T X + D T D + i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T ( a i i ^ T x i a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R n + i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( x i a k k ^ T x i ^ ) ( f x k ^ ) , ( a T ) k f R n
i , k = 1 n i , i ^ , k ^ = 1 n + m ( a k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( a T ) k f R n i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( a T ) k f R n .
Recall that matrix Q and vectors X, C, and D are defined in Notation 1.
Proof. 
We expand the two terms in Lemma 16. The first term reads as
1 2 ( a T ( a T | a T f | 2 ) ) = 1 2 i = 1 n k = 1 n ( a T ) i ( a T ) i | ( a T ) k f | 2 = i = 1 n k = 1 n ( a T ) i ( a T ) i ( a T ) k f , ( a T ) k f R n = i = 1 n k = 1 n ( a T ) i ( a T ) k f , ( a T ) i ( a T ) k f R n T 1 + i = 1 n k = 1 n i , i ^ , k ^ = 1 n + m ( a i i T x i ) ( a i i ^ T x i ^ ) ( a k k ^ T x k ^ ) f , ( a T ) k f R n R 1 .
The second term reads as
a T ( [ ( a T ) ( a T f ) ] ) , a T f R n = i , k = 1 n ( a T ) k [ ( a T ) i ( a T ) i f ] , ( a T ) k f = i , k = 1 n i , i ^ , k ^ = 1 n + m ( a k k ^ T x k ^ ) [ ( a i i T x i ) ( a i i ^ T x i ^ ) f ] , ( a T ) k f R 2 .
Next, we expand R 1 and R 2 completely and obtain the following:
R 1 = i , k = 1 n i , i ^ , k ^ = 1 n + m ( a i i T x i ) ( a i i ^ T x i ^ ) ( a k k ^ T f x k ^ ) , ( a T ) k f R n = i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T ( a i i ^ T x i a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R n R 1 1 + i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( x i a k k ^ T x i ^ ) ( f x k ^ ) , ( a T ) k f R n R 1 2
+ i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( a k k ^ T x i ^ ) ( x i f x k ^ ) , ( a T ) k f R n R 1 3 + i , k = 1 n i , i ^ , k ^ = 1 n + m ( a i i T ) ( ( x i a i i ^ T ) a k k ^ T x i ^ f x k ^ ) , ( a T ) k f R n R 1 4 + i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( x i a k k ^ T ) x i ^ f x k ^ ) , ( a T ) k f R n R 1 5 + i , k = 1 n i , i ^ , k ^ = 1 n + m a i i T a i i ^ T a k k ^ T ( x i x i ^ f x k ^ ) , ( a T ) k f R n R 1 6 .
Additionally,
R 2 = i , k = 1 n i , i ^ , k ^ = 1 n + m ( a k k ^ T x k ^ ) [ ( a i i T x i ) ( a i i ^ T f x i ^ ) ] , ( a T ) k f = i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( a T ) k f R 2 1 + i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( a T ) k f R 2 2 + i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T a i i ^ T x i ( x k ^ f x i ^ ) , ( a T ) k f R 2 3 = R 1 4 + i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T x k ^ a i i ^ T ( x i f x i ^ ) , ( a T ) k f R 2 4 + i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T a i i ^ T x k ^ ( x i f x i ^ ) , ( a T ) k f R 2 5 + i , k = 1 n i , i ^ , k ^ = 1 n + m a k k ^ T a i i T a i i ^ T ( x k ^ x i f x i ^ ) , ( a T ) k f R 2 6 = R 1 6 .
Our next step is to complete the squares for all the above terms. We look at the term $T_1$ first.
T 1 = i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T 2 f x i ^ x k ^ + i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T a k k T 2 f x i x k + i , k = 1 n + m a i i T a k k T x i f x k = i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T 2 f x i ^ x k ^ , i , k = 1 n + m a i i T a k k T 2 f x i x k T 1 a + i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T 2 f x i ^ x k ^ , i , k = 1 n + m a i i T a k k T x i f x k T 1 b + i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T a k k T 2 f x i x k T 1 c + i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T a k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T a k k T x i f x k T 1 d .
The terms $T_1^{b} = T_1^{c}$, $R_1^{3} = R_1^{5}$, and $R_2^{5} = R_2^{4}$ play the role of cross terms inside the complete squares. In particular, for convenience, we change the indices inside the sums of $R_1^{3}$ and $R_2^{5}$, switching $i'$ and $\hat i$ for $R_1^{3}$ and switching $i'$ and $\hat k$ for $R_2^{5}$. Then, we obtain the following.
2 R 1 3 = 2 i , k = 1 n i , i ^ , k ^ = 1 n + m a i i ^ T a i i T ( a k k ^ T x i ) ( x i ^ f x k ^ ) , ( a T ) k f = 2 i , k = 1 n i ^ , k ^ = 1 n + m i , l = 1 n + m a i i ^ T a i i T ( a k k ^ T x i ) ( x i ^ f x k ^ ) a k l T f x l 2 R 2 5 = 2 i , k = 1 n i , i ^ , k ^ = 1 n + m a k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) , ( a T ) k f = 2 i , k = 1 n i ^ , k ^ = 1 n + m i , l = 1 n + m a k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) a k l T f x l .
We denote
\[
\sum_{\hat i,\hat k=1}^{n+m} a^{T}_{i\hat i}\,a^{T}_{k\hat k}\,\frac{\partial^{2} f}{\partial x_{\hat i}\,\partial x_{\hat k}} = \gamma_{ik}.
\]
The above Equality (55) can be represented in the following matrix form:
\[
Q_{n^{2}\times(n+m)^{2}}\,X_{(n+m)^{2}\times 1} = (\gamma_{11},\dots,\gamma_{ik},\dots,\gamma_{nn})^{T}_{n^{2}\times 1},
\]
where $Q$ and $X$ are defined in (12) and (19). Now, we can represent term $T_1^{a}$ as $\sum_{i,k=1}^{n}\gamma_{ik}^{2} = \gamma^{T}\gamma = (QX)^{T}QX = X^{T}Q^{T}QX$. Next, we want to represent $R_1^{3}$ and $R_2^{5}$ in the following form in terms of vector $X$:
2 R 1 3 2 R 2 5 = 2 i , k = 1 n i , i ^ , k ^ = 1 n + m a i i ^ T a i i T ( a k k ^ T x i ) ( x i ^ f x k ^ ) , ( a T ) k f R n 2 i , k = 1 n i , i ^ , k ^ = 1 n + m a k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) , ( a T ) k f = 2 i ^ , k ^ = 1 n + m i , k = 1 n i = 1 n + m a i i ^ T a i i T ( a k k ^ T x i ) , ( a T ) k f a k i T a i k ^ T a i i ^ T x i , ( a T ) k f ( x i ^ f x k ^ ) = 2 C T X ,
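where $C$ is defined in (14). Similarly, we can represent $T_1^{b} = T_1^{c}$ by $X$:
\[
T_1^{b} = T_1^{c} = \sum_{i,k=1}^{n}\left\langle \sum_{\hat i,\hat k=1}^{n+m} a^{T}_{i\hat i}\,\frac{\partial a^{T}_{k\hat k}}{\partial x_{\hat i}}\,\frac{\partial f}{\partial x_{\hat k}},\ \sum_{i',k'=1}^{n+m} a^{T}_{ii'}\,a^{T}_{kk'}\,\frac{\partial^{2} f}{\partial x_{i'}\,\partial x_{k'}}\right\rangle = D^{T}QX,
\]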
where $D$ is defined in (1). Summing over the above terms, we have the following quadratic form:
\[
T_1 + 2R_1^{3} - 2R_2^{5} = X^{T}Q^{T}QX + 2D^{T}QX + 2C^{T}X + D^{T}D.
\]
Taking into account the fact that $R_1^{6} - R_2^{6} = 0$ and $R_1^{4} - R_2^{3} = 0$, we have
\[
T_1 + R_1 - R_2 = T_1 + 2R_1^{3} - 2R_2^{5} + R_1^{1} + R_1^{2} - R_2^{1} - R_2^{2},
\]
which completes the proof. □
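The matrix bookkeeping in this proof can be checked numerically. The following is a minimal sketch, assuming that $X$ is the row-major vectorization of the Hessian of $f$ and that the row of $Q$ indexed by $(i,k)$ has entries $a^{T}_{i\hat i}a^{T}_{k\hat k}$ (a layout inferred from the relation $QX = \gamma$, since the definitions (12) and (19) are not restated here). The helper names and the test data are hypothetical. It verifies that $QX$ reproduces $\gamma_{ik}$ and that $X^{T}Q^{T}QX + 2D^{T}QX + D^{T}D$ equals the single square $|QX + D|^{2}$.

```python
from itertools import product


def flatten(mat):
    """Row-major flattening of a nested list."""
    return [v for row in mat for v in row]


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(u, v))


def matvec(mat, vec):
    """Matrix-vector product for a nested-list matrix."""
    return [dot(row, vec) for row in mat]


def build_q(a_t):
    """Assumed layout of Q: row (i, k), column (ih, kh), entry a_t[i][ih] * a_t[k][kh]."""
    n, dim = len(a_t), len(a_t[0])
    return [[a_t[i][ih] * a_t[k][kh] for ih, kh in product(range(dim), repeat=2)]
            for i, k in product(range(n), repeat=2)]


def gamma_direct(a_t, hess):
    """gamma[i][k] = sum over (ih, kh) of a_t[i][ih] * a_t[k][kh] * hess[ih][kh]."""
    n, dim = len(a_t), len(a_t[0])
    return [[sum(a_t[i][ih] * a_t[k][kh] * hess[ih][kh]
                 for ih, kh in product(range(dim), repeat=2))
             for k in range(n)] for i in range(n)]


def quadratic_form(q, x, d):
    """X^T Q^T Q X + 2 D^T Q X + D^T D, evaluated term by term."""
    qx = matvec(q, x)
    return dot(qx, qx) + 2 * dot(d, qx) + dot(d, d)


def completed_square(q, x, d):
    """|Q X + D|^2, the same quantity written as one square."""
    shifted = [p + r for p, r in zip(matvec(q, x), d)]
    return dot(shifted, shifted)


# Hypothetical test data: a Heisenberg-type a^T evaluated at (x, y) = (1, 2),
# an arbitrary symmetric Hessian, and an arbitrary vector D of length n^2.
A_T = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.5]]
HESS = [[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 4.0]]
D = [1.0, -2.0, 0.5, 3.0]
```

Since the first three terms collapse into $|QX + D|^{2} \ge 0$, only the linear term $2C^{T}X$ and the remainder terms are left to be controlled.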

5.3. Proof of Lemma 12

Lemma 17.
Γ 2 , L ˜ z ( f , f ) = X T P T P X + 2 E T P X + 2 F T X + E T E + R z ( f , f ) .
where R z is defined in Definition 2.
Proof. 
The proof follows directly from Lemmas 18 and 19. □
Lemma 18.
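\[
\frac{1}{2}\tilde L\,\Gamma_1^{z}(f,f) - \Gamma_1^{z}(\tilde L f,f) = \frac{1}{2}\big(a^{T}\nabla\circ(a^{T}\nabla|z^{T}\nabla f|^{2})\big) - \big\langle z^{T}\nabla\big((a^{T}\nabla)\circ(a^{T}\nabla f)\big),\, z^{T}\nabla f\big\rangle_{\mathbb{R}^{m}}.
\]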
Proof. 
Step 1: We first define $\Gamma_1^{z}(f,f) = \langle z^{T}\nabla f, z^{T}\nabla f\rangle_{\mathbb{R}^{m}}$; then, we have
L ˜ Γ 1 z ( f , f ) = Δ p Γ 1 z ( f , f ) A Γ 1 z ( f , f ) , Γ 1 z ( L ˜ f , f ) = Γ 1 z ( Δ p f , f ) Γ 1 z ( A f , f ) .
By our definition above, we directly obtain
Δ a Γ 1 z ( f , f ) = · ( a a T z T f , z T f R m ) = · ( a F z ) = i ^ = 1 n + m x i ^ ( k = 1 n a i ^ k F k z ) = i ^ = 1 n + m k = 1 n ( x i ^ a i ^ k F k z + a i ^ k x i ^ F k z ) = i ^ = 1 n + m k = 1 n ( x i ^ a i ^ k F k z ) + a T ( a T ( z T f ) 2 ) ,
where we denote
F z = a T z T f , z T f R m = a T l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) 2 = k ^ = 1 n + m a k k ^ T x k ^ l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) 2 k = 1 , , n = ( F 1 z , F 2 z , , F n z ) T .
We have
Δ a Γ 1 z ( f , f ) = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x k ^ l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) 2 + a T ( a T ( z T f ) 2 ) ) = k = 1 n i ^ = 1 n + m x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x k ^ ( z T f ) 2 + ( a T ) ( a T ( z T f ) 2 ) = a ( a T ( z T f ) 2 ) + ( a T ) ( a T ( z T f ) 2 ) .
Next, we compute the following quantity.
\[
\Gamma_1^{z}(\Delta_a f,f) = \big\langle z^{T}\nabla\big(\nabla\cdot(a\,a^{T}\nabla f)\big),\, z^{T}\nabla f\big\rangle_{\mathbb{R}^{m}}.
\]
From Lemma 15, we have
· ( a a T f ) = a ( a T f ) + ( a T ) ( a T f ) .
We continue with our computation as below:
Γ 1 z ( Δ a f , f ) = z T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) + ( a T ) ( a T f ) , z T f R m = z T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) , z T f R m + z T ( ( a T ) ( a T f ) ) , z T f R m
= z T a · ( a T f ) , z T f R m + z T ( ( a T ) ( a T f ) ) , z T f R m = z T a · ( a T f ) , z T f R m + a · ( z T ( a T f ) ) , z T f R m + z T ( ( a T ) ( a T f ) ) , z T f R m .
From the above, combining (58) and (59), we further obtain
1 2 Δ a Γ 1 z ( f , f ) Γ 1 z ( Δ a f , f ) = 1 2 ( a T ( a T | z T f | 2 ) ) z T ( ( a T ) ( a T f ) ) , z T f R m + 1 2 i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T x j ^ l = 1 n ( l ^ = 1 n + m z l l ^ T x l ^ f ) 2
z T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) , z T f R m = 1 2 ( a T ( a T | z T f | 2 ) ) z T ( ( a T ) ( a T f ) ) , z T f R m + i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) x k ^ ( l ^ = 1 n + m z l l ^ T x l ^ f ) I l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T x l i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) II .
Recall here that we write $a^{T}$ for the transpose of the matrix $a$, so that $a^{T}_{i\hat i} = a_{\hat i i}$:
I = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) x k ^ ( l ^ = 1 n + m z l l ^ T x l ^ f ) = i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) ( l ^ = 1 n + m x k ^ z l l ^ T x l ^ f ) + i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) ( l ^ = 1 n + m z l l ^ T x k ^ x l ^ f ) = l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l = 1 n + m x k ^ z l l T x l f ) + l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l ^ = 1 n + m z l l T x k ^ x l f ) ; II = l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T x l i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x k ^ f ) = l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k x l ( a k k ^ T x k ^ f ) + l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) = l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) + l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( a k k ^ T x l x k ^ f ) + l = 1 n ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) .
Subtracting the above two terms, we obtain the following:
I II = l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) + l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l = 1 n + m x k ^ z l l T x l f ) .
Now, we eventually end up with the following formula:
1 2 Δ a Γ 1 z ( f , f ) Γ 1 z ( Δ a f , f ) = 1 2 ( a T ( a T | z T f | 2 ) ) z T ( ( a T ) ( a T f ) ) , z T f R m l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n x i ^ a i ^ k ( x l a k k ^ T x k ^ f ) l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) l = 1 n + m z l l T i ^ , k ^ = 1 n + m k = 1 n 2 x i ^ x l a i ^ k ( a k k ^ T x k ^ f ) + l = 1 m ( l ^ = 1 n + m z l l ^ T x l ^ f ) i ^ = 1 n + m k = 1 n x i ^ a i ^ k k ^ = 1 n + m a k k ^ T ( l = 1 n + m x k ^ z l l T x l f ) .
Step 2: Computation of 1 2 A Γ 1 z ( f , f ) + Γ 1 z ( A f , f ) . Now, we compute the last two terms of the above equation, with A = a a :
1 2 A Γ 1 z ( f , f ) = 1 2 k ^ = 1 n + m A k ^ x k ^ z T f , z T f R m = k ^ = 1 n + m A k ^ ( x k ^ z T ) f , z T f R m k ^ = 1 n + m A k ^ z T ( x k ^ f ) , z T f R m = J ˜ 1 + J ˜ 2 ,
Γ 1 z ( A f , f ) = z T ( k ^ = 1 n + m A k ^ x k ^ f ) , z T f R m = z T ( k ^ = 1 n + m A k ^ x k ^ f ) , z T f R m + z T ( k ^ = 1 n + m A k ^ x k ^ f ) , z T f R m = J ˜ 3 + J ˜ 4 .
It is easy to see that J ˜ 2 + J ˜ 3 = 0 . We now expand J ˜ 1 and J ˜ 4 into local coordinates:
J ˜ 1 = l = 1 m z T f l l , k ^ = 1 n + m k = 1 n k = 1 n + m a k ^ k x k a k k x k ^ z l l x l f ,
J ˜ 4 = l = 1 m ( z T f ) l l = 1 n + m z l l T ( k ^ = 1 n + m x l ( k = 1 n k = 1 n + m a k ^ k x k a k k ) x k ^ f ) = l = 1 m ( z T f ) l k = 1 n l = 1 n + m k ^ , k = 1 n + m z l l T x l a k ^ k x k a k k x k ^ f + l = 1 m ( z T f ) l k = 1 n l = 1 n + m k ^ , k = 1 n + m z l l T a k ^ k ( x l x k a k k ) x k ^ f .
Combining the above two steps, we thus obtain
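\[
\frac{1}{2}\tilde L\,\Gamma_1^{z}(f,f) - \Gamma_1^{z}(\tilde L f,f) = \frac{1}{2}\big(a^{T}\nabla\circ(a^{T}\nabla|z^{T}\nabla f|^{2})\big) - \big\langle z^{T}\nabla\big((a^{T}\nabla)\circ(a^{T}\nabla f)\big),\, z^{T}\nabla f\big\rangle_{\mathbb{R}^{m}}.
\]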
Lemma 19.
1 2 ( a T ( a T | z T f | 2 ) ) z T ( ( a T ) ( a T f ) ) , z T f R m = X T P T P X + 2 E T P X + 2 F T X + E T E + R z ( f , f ) .
Proof. 
We expand the two terms in Lemma 19.
1 2 ( a T ( a T | z T f | 2 ) ) = 1 2 i = 1 n k = 1 m ( a T ) i ( a T ) i | ( z T ) k f | 2 = i = 1 n k = 1 m ( a T ) i ( a T ) i ( z T ) k f , ( z T ) k f R m = i = 1 n k = 1 m ( a T ) i ( z T ) k f , ( a T ) i ( z T ) k f R m T ˜ 1 + i = 1 n k = 1 m ( a i i T x i ) ( a i i ^ T x i ^ ) ( z k k ^ T x k ^ ) f , ( z T ) k f R m R ˜ 1 .
z T ( [ ( a T ) ( a T f ) ] ) , z T f R m = i = 1 n k = 1 m ( z T ) k [ ( a T ) i ( a T ) i f ] , ( z T ) k f R m = i = 1 n k = 1 m ( z k k ^ T x k ^ ) [ ( a i i T x i ) ( a i i ^ T x i ^ ) f ] , ( z T ) k f R m R ˜ 2 .
Next, we expand R ˜ 1 and R ˜ 2 completely and obtain the following:
R ˜ 1 = i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m ( a i i T x i ) ( a i i ^ T x i ^ ) ( z k k ^ T f x k ^ ) , ( z T ) k f R m = i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i T ( a i i ^ T x i z k k ^ T x i ^ f x k ^ ) , ( z T ) k f R m R ˜ 1 1 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( x i z k k ^ T x i ^ ) ( f x k ^ ) , ( z T ) k f R m R ˜ 1 2 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( z k k ^ T x i ^ ) ( x i f x k ^ ) , ( z T ) k f R m R ˜ 1 3 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m ( a i i T ) ( ( x i a i i ^ T ) z k k ^ T x i ^ f x k ^ ) , ( z T ) k f R m R ˜ 1 4 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i T a i i ^ T ( x i z k k ^ T ) x i ^ f x k ^ ) , ( z T ) k f R m R ˜ 1 5 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i T a i i ^ T z k k ^ T ( x i x i ^ f x k ^ ) , ( z T ) k f R m R ˜ 1 6
R ˜ 2 = i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m ( z k k ^ T x k ^ ) [ ( a i i T x i ) ( a i i ^ T f x i ^ ) ] , ( z T ) k f R m = i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T x k ^ a i i ^ T x i f x i ^ ) , ( z T ) k f R m R ˜ 2 1 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T ( x k ^ a i i ^ T x i ) f x i ^ , ( z T ) k f R m R ˜ 2 2 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T a i i ^ T x i ( x k ^ f x i ^ ) , ( z T ) k f R m R ˜ 2 3 = R ˜ 1 4 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T x k ^ a i i ^ T ( x i f x i ^ ) , ( z T ) k f R m R ˜ 2 4 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T a i i ^ T x k ^ ( x i f x i ^ ) , ( z T ) k f R m R ˜ 2 5 + i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k k ^ T a i i T a i i ^ T ( x k ^ x i f x i ^ ) , ( z T ) k f R m R ˜ 2 6 = R ˜ 1 6
Our next step is to complete the squares for all the above terms. We look at the term $\tilde T_1$ first.
T ˜ 1 = i = 1 n k = 1 m i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T 2 f x i ^ x k ^ + i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T z k k T 2 f x i x k + i , k = 1 n + m a i i T z k k T x i f x k = i = 1 n k = 1 m k = 1 m i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T 2 f x i ^ x k ^ , i , k = 1 n + m a i i T z k k T 2 f x i x k T ˜ 1 a
+ i = 1 n k = 1 m i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T 2 f x i ^ x k ^ , i , k = 1 n + m a i i T z k k T x i f x k T ˜ 1 b + i = 1 n k = 1 m i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T z k k T 2 f x i x k T ˜ 1 c + i = 1 n k = 1 m i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T z k k T x i f x k T ˜ 1 d .
The terms $\tilde T_1^{b} = \tilde T_1^{c}$, $\tilde R_1^{3} = \tilde R_1^{5}$, and $\tilde R_2^{5} = \tilde R_2^{4}$ play the role of cross terms inside the complete squares. In particular, for convenience, we change the indices inside the sums of $\tilde R_1^{3}$ and $\tilde R_2^{5}$, switching $i'$ and $\hat i$ for $\tilde R_1^{3}$ and switching $i'$ and $\hat k$ for $\tilde R_2^{5}$. Then, we obtain the following.
2 R ˜ 1 3 = 2 i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m a i i ^ T a i i T ( z k k ^ T x i ) ( x i ^ f x k ^ ) , ( z T ) k f R n = 2 i = 1 n k = 1 m i ^ , k ^ = 1 n + m i , l = 1 n + m a i i ^ T a i i T ( z k k ^ T x i ) ( x i ^ f x k ^ ) z k l T f x l 2 R ˜ 2 5 = 2 i = 1 n k = 1 m i , i ^ , k ^ = 1 n + m z k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) , ( z T ) k f = 2 i = 1 n k = 1 m i ^ , k ^ = 1 n + m i , l = 1 n + m z k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) z k l T f x l
We denote
\[
\sum_{\hat i,\hat k=1}^{n+m} a^{T}_{i\hat i}\,z^{T}_{k\hat k}\,\frac{\partial^{2} f}{\partial x_{\hat i}\,\partial x_{\hat k}} = \omega_{ik}.
\]
The above Equality (63) can be represented in the following matrix form:
\[
P_{(nm)\times(n+m)^{2}}\,X_{(n+m)^{2}\times 1} = (\omega_{11},\dots,\omega_{ik},\dots,\omega_{nm})^{T}_{(nm)\times 1},
\]
where $P$ and $X$ are defined in (13) and (19). Now, we can represent term $\tilde T_1^{a}$ as $\sum_{i=1}^{n}\sum_{k=1}^{m}\omega_{ik}^{2} = \omega^{T}\omega = (PX)^{T}PX = X^{T}P^{T}PX$. Next, we want to represent $\tilde R_1^{3}$ and $\tilde R_2^{5}$ in the following form in terms of vector $X$:
2 R ˜ 1 3 2 R ˜ 2 5 = 2 i , k = 1 n i , i ^ , k ^ = 1 n + m a i i ^ T a i i T ( z k k ^ T x i ) ( x i ^ f x k ^ ) , ( z T ) k f R n 2 i , k = 1 n i , i ^ , k ^ = 1 n + m z k i T a i k ^ T a i i ^ T x i ( x k ^ f x i ^ ) , ( z T ) k f = 2 i ^ , k ^ = 1 n + m i , k = 1 n i = 1 n + m a i i ^ T a i i T ( z k k ^ T x i ) , ( z T ) k f z k i T a i k ^ T a i i ^ T x i , ( z T ) k f ( x i ^ f x k ^ ) = 2 F T X ,
where $F$ is defined in (16). Similarly, we can represent $\tilde T_1^{b} = \tilde T_1^{c}$ by $X$:
T ˜ 1 b = T ˜ 1 c = i , k = 1 n i ^ , k ^ = 1 n + m a i i ^ T z k k ^ T x i ^ f x k ^ , i , k = 1 n + m a i i T z k k T 2 f x i x k = E T P X
where $E$ is defined in (17). We thus have the following form:
\[
\tilde T_1 + 2\tilde R_1^{3} - 2\tilde R_2^{5} = X^{T}P^{T}PX + 2E^{T}PX + 2F^{T}X + E^{T}E.
\]
Taking into account the fact that $\tilde R_1^{6} - \tilde R_2^{6} = 0$ and $\tilde R_1^{4} - \tilde R_2^{3} = 0$, we have
\[
\tilde T_1 + \tilde R_1 - \tilde R_2 = \tilde T_1 + 2\tilde R_1^{3} - 2\tilde R_2^{5} + \tilde R_1^{1} + \tilde R_1^{2} - \tilde R_2^{1} - \tilde R_2^{2},
\]
which completes the proof. □
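The two cancellations used at the end of this proof rely only on the symmetry of mixed partial derivatives. A short worked check, assuming $f$ is smooth enough for partial derivatives to commute and writing only the summands (each is paired with the same factor $(z^{T}\nabla)_k f$ and summed over $i = 1,\dots,n$, $k = 1,\dots,m$, and $i',\hat i,\hat k = 1,\dots,n+m$):

```latex
% Third-order terms: the same coefficients multiply the same third derivative of f.
\[
\underbrace{a^{T}_{ii'}\,a^{T}_{i\hat i}\,z^{T}_{k\hat k}\,
\frac{\partial^{3} f}{\partial x_{i'}\,\partial x_{\hat i}\,\partial x_{\hat k}}}_{\text{summand of }\tilde R_1^{6}}
\;=\;
\underbrace{z^{T}_{k\hat k}\,a^{T}_{ii'}\,a^{T}_{i\hat i}\,
\frac{\partial^{3} f}{\partial x_{\hat k}\,\partial x_{i'}\,\partial x_{\hat i}}}_{\text{summand of }\tilde R_2^{6}} .
\]
% Second-order terms in which the derivative along a^T falls on the second copy of a^T.
\[
\underbrace{a^{T}_{ii'}\,\frac{\partial a^{T}_{i\hat i}}{\partial x_{i'}}\,z^{T}_{k\hat k}\,
\frac{\partial^{2} f}{\partial x_{\hat i}\,\partial x_{\hat k}}}_{\text{summand of }\tilde R_1^{4}}
\;=\;
\underbrace{z^{T}_{k\hat k}\,a^{T}_{ii'}\,\frac{\partial a^{T}_{i\hat i}}{\partial x_{i'}}\,
\frac{\partial^{2} f}{\partial x_{\hat k}\,\partial x_{\hat i}}}_{\text{summand of }\tilde R_2^{3}} .
\]
```

The same argument, with $z$ replaced by $a$ and $m$ by $n$, gives $R_1^{6} = R_2^{6}$ and $R_1^{4} = R_2^{3}$ in the proof of Lemma 16.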

6. Further Discussions on Other Inequalities

In this section, we apply the generalized Gamma calculus to study the entropic inequality for the semi-group $P_t$ associated with the drift–diffusion process. With a slight abuse of notation, we denote the generator of the semi-group $P_t$ as $\frac{1}{2}L$ instead of $L$, and we denote $X_t$ as the corresponding diffusion process.
Definition 5.
We define the semigroup $P_t = e^{\frac{1}{2}tL}$, where $L$ is invariant with respect to the invariant measure $d\mu = \pi(x)\,dx$. We denote $P_t f(x) = \mathbb{E}(f(X_t))$, and
\[
\mathbb{E}(f(X_t)) = \int_{\mathbb{R}^{n+m}} f(y)\,p(t,x,y)\,d\mu(y) = \int_{\mathbb{R}^{n+m}} f(y)\,\rho(t,x,y)\,dy,
\]
where the infinitesimal generator of this process $X_t$ is $\frac{1}{2}L$, and we denote $\rho(t,\cdot,\cdot)$ as the product of the transition kernel $p(t,\cdot,\cdot)$ and the volume measure $\pi$.
Remark 14.
Following the standard treatment as in [2] (Section 5), whenever we consider the differentiating operation on $P_t f$, we shall always consider $P_t f_{\varepsilon}$ first with $f_{\varepsilon} = f + \varepsilon$, for $\varepsilon > 0$. Then, we take the limit as $\varepsilon \to 0$. Throughout this section, we directly use $P_t f$ instead of $P_t f_{\varepsilon}$ for convenience.
Remark 15.
In the standard sub-Riemannian setting, the semi-groups are in general defined with respect to the invariant measure d μ ( y ) . In this paper, we formulate the semi-group and the transition kernel with respect to the Lebesgue measure d y .
Following the framework in [2], we also need the following assumption, which is necessary to rigorously justify the computations on functionals of the heat semigroup.
Assumption 2.
The semigroup $P_t$ is stochastically complete, that is, for $t \ge 0$, $P_t 1 = 1$, and for any $T > 0$ and $f \in C(\mathbb{R}^{n+m})$ with compact support, we assume that
\[
\sup_{t\in[0,T]} \|\Gamma(P_t f)\|_{\infty} + \|\Gamma_1^{z}(P_t f)\|_{\infty} < +\infty.
\]
We believe that the above Assumption 2 should follow from the assumption $\mathfrak{R} \succeq \kappa(\Gamma_1 + \Gamma_1^{z})$ if we assume the appropriate lower bound $\kappa$. We leave this for further studies. Related gradient estimates are presented in order below. For the infinitesimal generator $\frac{1}{2}L$ associated with the linear semi-group $P_t$, we have the following property.
Proposition 15.
For all smooth functions $f$, we have:
  • $P_0 = \mathrm{Id}$;
  • For all functions $f \in C_b(\mathbb{R}^{n+m})$, the map $t \mapsto P_t f$ is continuous from $\mathbb{R}_{+}$ to $L^{2}(d\mu)$;
  • For all $s,t \ge 0$, one has $P_t P_s = P_{t+s}$;
  • $\forall x \in \mathbb{R}^{n+m}$, $t \ge 0$, $\frac{\partial}{\partial t}P_t f(x) = \frac{1}{2}L(P_t f)(x) = \frac{1}{2}P_t(Lf)(x)$.
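The four properties above can be illustrated on a finite-state analogue. The following is a minimal sketch, assuming a two-state Markov chain with hypothetical jump rates `ALPHA` and `BETA` in place of the degenerate diffusion; it is not the semigroup studied in this paper. For the generator $L$ of such a chain, $P_t = e^{\frac{1}{2}tL}$ has a closed form, so the identities can be verified directly.

```python
import math

# Hypothetical jump rates of a two-state chain (illustrative only).
ALPHA, BETA = 1.0, 2.0


def generator():
    """Generator L of the two-state chain; each row sums to zero."""
    return [[-ALPHA, ALPHA], [BETA, -BETA]]


def semigroup(t):
    """Closed form of P_t = exp(t L / 2) for the two-state generator L."""
    total = ALPHA + BETA
    pi = (BETA / total, ALPHA / total)      # invariant distribution of L
    decay = math.exp(-0.5 * total * t)      # the only non-trivial eigen-mode
    return [[pi[0] + decay * (1.0 - pi[0]), pi[1] - decay * pi[1]],
            [pi[0] - decay * pi[0], pi[1] + decay * (1.0 - pi[1])]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(mat, f):
    """Action of a 2x2 matrix on a function f given as a list [f(0), f(1)]."""
    return [mat[i][0] * f[0] + mat[i][1] * f[1] for i in range(2)]


def time_derivative(f, t, eps=1e-5):
    """Central finite difference of t -> P_t f."""
    plus, minus = apply(semigroup(t + eps), f), apply(semigroup(t - eps), f)
    return [(p - q) / (2.0 * eps) for p, q in zip(plus, minus)]
```

In this toy setting, `semigroup(0.0)` is the identity, `matmul(semigroup(s), semigroup(t))` agrees with `semigroup(s + t)`, each row of `semigroup(t)` sums to one (the discrete counterpart of $P_t 1 = 1$), and `time_derivative` matches both $\frac{1}{2}L(P_t f)$ and $\frac{1}{2}P_t(Lf)$ up to discretization error.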
Next, we present the entropic inequality under Assumption 1. We follow closely the framework introduced in [2] and define the following two functionals:
\[
\phi_a(x,t) = P_{T-t}f\,\Gamma_1(\log P_{T-t}f)(x), \quad\text{and}\quad \phi_z(x,t) = P_{T-t}f\,\Gamma_1^{z}(\log P_{T-t}f)(x).
\]
Lemma 20.
We have the following relation:
\[
\frac{1}{2}L\phi_a + \frac{\partial}{\partial t}\phi_a = (P_{T-t}f)(x)\,\Gamma_2(\log P_{T-t}f, \log P_{T-t}f)(x),
\]
\[
\frac{1}{2}L\phi_z + \frac{\partial}{\partial t}\phi_z = (P_{T-t}f)(x)\,\Gamma_2^{z}(\log P_{T-t}f, \log P_{T-t}f)(x) + (P_{T-t}f)(x)\,\Gamma_1\big(\log P_{T-t}f, \Gamma_1^{z}(P_{T-t}f, P_{T-t}f)\big)(x) - (P_{T-t}f)(x)\,\Gamma_1^{z}\big(\log P_{T-t}f, \Gamma_1(P_{T-t}f, P_{T-t}f)\big)(x).
\]
Proof. 
Denote $g(t,x) = P_{T-t}f(x) = \int \rho(t,x,\tilde x)\,f(\tilde x)\,d\tilde x$, and we have the following relation:
\[
L(\log g) = -\frac{\Gamma_1(g,g)}{(g)^{2}} - \frac{2\,\partial_t g}{g}.
\]
By direct computation, one obtains
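\[
\partial_t\phi_a = \partial_t g\,\Gamma_1(\log g,\log g) + 2g\Big\langle a^{T}\nabla\log g,\, a^{T}\nabla\Big(\frac{\partial_t g}{g}\Big)\Big\rangle_{\mathbb{R}^{n}} = -\frac{1}{2}Lg\,\Gamma_1(\log g,\log g) - g\,\Gamma_1(\log g, L\log g) - g\,\Gamma_1\big(\log g, \Gamma_1(\log g,\log g)\big),
\]
\[
\frac{1}{2}L\phi_a = \frac{1}{2}Lg\,\Gamma_1(\log g,\log g) + \frac{1}{2}g\,L\Gamma_1(\log g,\log g) + \Gamma_1\big(g, \Gamma_1(\log g,\log g)\big),
\]
where we have $\Gamma_1(g, \Gamma_1(\log g,\log g)) = g\,\Gamma_1(\log g, \Gamma_1(\log g,\log g))$; thus, (66) is proven. Similarly, we obtain the following for $\phi_z$:
\[
\partial_t\phi_z = \partial_t g\,\Gamma_1^{z}(\log g,\log g) + 2g\Big\langle z^{T}\nabla\log g,\, z^{T}\nabla\Big(\frac{\partial_t g}{g}\Big)\Big\rangle_{\mathbb{R}^{m}} = -\frac{1}{2}Lg\,\Gamma_1^{z}(\log g,\log g) - g\,\Gamma_1^{z}(\log g, L\log g) - g\,\Gamma_1^{z}\big(\log g, \Gamma_1(\log g,\log g)\big),
\]
\[
\frac{1}{2}L\phi_z = \frac{1}{2}Lg\,\Gamma_1^{z}(\log g,\log g) + \frac{1}{2}g\,L\Gamma_1^{z}(\log g,\log g) + \Gamma_1\big(g, \Gamma_1^{z}(\log g,\log g)\big).
\]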
The proof then follows. □
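The relation for $L(\log g)$ used at the start of this proof can be derived in two lines. The sketch below assumes that $L$ is a diffusion operator without zeroth-order term whose carré du champ is $\Gamma_1$, so that the chain rule $L\varphi(g) = \varphi'(g)\,Lg + \varphi''(g)\,\Gamma_1(g,g)$ holds, and it uses the last item of Proposition 15 for the time derivative.

```latex
% Chain rule with \varphi = \log, valid for g > 0.
\[
L(\log g) = \frac{Lg}{g} - \frac{\Gamma_1(g,g)}{g^{2}} .
\]
% Time derivative of g(t, \cdot) = P_{T-t} f, from \partial_t P_t f = \tfrac12 L P_t f.
\[
\partial_t g = \partial_t\big(P_{T-t}f\big) = -\frac{1}{2}L\,P_{T-t}f = -\frac{1}{2}Lg
\quad\Longrightarrow\quad
Lg = -2\,\partial_t g .
\]
% Substituting the second line into the first.
\[
L(\log g) = -\frac{\Gamma_1(g,g)}{g^{2}} - \frac{2\,\partial_t g}{g},
\qquad\text{equivalently}\qquad
\frac{\partial_t g}{g} = -\frac{1}{2}\Big(L\log g + \Gamma_1(\log g,\log g)\Big).
\]
```

The second form is the one substituted into $2g\,\Gamma_1(\log g, \partial_t g/g)$ and $2g\,\Gamma_1^{z}(\log g, \partial_t g/g)$ when computing $\partial_t\phi_a$ and $\partial_t\phi_z$.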
Now, we are ready to present the following important lemma, which prepares us to prove the new entropy inequality without the assumption:
\[
\Gamma_1\big(\log P_{T-t}f, \Gamma_1^{z}(P_{T-t}f, P_{T-t}f)\big)(x) = \Gamma_1^{z}\big(\log P_{T-t}f, \Gamma_1(P_{T-t}f, P_{T-t}f)\big)(x).
\]
Lemma 21.
For any $0 < s < T$, we denote $\rho(s,x,y) = p(s,x,y)\,\pi(y)$ as the transition kernel of the diffusion process $X_s^{x}$ starting at $x$ defined in Definition 5, and the following equality is satisfied:
E [ g Γ 1 ( log g , Γ 1 z ( log g , log g ) ) g Γ 1 z ( log g , Γ 1 ( log g , log g ) ) ] = · ( ρ ( s , x , y ) z z T Γ ( a a T ) ( log g ( s , y ) , log g ( s , y ) ) ) ρ ( s , x , y ) g ( s , y ) ρ ( s , x , y ) d y · ( ρ ( s , x , y ) a a T Γ ( z z T ) ( log g ( s , y ) , log g ( s , y ) ) ) ρ ( s , x , y ) g ( s , y ) ρ ( s , x , y ) d y .
Here, we denote g ( s , y ) = P T s f ( y ) = ρ ( s , y , y ˜ ) f ( y ˜ ) d y ˜ and
E [ g Γ 1 ( log g , Γ 1 z ( log g , log g ) ) ] = E [ g ( s , X s ) Γ 1 ( log g ( s , X s x ) , Γ 1 z ( log g ( s , X s x ) , log g ( s , X s x ) ) ) ] = g ( s , y ) Γ 1 ( log g ( s , y ) , Γ 1 z ( log g ( s , y ) , log g ( s , y ) ) ) ρ ( s , x , y ) d y .
Proof. 
We first expand in the following integral form.
E [ g Γ 1 ( log g , Γ 1 z ( log g , log g ) ) g Γ 1 z ( log g , Γ 1 ( log g , log g ) ) ] = g ( s , y ) Γ 1 ( log g ( s , y ) , Γ 1 z ( log g ( s , y ) , log g ( s , y ) ) ) ρ ( s , x , y ) d y g ( s , y ) Γ 1 z ( log g ( s , y ) , Γ 1 ( log g ( s , y ) , log g ( s , y ) ) ) ρ ( s , x , y ) d y .
We suppress $x$, $y$, and $s$ for simplicity and set $h = \log g$.
Claim 1:
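\[
\int\Gamma_1\big(h, \Gamma_1^{z}(h,h)\big)\rho g\,dy - \int\Gamma_1^{z}\big(h, \Gamma_1(h,h)\big)\rho g\,dy = \int\Gamma_1^{z}(h, \Delta_a h)\,\rho g\,dy - \int\Gamma_1^{z}\Big(h, \frac{\Delta_a g}{g}\Big)\rho g\,dy - \int\Gamma_1(h, \Delta_z h)\,\rho g\,dy + \int\Gamma_1\Big(h, \frac{\Delta_z g}{g}\Big)\rho g\,dy.
\]
Recall that we denote $\Delta_a = \nabla\cdot(a\,a^{T}\nabla)$ and $\Delta_z = \nabla\cdot(z\,z^{T}\nabla)$. Use the following identity:
\[
\Delta_a h = \frac{\Delta_a g}{g} - \frac{\Gamma_1(g,g)}{g^{2}}, \quad\text{and}\quad \Delta_z h = \frac{\Delta_z g}{g} - \frac{\Gamma_1^{z}(g,g)}{g^{2}}.
\]
We then obtain
\[
\int\Gamma_1^{z}(h, \Delta_a h)\,\rho g\,dy = \int\Gamma_1^{z}\Big(h, \frac{\Delta_a g}{g} - \frac{\Gamma_1(g,g)}{g^{2}}\Big)\rho g\,dy = -\int\Gamma_1^{z}\big(h, \Gamma_1(h,h)\big)\rho g\,dy + \int\Gamma_1^{z}\Big(h, \frac{\Delta_a g}{g}\Big)\rho g\,dy.
\]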
Similarly, the other equality is satisfied.
Claim 2:
Γ 1 z ( h , Δ a h ) ρ g d y Γ 1 z ( h , Δ a g g ) ρ g d y Γ 1 ( h , Δ z h ) ρ g d y + Γ 1 ( h , Δ z g g ) ρ g d y = · ( ρ z z T Γ ( a a T ) ( h , h ) ) ρ g ρ d y · ( ρ a a T Γ ( z z T ) ( h , h ) ) ρ g ρ d y .
First, observe that
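\[
\int\Gamma_1^{z}\Big(h, \frac{\Delta_a g}{g}\Big)\rho g\,dy = \int\Big\langle z\,z^{T}\nabla h,\, \nabla\Big(\frac{\Delta_a g}{g}\Big)\Big\rangle\rho g\,dy = -\int\nabla\cdot(\rho\,z\,z^{T}\nabla g)\,\frac{\Delta_a g}{g}\,dy = -\int\frac{\rho}{g}\,\Delta_a g\,\Delta_z g\,dy - \int\langle\nabla\rho, z\,z^{T}\nabla g\rangle\,\frac{\Delta_a g}{g}\,dy.
\]
Similarly, one obtains
\[
\int\Gamma_1\Big(h, \frac{\Delta_z g}{g}\Big)\rho g\,dy = -\int\frac{\rho}{g}\,\Delta_a g\,\Delta_z g\,dy - \int\langle\nabla\rho, a\,a^{T}\nabla g\rangle\,\frac{\Delta_z g}{g}\,dy.
\]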
For the next term, one obtains
Γ 1 z ( h , Δ a h ) ρ g d y = ( · ( a a T h ) ) , z z T h ρ g d y = [ · ( a a T h ) ] [ · ( ρ g z z T h ) ] d y = [ · ( a a T 1 g g ) ] · ( ρ z z T g ) d y = 1 g , a a T g + 1 g Δ a g · ( ρ z z T g ) d y = 1 g 2 g , a a T g ( · ( ρ z z T g ) ) d y 1 g Δ a g ( · ( ρ z z T g ) ) d y = 2 2 h ( a a T h , z z T h ) ρ g d y h , ( a a T ) h , z z T h ρ g d y 1 g Δ a g ρ , z z T g d y ρ g Δ a g Δ z g d y ,
where the last equality follows from the integration by parts for the first term and the direct expansion of the divergence for the second term. Similarly, we obtain
Γ 1 ( h , Δ z h ) ρ g d y = 2 2 h ( z z T h , a a T h ) ρ g d y h , ( z z T ) h , a a T h ρ g d y 1 g Δ z g ρ , a a T g d y ρ g Δ a g Δ z g d y .
Observing, by integration by parts, we obtain
h , ( a a T ) h , z z T h ρ g d y + h , ( z z T ) h , a a T h ρ g d y = · ( ρ z z T Γ ( a a T ) ( h , h ) ) ρ g ρ d y · ( ρ a a T Γ ( z z T ) ( h , h ) ) ρ g ρ d y .
Combining the above formulas, the proof is completed. □
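The integration-by-parts step that moves the gradient off $\Delta_a g / g$ in the proof above can be sanity-checked numerically. The following is a minimal one-dimensional sketch on the periodic interval $[0, 2\pi)$, where boundary terms vanish automatically. All functions below (`z`, `rho`, `g`, `u`) are hypothetical test data, with `u` standing in for $\Delta_a g / g$. It checks $\int \langle z z^{T}\nabla h, \nabla u\rangle \rho g\,dy = -\int \nabla\cdot(\rho\,z z^{T}\nabla g)\,u\,dy$ with $h = \log g$.

```python
import math


def integrate_periodic(fn, n=2000):
    """Trapezoid rule on [0, 2*pi) for a 2*pi-periodic integrand."""
    step = 2.0 * math.pi / n
    return step * sum(fn(k * step) for k in range(n))


def d(fn, y, eps=1e-4):
    """Central finite-difference derivative of fn at y."""
    return (fn(y + eps) - fn(y - eps)) / (2.0 * eps)


# Hypothetical smooth, 2*pi-periodic test data (not taken from the paper).
def z(y):
    return 1.0 + 0.3 * math.cos(y)      # scalar stand-in for the matrix z


def rho(y):
    return 1.0 + 0.5 * math.sin(y)      # positive, density-like weight


def g(y):
    return 2.0 + math.cos(2.0 * y)      # positive function, h = log g


def u(y):
    return math.sin(y) + 0.2 * math.cos(3.0 * y)   # stand-in for Delta_a g / g


def h(y):
    return math.log(g(y))


def lhs():
    """Integral of <z z^T grad h, grad u> rho g, with h = log g."""
    return integrate_periodic(lambda y: z(y) ** 2 * d(h, y) * d(u, y) * rho(y) * g(y))


def rhs():
    """Minus the integral of div(rho z z^T grad g) times u."""
    def flux(y):
        return rho(y) * z(y) ** 2 * d(g, y)
    return -integrate_periodic(lambda y: d(flux, y) * u(y))
```

The two sides agree up to finite-difference error because $g\,\nabla h = \nabla g$. On $\mathbb{R}^{n+m}$ the same identity needs enough decay of the integrands for the boundary terms to vanish.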
With the above lemma in hand, we are ready to prove the following entropic inequality. We first define the following energy form:
\[
\Phi_a(x,t) = P_t\big(P_{T-t}f\,\Gamma_1(\log P_{T-t}f)\big)(x), \qquad \Phi_z(x,t) = P_t\big(P_{T-t}f\,\Gamma_1^{z}(\log P_{T-t}f)\big)(x).
\]
Recall that we define
\[
\phi_a(x,t) = P_{T-t}f\,\Gamma_1(\log P_{T-t}f)(x), \quad\text{and}\quad \phi_z(x,t) = P_{T-t}f\,\Gamma_1^{z}(\log P_{T-t}f)(x).
\]
Theorem 4.
Denote $\phi = \phi_a + \phi_z$; if the following condition is satisfied:
\[
\mathfrak{R} \succeq \kappa(\Gamma_1 + \Gamma_1^{z}),
\]
we then conclude
\[
P_T(\phi(\cdot,T))(x) \ge \phi(x,0) + \int_0^{T}\kappa_s\big(\Phi_a(x,s) + \Phi_z(x,s)\big)\,ds,
\]
where $\kappa_s$ depends on the estimate of the transition kernel $\log\rho(s,\cdot,\cdot)$ associated with semi-group $P_s$ (see Definition 5).
Remark 16.
Based on Theorem 3, we can also prove the above theorem for operator L ˜ with the drift term involved. Since the proof is similar, we skip the proof here.
Proof. 
Take $\phi = \phi_a + \phi_z$. Let $(X_t^x)_{t\ge 0}$ be the diffusion Markov process with semigroup $P_t$. (Similar proofs can be found in [2] (Proposition 4.5).) Let the smooth function $u:\mathbb{R}^{n+m}\to\mathbb{R}$ be such that, for every $T>0$, $\sup_{t\in[0,T]}\|u(t,\cdot)\|_\infty < \infty$ and $\sup_{t\in[0,T]}\|\tfrac{1}{2}Lu(t,\cdot) + \partial_t u(t,\cdot)\|_\infty < \infty$. We have, for every $t>0$,
$$u(t,X_t^x) = u(0,x) + \int_0^t \big(\tfrac{1}{2}Lu + \partial_s u\big)(s,X_s^x)\,ds + M_t,$$
where $(M_t)_{t\ge 0}$ is a local martingale. Let $T_n$, $n\in\mathbb{N}$, be an increasing sequence of stopping times such that, almost surely, $T_n\to\infty$ and $(M_{t\wedge T_n})_{t\ge 0}$ is a martingale. We obtain
$$\mathbb{E}\big[u(t\wedge T_n, X_{t\wedge T_n}^x)\big] = u(0,x) + \mathbb{E}\Big[\int_0^{t\wedge T_n} \big(\tfrac{1}{2}Lu + \partial_s u\big)(s,X_s^x)\,ds\Big].$$
By using the dominated convergence theorem, we obtain
$$\mathbb{E}\big[u(t,X_t^x)\big] = u(0,x) + \mathbb{E}\Big[\int_0^{t} \big(\tfrac{1}{2}Lu + \partial_s u\big)(s,X_s^x)\,ds\Big].$$
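Applying the above equality to $\phi$ with $t = T$, we obtain
$$\mathbb{E}\big[\phi(T,X_T^x)\big] = \phi(0,x) + \mathbb{E}\Big[\int_0^{T} \big(\tfrac{1}{2}L\phi + \partial_s\phi\big)(s,X_s^x)\,ds\Big] = \phi(0,x) + \int_0^{T} \mathbb{E}\Big[\big(\tfrac{1}{2}L\phi + \partial_s\phi\big)(s,X_s^x)\Big]\,ds.$$
We now look at the term $\mathbb{E}[(\tfrac{1}{2}L\phi + \partial_s\phi)(s,X_s^x)]$ with $g(s,x) = (P_{T-s}f)(x) = \mathbb{E}[f(X_t^x)] = \int \rho(x,y,s)f(y)\,dy$:
$$
\begin{aligned}
\mathbb{E}\Big[\big(\tfrac{1}{2}L\phi + \partial_s\phi\big)(s,X_s^x)\Big]
&= \mathbb{E}\big[g\,\Gamma_2(\log g,\log g) + g\,\Gamma_2^z(\log g,\log g)\big] \\
&\quad + \mathbb{E}\big[g\,\Gamma_1(\log g,\Gamma_1^z(\log g,\log g)) - g\,\Gamma_1^z(\log g,\Gamma_1(\log g,\log g))\big].
\end{aligned}
$$
Letting $h = \log g$ and using Lemma 21 above, we obtain
$$
\begin{aligned}
\mathbb{E}\Big[\big(\tfrac{1}{2}L\phi + \partial_s\phi\big)(s,X_s^x)\Big]
&= \int g\rho\,\Big[\Gamma_2(h,h) + \Gamma_2^z(h,h) + \frac{\nabla\cdot(\rho\, zz^{T}\,\Gamma_{\nabla(aa^{T})}(h,h))}{\rho} - \frac{\nabla\cdot(\rho\, aa^{T}\,\Gamma_{\nabla(zz^{T})}(h,h))}{\rho}\Big]\,dy \\
&= \int g\rho\,\Big[\Gamma_2(h,h) + \tilde{\Gamma}_2^{z,\rho}(h,h)\Big]\,dy.
\end{aligned}
$$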
Applying Theorem 3 here with $\pi = \rho(s,\cdot,\cdot)$ as the transition kernel function, we obtain a time-dependent version of Theorem 3. Assume that the following bound is satisfied, where the bound $\kappa_s$ depends on the kernel $\rho(s,\cdot,\cdot)$:
$$\mathfrak{R}(f,f) \ge \kappa_s\,\big(\Gamma_1(f,f) + \Gamma_1^z(f,f)\big).$$
We then conclude with the following bound:
$$
\begin{aligned}
\mathbb{E}\Big[\big(\tfrac{1}{2}L\phi + \partial_s\phi\big)(s,X_s^x)\Big]
&\ge \int \rho(s,x,y)\, g\,\kappa_s\,\big(\Gamma_1(h,h)(y) + \Gamma_1^z(h,h)(y)\big)\,dy \\
&= \int p(s,x,y)\, g\,\kappa_s\,\big(\Gamma_1(h,h)(y) + \Gamma_1^z(h,h)(y)\big)\,\pi(y)\,dy \\
&\ge P_s\big(\kappa_s\, g\,(\Gamma_1(\log g,\log g) + \Gamma_1^z(\log g,\log g))\big).
\end{aligned}
$$
Plugging this into the time integral $\int_0^T \mathbb{E}[(\tfrac{1}{2}L\phi + \partial_s\phi)(s,X_s^x)]\,ds$, the proof follows. □
Remark 17.
We prove the entropic inequality of Theorem 4 in this section without the assumption $\Gamma_1(f,\Gamma_1^z(f,f)) = \Gamma_1^z(f,\Gamma_1(f,f))$. A similar entropic inequality under this assumption was first proven in [2] (Proposition 4.5 and Theorem 5.2). With the new inequality of Theorem 4 in hand, gradient estimates and other inequalities similar to those in [2] follow; we leave them for future studies. Proposition 4.5 in [2] is based on a pointwise estimate that relies on the commutativity assumption on $\Gamma_1$ and $\Gamma_1^z$. We removed this assumption, and our estimate holds in a weak form, which is presented in Lemma 21 above.

Author Contributions

Conceptualization, Q.F. and W.L.; methodology, Q.F. and W.L.; writing—original draft preparation, Q.F. and W.L.; writing—review and editing, Q.F. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

Wuchen Li is supported by AFOSR MURI FA9550-18-1-0502, the AFOSR YIP award: FA9550-23-1-0087, and NSF RTG: 2038080.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Degenerate SDEs and Sub-Riemannian Manifold

In this appendix, we briefly illustrate the formulation of the degenerate diffusion process and sub-Riemannian geometry.
For a smooth connected $(n+m)$-dimensional Riemannian manifold $M^{n+m}$, we denote by $TM^{n+m}$ the tangent bundle of $M^{n+m}$ and by $\tau$ a sub-bundle of $TM^{n+m}$. The sub-Riemannian structure associated with the sub-bundle $\tau$ on $M^{n+m}$ is denoted as $(\tau, g_\tau)$, where $g_\tau(\cdot,\cdot)$ is the metric associated with the sub-bundle $\tau$. In particular, if we take the distribution $\tau$ to be the horizontal sub-bundle, denoted as $\mathcal{H}$, of the tangent bundle $TM^{n+m}$ (see [2,51] for more details), then we denote the sub-Riemannian structure as $(M^{n+m}, \mathcal{H}, g_{\mathcal{H}})$. In this paper, we do not distinguish between the distributions $\tau$ and $\mathcal{H}$, and we call both the horizontal sub-bundle. We assume that the horizontal distribution $\mathcal{H}$ is bracket-generating (of any step). The distribution $\mathcal{H}$ has dimension $n$.
For a vector field $b\in\mathbb{R}^{n+m}$ and a general matrix $a\in\mathbb{R}^{(n+m)\times n}$, we write $a = (a_1, a_2, \dots, a_n)$, where each $a_i$, $i = 1,\dots,n$, is an $(n+m)$-dimensional column vector. Consider the Stratonovich SDE
$$dX_t = b(X_t)\,dt + \sqrt{2}\sum_{i=1}^{n} a_i(X_t)\circ dB_t^i, \tag{A1}$$
where $(B_t^1, B_t^2, \dots, B_t^n)$ is an $n$-dimensional Brownian motion in $\mathbb{R}^n$ and $a_i$ has local coordinates $a_i(x) = \sum_{\hat{i}=1}^{n+m} a_{\hat{i} i}(x)\frac{\partial}{\partial x_{\hat{i}}}$. We consider (A1) as the SDE associated with a given sub-Riemannian structure, which is defined through the Lie algebra spanned by the driving vector fields $\{a_1, a_2, \dots, a_n\}$ of the SDE. In general, we assume that $\mathcal{H} := \{a_1, a_2, \dots, a_n\}$ is of rank $n$ and satisfies the bracket-generating condition (or Hörmander condition). To be precise, for any $x\in M^{n+m}$, the Lie brackets of $\{a_1(x), a_2(x), \dots, a_n(x)\}$ span the whole tangent space at $x$, which has dimension $n+m$. We define the manifold $M^{n+m}$ as the subspace of $\mathbb{R}^{n+m}$ on which the diffusion process $X_t$ lives. This space is described by the triple $(M^{n+m}, \mathcal{H}, g_{\mathcal{H}})$, and we denote by $\mathcal{H}$ the $n$-dimensional horizontal distribution of the tangent bundle $TM^{n+m}$ generated by the vector fields $\{a_1(x), a_2(x), \dots, a_n(x)\}$. In this paper, we consider the case where the generator of the diffusion process (A1) coincides with the horizontal Laplacian operator (or sub-Laplacian operator) associated with the sub-Riemannian structure $(M^{n+m}, \mathcal{H}, g_{\mathcal{H}})$. Furthermore, we assume that there exists a symmetric and invariant volume measure associated with the horizontal Laplacian operator. The Stratonovich SDE (A1) without the drift ($dt$) term can be treated as a special case, where the horizontal Laplacian can be written as the sum of squares of the horizontal vector fields in $\mathcal{H}$. In particular, we consider the precise metric defined through the diffusion matrix $a$, which can be seen as an analogue of the metric for non-degenerate SDEs on Riemannian manifolds. The problem is that the rank of $aa^{T}$ is $n$; thus, the $(n+m)\times(n+m)$ matrix $aa^{T}$ is degenerate and cannot serve as a metric. We thus introduce the following metric, which formulates this sub-Riemannian structure in Euclidean space.
Definition A1.
Consider an orthonormal basis $c = \{c_{n+1}(x), \dots, c_{n+m}(x)\}$ in $\mathbb{R}^{n+m}$, such that $a_i^{T}c_j = 0$ for any $1\le i\le n$, $n+1\le j\le m+n$. We define a metric $g = (aa^{T} + cc^{T})^{-1} = (aa^{T})^{\dagger} + cc^{T}$ and a metric on the horizontal sub-bundle $g_\tau = (aa^{T})^{\dagger}$, the pseudo-inverse of the matrix $aa^{T}$, on the manifold $M^{n+m}$.
The above definition is based on the following lemma.
Lemma A1.
The metric satisfies $g = (aa^{T} + cc^{T})^{-1} = (aa^{T})^{\dagger} + cc^{T}$.
Proof. 
For the rank-$n$ matrix $aa^{T}$, we write its eigenvalue decomposition and the corresponding pseudo-inverse $(aa^{T})^{\dagger}$ as
$$aa^{T} = \sum_{i=1}^{n}\lambda_i V_i V_i^{T}, \qquad (aa^{T})^{\dagger} = \sum_{i=1}^{n}\frac{1}{\lambda_i} V_i V_i^{T}.$$
Thus, we have $aa^{T} + cc^{T} = \sum_{i=1}^{n}\lambda_i V_i V_i^{T} + \sum_{j=n+1}^{n+m} c_j c_j^{T}$. Furthermore, we have
$$
\begin{aligned}
aa^{T} + cc^{T} &= (V_1,\dots,V_n,c_{n+1},\dots,c_{n+m})\begin{pmatrix}\Lambda_n & \\ & I_m\end{pmatrix}(V_1,\dots,V_n,c_{n+1},\dots,c_{n+m})^{T}, \\
(aa^{T} + cc^{T})^{-1} &= (V_1,\dots,V_n,c_{n+1},\dots,c_{n+m})\begin{pmatrix}\Lambda_n^{-1} & \\ & I_m\end{pmatrix}(V_1,\dots,V_n,c_{n+1},\dots,c_{n+m})^{T},
\end{aligned}
$$
where $\Lambda_n = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$ is the diagonal matrix of the eigenvalues $\lambda_i$ and $I_m$ is the $m$-dimensional identity matrix. Thus, the proof follows directly with
$$(aa^{T} + cc^{T})^{-1} = \sum_{i=1}^{n}\frac{1}{\lambda_i} V_i V_i^{T} + \sum_{j=n+1}^{n+m} c_j c_j^{T} = (aa^{T})^{\dagger} + cc^{T}.$$
With the new metric introduced above, we have the following lemma.
Lemma A2.
The vectors $\{a_1, \dots, a_n\}$ are orthonormal under the metric $g = (aa^{T})^{\dagger} + cc^{T}$.
Proof. 
It suffices to prove that, for $a = (a_1, \dots, a_n)$ with each $a_i$ an $(n+m)\times 1$ column vector,
$$a^{T} g\, a = a^{T}(aa^{T})^{\dagger} a = \mathrm{Id}_{n\times n}.$$
Since $a_i^{T}c_j = 0$, the term $cc^{T}$ does not contribute, and we only need to prove $a^{T}(aa^{T})^{\dagger} a = \mathrm{Id}_{n\times n}$. Let us denote $a^{T}(aa^{T})^{\dagger} a = B$; then we have
$$
\begin{aligned}
aa^{T}(aa^{T})^{\dagger} aa^{T} = aBa^{T}
&\;\Longrightarrow\; aa^{T} = aBa^{T} \\
&\;\Longrightarrow\; a^{T}a\,a^{T}a = a^{T}a\,B\,a^{T}a \\
&\;\Longrightarrow\; (a^{T}a)^{-1}a^{T}a\,a^{T}a\,(a^{T}a)^{-1} = B \\
&\;\Longrightarrow\; \mathrm{Id}_{n\times n} = B,
\end{aligned}
$$
where the second equality follows from the defining property of the pseudo-inverse matrix, and the last step follows from the fact that $a^{T}a$ is a non-degenerate $n\times n$ matrix and hence invertible. The proof then follows directly. □
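Lemmas A1 and A2 can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy is available, uses an arbitrary random $5\times 2$ matrix as `a`, and the helper names `pseudo_inverse_rank_n` and `complement_basis` are hypothetical. The pseudo-inverse is built from the eigen-decomposition exactly as in the proof of Lemma A1.

```python
import numpy as np

def pseudo_inverse_rank_n(s, n):
    """Pseudo-inverse of a symmetric PSD matrix of rank n via its eigen-decomposition."""
    lam, v = np.linalg.eigh(s)        # eigenvalues in ascending order
    lam, v = lam[-n:], v[:, -n:]      # keep the n nonzero eigenpairs
    return (v / lam) @ v.T            # sum_i (1 / lam_i) V_i V_i^T

def complement_basis(a):
    """Orthonormal vectors c_j with a_i^T c_j = 0 (a is assumed to have full column rank)."""
    n = a.shape[1]
    q, _ = np.linalg.qr(a, mode="complete")
    return q[:, n:]                   # last m columns of the full Q factor

rng = np.random.default_rng(0)        # arbitrary test matrix with n = 2, m = 3
a = rng.standard_normal((5, 2))
c = complement_basis(a)
g = pseudo_inverse_rank_n(a @ a.T, 2) + c @ c.T

# Lemma A1: (a a^T)^dagger + c c^T should equal (a a^T + c c^T)^{-1}.
lemma_a1_gap = np.abs(g - np.linalg.inv(a @ a.T + c @ c.T)).max()
# Lemma A2: a^T g a should be the n x n identity.
lemma_a2_gap = np.abs(a.T @ g @ a - np.eye(2)).max()
print(lemma_a1_gap, lemma_a2_gap)
```

Both gaps should be at the level of floating-point round-off for any full-column-rank `a`.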
We are now ready to introduce the following definition.
Definition A2.
Define $(M^{n+m}, \tau, g_\tau)$ as the sub-Riemannian structure associated with the degenerate SDE (A1), where $g_\tau = (aa^{T})^{\dagger}$ denotes the horizontal metric, i.e., the metric $g$ restricted to the horizontal bundle $\tau$. We denote by $\nabla^{R}$ the Levi-Civita connection on $M^{n+m}$ associated with our metric $g = (aa^{T})^{\dagger} + cc^{T}$, and we let $P_\tau\nabla^{R}$ be the projection of the connection on the horizontal distribution $\tau$. In particular, in our framework, we have $P_\tau\nabla^{R} f = aa^{T}\nabla f$ for any function $f: M^{n+m}\to\mathbb{R}$, where $\nabla$ is the Euclidean gradient in $\mathbb{R}^{n+m}$.
Remark A1.
In Lemma A2, we show that $\{a_1, a_2, \dots, a_n\}$ is an orthonormal basis for the horizontal distribution $\tau$ under our metric $g$. In particular, we have
$$aa^{T}\nabla f = (a_1, \dots, a_n)\begin{pmatrix} a_1^{T} \\ \vdots \\ a_n^{T}\end{pmatrix}\nabla f = \sum_{i=1}^{n} (a_i f)\, a_i \in \tau,$$
which gives the local representation of $P_\tau\nabla^{R} f$.
To demonstrate the definition clearly, we give the following example. On the Heisenberg group $\mathbb{H}^1$, we know that $X = \frac{\partial}{\partial x_1} - \frac{1}{2}x_2\frac{\partial}{\partial x_3}$, $Y = \frac{\partial}{\partial x_2} + \frac{1}{2}x_1\frac{\partial}{\partial x_3}$, and $Z = \frac{\partial}{\partial x_3}$ form an orthonormal basis for the tangent bundle of $\mathbb{H}^1$. In particular, $X$ and $Y$ generate the horizontal distribution $\tau$. If we start with the following SDE:
$$dW_t = X\circ dB_t^1 + Y\circ dB_t^2, \tag{A2}$$
then we know $W_t = \big(B_t^1, B_t^2, \frac{1}{2}\int_0^t B_s^1\,dB_s^2 - B_s^2\,dB_s^1\big)$, which is the horizontal Brownian motion on the Heisenberg group $\mathbb{H}^1$. The generator of the horizontal Brownian motion and the sub-Laplacian operator are the same, given by $\Delta_{\mathcal{H}} = X^2 + Y^2$, and the volume measure associated with $\Delta_{\mathcal{H}}$ is the Lebesgue measure on the Heisenberg group with the volume element equal to 1. Then, $W_t$ is a diffusion process in $\mathbb{R}^3$. In terms of our general sub-Riemannian structure introduced above, we can define
$$a = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -\frac{x_2}{2} & \frac{x_1}{2}\end{pmatrix} = (a_1, a_2) = (X, Y), \qquad c = \frac{1}{\sqrt{\frac{x_1^2}{4} + \frac{x_2^2}{4} + 1}}\begin{pmatrix} \frac{x_2}{2} \\ -\frac{x_1}{2} \\ 1\end{pmatrix},$$
and
$$g_{\mathbb{H}^1,\tau} = (aa^{T})^{\dagger} = \begin{pmatrix} 1 & 0 & -\frac{x_2}{2} \\ 0 & 1 & \frac{x_1}{2} \\ -\frac{x_2}{2} & \frac{x_1}{2} & \frac{x_1^2 + x_2^2}{4}\end{pmatrix}^{\dagger}, \qquad g_{\mathbb{H}^1} = (aa^{T})^{\dagger} + cc^{T}.$$
In particular, the horizontal gradient is given by
$$aa^{T}\nabla f = \begin{pmatrix} Xf \\ Yf \\ -\frac{x_2}{2}Xf + \frac{x_1}{2}Yf\end{pmatrix} = Xf\begin{pmatrix} 1 \\ 0 \\ -\frac{x_2}{2}\end{pmatrix} + Yf\begin{pmatrix} 0 \\ 1 \\ \frac{x_1}{2}\end{pmatrix} = (Xf)X + (Yf)Y.$$
Thus, the sub-Riemannian structure associated with the Stratonovich SDE (A2) is just $(\mathbb{H}^1, \tau, g_{\mathbb{H}^1,\tau})$, where $g_{\mathbb{H}^1,\tau}$ is the restriction of the metric $g_{\mathbb{H}^1}$ to the horizontal sub-bundle $\tau$. In contrast to the standard construction of Brownian motion on a given Riemannian (sub-Riemannian) manifold by Eells–Elworthy–Malliavin [40,43], we can directly define our diffusion on the manifold $M^{n+m}$ by (A1) without performing a projection from the orthonormal frame bundle. This is because, under the new metric $g = (aa^{T})^{\dagger} + cc^{T}$, the vectors $\{a_1, a_2, \dots, a_n\}$ form a globally defined orthonormal basis of the (horizontal) sub-bundle of the tangent bundle $TM^{n+m}$. Essentially, we first define (A1) in $\mathbb{R}^{n+m}$ and then introduce the associated sub-Riemannian structure.
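The horizontal Brownian motion on the Heisenberg group described above can be simulated directly from the vector fields $X$ and $Y$. The sketch below is illustrative only: it uses a plain Euler-Maruyama step, which is an Itô discretisation (for these particular fields $\nabla_X X = \nabla_Y Y = 0$, so the Itô correction to the Stratonovich form vanishes), and the function name `simulate_horizontal_bm`, the step count, and the seed are arbitrary choices. It compares the simulated third coordinate with the discretised Lévy area $\frac{1}{2}\sum (B^1\,\Delta B^2 - B^2\,\Delta B^1)$.

```python
import math
import random

def simulate_horizontal_bm(n_steps=1000, t_final=1.0, seed=0):
    """Euler-Maruyama path of dW = X dB^1 + Y dB^2 started at the origin.

    Returns the final state W_T and, for comparison, the triple
    (B^1_T, B^2_T, discretised Levy area) built from the same increments.
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    w = [0.0, 0.0, 0.0]
    b1 = b2 = area = 0.0
    for _ in range(n_steps):
        db1 = rng.gauss(0.0, math.sqrt(dt))
        db2 = rng.gauss(0.0, math.sqrt(dt))
        # Levy area increment, evaluated at the left endpoint of the step.
        area += 0.5 * (b1 * db2 - b2 * db1)
        # X = (1, 0, -x2/2) and Y = (0, 1, x1/2) evaluated at the current state.
        x1, x2 = w[0], w[1]
        w = [w[0] + db1, w[1] + db2, w[2] - 0.5 * x2 * db1 + 0.5 * x1 * db2]
        b1 += db1
        b2 += db2
    return w, (b1, b2, area)

w_final, reference = simulate_horizontal_bm()
print(w_final, reference)
```

The two triples agree up to round-off, which reflects the explicit formula for $W_t$ at the level of the discretisation.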
Remark A2.
In the definition of the horizontal Brownian motion introduced in [38], the sub-Riemannian structure comes first, together with a totally geodesic Riemannian foliation structure, and the SDE (A2) is then defined on the given totally geodesic Riemannian foliation. In the current setting, we instead define the degenerate diffusion process directly from a given matrix $a$, and then we define the sub-Riemannian structure by introducing the new metric $(aa^{T})^{\dagger} + cc^{T}$.

Proof of Gradient Flow Assumption

In this subsection, we demonstrate that Equation (6) is in fact a Fokker–Planck equation of SDE (A1).
Lemma A3.
Consider the drift–diffusion process:
$$dX_t = b(X_t)\,dt + \sqrt{2}\,a(X_t)\circ dB_t. \tag{A3}$$
Suppose that $b$, $a$, $\pi$ satisfy
$$a\otimes\nabla a - b = -aa^{T}\nabla\log\pi.$$
Then, the Fokker–Planck equation of $X_t$ satisfies
$$\partial_t\rho(t,x) = \nabla\cdot\Big(\rho(t,x)\,a(x)a(x)^{T}\,\nabla\log\frac{\rho(t,x)}{\pi(x)}\Big).$$
Proof. 
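Recall that we denote by $\{a_1, \dots, a_n\}$ the column vectors of the matrix $a$. For the Stratonovich SDE (A3), we can write
$$dX_t = b(X_t)\,dt + \sqrt{2}\sum_{i=1}^{n} a_i(X_t)\circ dB_t^i.$$
According to [28] (Appendix 7), the corresponding Itô SDE is
$$dX_t = \sqrt{2}\sum_{i=1}^{n} a_i\,dB_t^i + \Big(\sum_{i=1}^{n}\nabla_{a_i}a_i + b\Big)\,dt.$$
Thus, the Fokker–Planck equation (Kolmogorov forward equation) reads
$$
\begin{aligned}
\partial_t\rho(t,x)
&= \sum_{i=1}^{n+m}\sum_{j=1}^{n+m}\frac{\partial^2}{\partial x_i\,\partial x_j}\big((aa^{T})_{ij}\,\rho\big) - \nabla\cdot\Big(\Big(\sum_{i=1}^{n}\nabla_{a_i}a_i + b\Big)\rho\Big) \\
&= \nabla\cdot(aa^{T}\nabla\rho) + \nabla\cdot\Big(\rho\,\Big(\sum_{j=1}^{n+m}\frac{\partial}{\partial x_j}(aa^{T})_{ij}\Big)_{i=1}^{n+m} - \rho\sum_{i=1}^{n}\nabla_{a_i}a_i - b\rho\Big) \\
&= \nabla\cdot(aa^{T}\nabla\rho) + \nabla\cdot\big(\rho\,(a\otimes\nabla a - b)\big).
\end{aligned}
$$
Namely, we have
$$\partial_t\rho(t,x) = \nabla\cdot(aa^{T}\nabla\rho) + \nabla\cdot\big(\rho\,(a\otimes\nabla a - b)\big).$$
Plugging in the relation $a\otimes\nabla a - b = -aa^{T}\nabla\log\pi$, we have
$$
\begin{aligned}
\partial_t\rho(t,x)
&= \nabla\cdot(aa^{T}\nabla\rho) - \nabla\cdot(\rho\,aa^{T}\nabla\log\pi) \\
&= \nabla\cdot(\rho\,aa^{T}\nabla\log\rho) - \nabla\cdot(\rho\,aa^{T}\nabla\log\pi) \\
&= \nabla\cdot\Big(\rho\,aa^{T}\nabla\log\frac{\rho}{\pi}\Big).
\end{aligned}
$$
Here, we use the fact that
$$\rho\,\nabla\log\rho = \nabla\rho.$$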
This finishes the proof. □
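As an illustrative check of Lemma A3 (a sketch added for orientation, not part of the original proof), consider the Heisenberg vector fields $X$, $Y$ of Appendix A with $b = 0$. The computation assumes the componentwise convention $(a\otimes\nabla a)_i = \sum_{k} a_{ik}\,\nabla\cdot a_k$, which is what remains in the proof above after the terms $\nabla_{a_k}a_k$ cancel.

```latex
\begin{aligned}
\nabla\cdot X &= \partial_{x_1}(1) + \partial_{x_3}\!\big(-\tfrac{1}{2}x_2\big) = 0,
\qquad
\nabla\cdot Y = \partial_{x_2}(1) + \partial_{x_3}\!\big(\tfrac{1}{2}x_1\big) = 0, \\
a\otimes\nabla a &= (\nabla\cdot X)\,X + (\nabla\cdot Y)\,Y = 0
\quad\Longrightarrow\quad
aa^{T}\nabla\log\pi = 0 \ \text{ when } b = 0, \\
\partial_t\rho &= \nabla\cdot(aa^{T}\nabla\rho)
= X(X\rho) + Y(Y\rho) = \Delta_{\mathcal{H}}\,\rho
\qquad (\pi \equiv 1).
\end{aligned}
```

The choice $\pi\equiv 1$ satisfies the condition and matches the Lebesgue volume measure stated for the Heisenberg example, so the Fokker–Planck equation reduces to the sub-Laplacian heat equation in this case.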
Example A1.
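The Lie group $SU(2)$ is a compact connected Lie group, diffeomorphic to the three-sphere $S^3$. Following the construction of the left-invariant vector fields in [41] (Section 6.2), we change to the coordinate system $(\theta, \phi, \psi)$. We obtain new left-invariant vector fields on $SU(2)$, with
$$
\begin{aligned}
X &= \cos\psi\,\frac{\partial}{\partial\theta} + \frac{\sin\psi}{\sin\theta}\,\frac{\partial}{\partial\phi} - \frac{\cos\theta\sin\psi}{\sin\theta}\,\frac{\partial}{\partial\psi}, \\
Y &= -\sin\psi\,\frac{\partial}{\partial\theta} + \frac{\cos\psi}{\sin\theta}\,\frac{\partial}{\partial\phi} - \frac{\cos\theta\cos\psi}{\sin\theta}\,\frac{\partial}{\partial\psi}, \\
Z &= \frac{\partial}{\partial\psi}.
\end{aligned}
$$
Thus, we have $a = (a_1, a_2) = (X, Y)$ in the new coordinate system. We define the metric $g = (aa^{T})^{\dagger}$. Here, $X$, $Y$ form an orthonormal basis for the horizontal bundle generated by $X$, $Y$ under the metric $(aa^{T})^{\dagger}$. According to [41] (Lemma 6.4), the invariant measure on $SU(2)$ has the form $\mu = \sin(\theta)\,d\theta\,d\phi\,d\psi$. It is easy to check that the condition of Lemma A3 is satisfied for $b = 0$, $\pi = \sin(\theta)$, and
$$-aa^{T}\nabla\log\pi = a\otimes\nabla a = \begin{pmatrix} -\frac{\cos\theta}{\sin\theta} \\ 0 \\ 0\end{pmatrix},$$
where
$$a = \begin{pmatrix} \cos\psi & -\sin\psi \\ \frac{\sin\psi}{\sin\theta} & \frac{\cos\psi}{\sin\theta} \\ -\frac{\cos\theta\sin\psi}{\sin\theta} & -\frac{\cos\theta\cos\psi}{\sin\theta}\end{pmatrix}.$$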
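The claim in Example A1 can be checked numerically at a sample point. The sketch below is a hedged illustration using only the Python standard library: it assumes the componentwise convention $(a\otimes\nabla a)_i = \sum_k a_{ik}\,\nabla\cdot a_k$ (as obtained in the proof of Lemma A3), approximates the divergences by central differences, and uses hypothetical helper names. With $b = 0$ and $\pi = \sin\theta$, the condition of Lemma A3 requires $a\otimes\nabla a + aa^{T}\nabla\log\pi = 0$.

```python
import math

def a_matrix(theta, phi, psi):
    """3x2 matrix a = (X, Y) in the coordinates (theta, phi, psi)."""
    s, c = math.sin(theta), math.cos(theta)
    x_field = [math.cos(psi), math.sin(psi) / s, -c * math.sin(psi) / s]
    y_field = [-math.sin(psi), math.cos(psi) / s, -c * math.cos(psi) / s]
    return [[x_field[i], y_field[i]] for i in range(3)]

def column_divergences(point, h=1e-5):
    """Central-difference divergence of each column of a at `point`."""
    divs = [0.0, 0.0]
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        a_plus, a_minus = a_matrix(*plus), a_matrix(*minus)
        for k in range(2):
            divs[k] += (a_plus[j][k] - a_minus[j][k]) / (2 * h)
    return divs

def a_tensor_grad_a(point):
    """Assumed convention: (a (x) grad a)_i = sum_k a_ik * div(a_k)."""
    a = a_matrix(*point)
    divs = column_divergences(point)
    return [sum(a[i][k] * divs[k] for k in range(2)) for i in range(3)]

def aat_grad_log_pi(point):
    """a a^T grad(log pi) for pi = sin(theta), where grad(log pi) = (cot(theta), 0, 0)."""
    a = a_matrix(*point)
    grad = [math.cos(point[0]) / math.sin(point[0]), 0.0, 0.0]
    at_grad = [sum(a[j][k] * grad[j] for j in range(3)) for k in range(2)]
    return [sum(a[i][k] * at_grad[k] for k in range(2)) for i in range(3)]

point = (1.0, 0.3, 0.7)  # arbitrary point with sin(theta) != 0
lhs = a_tensor_grad_a(point)
rhs = aat_grad_log_pi(point)
residual = max(abs(l + r) for l, r in zip(lhs, rhs))
print(lhs, rhs, residual)
```

A residual near zero at several sample points is consistent with the closed-form vector $(-\cos\theta/\sin\theta, 0, 0)^{T}$ stated in the example; it is a spot check, not a proof.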

References

  1. Bakry, D.; Émery, M. Diffusions hypercontractives. In Séminaire de Probabilités XIX 1983/84; Springer: Berlin/Heidelberg, Germany, 1985; pp. 177–206.
  2. Baudoin, F.; Garofalo, N. Curvature-dimension inequalities and Ricci lower bounds for sub-Riemannian manifolds with transverse symmetries. J. Eur. Math. Soc. 2017, 19, 151–219.
  3. Arnold, A.; Carlen, E. A generalized Bakry–Émery condition for non-symmetric diffusions. In Proceedings of the EQUADIFF 99, International Conference on Differential Equations, Berlin, Germany, 1–7 August 1999; pp. 732–734.
  4. Li, W. Transport information geometry: Riemannian calculus on probability simplex. Inf. Geom. 2022, 5, 161–207.
  5. Otto, F. The geometry of dissipative evolution equations: The porous medium equation. Commun. Partial Differ. Equ. 2001, 26, 101–174.
  6. Otto, F.; Villani, C. Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality. J. Funct. Anal. 2000, 173, 361–400.
  7. Baudoin, F. Wasserstein contraction properties for hypoelliptic diffusions. arXiv 2016, arXiv:1602.04177.
  8. Baudoin, F. Bakry–Émery meet Villani. J. Funct. Anal. 2017, 273, 2275–2291.
  9. Baudoin, F.; Bonnefont, M.; Garofalo, N. A sub-Riemannian curvature-dimension inequality, volume doubling property and the Poincaré inequality. Math. Ann. 2014, 358, 833–860.
  10. Baudoin, F.; Gordina, M.; Herzog, D.P. Gamma calculus beyond Villani and explicit convergence estimates for Langevin dynamics with singular potentials. Arch. Ration. Mech. Anal. 2021, 241, 765–804.
  11. Baudoin, F.; Grong, E.; Kuwada, K.; Thalmaier, A. Sub-Laplacian comparison theorems on totally geodesic Riemannian foliations. Calc. Var. 2019, 58, 130.
  12. Baudoin, F.; Wang, J. Curvature dimension inequalities and subelliptic heat kernel gradient bounds on contact manifolds. Potential Anal. 2014, 40, 163–193.
  13. Feng, Q. Harnack inequalities on totally geodesic foliations with transverse Ricci flow. arXiv 2017, arXiv:1712.02275.
  14. Grong, E.; Thalmaier, A. Curvature-dimension inequalities on sub-Riemannian manifolds obtained from Riemannian foliations: Part I. Math. Z. 2015, 282, 99–130.
  15. Grong, E.; Thalmaier, A. Curvature-dimension inequalities on sub-Riemannian manifolds obtained from Riemannian foliations: Part II. Math. Z. 2015, 282, 131–164.
  16. Agrachev, A.; Lee, P. Optimal transportation under nonholonomic constraints. Trans. Am. Math. Soc. 2009, 361, 6019–6047.
  17. Figalli, A.; Rifford, L. Mass transportation on sub-Riemannian manifolds. Geom. Funct. Anal. 2010, 20, 124–159.
  18. Juillet, N. Diffusion by optimal transport in Heisenberg groups. Calc. Var. Partial Differ. Equ. 2014, 50, 693–721.
  19. Khesin, B.; Lee, P. A nonholonomic Moser theorem and optimal transport. J. Symplectic Geom. 2009, 7, 381–414.
  20. Lott, J.; Villani, C. Ricci Curvature for Metric-Measure Spaces via Optimal Transport. Ann. Math. 2009, 169, 903–991.
  21. Sturm, K.-T. On the Geometry of Metric Measure Spaces. Acta Math. 2006, 196, 65–131.
  22. Lafferty, J.D. The Density Manifold and Configuration Space Quantization. Trans. Am. Math. Soc. 1988, 305, 699–741.
  23. Jüngel, A. Entropy Methods for Diffusive Partial Differential Equations; Springer: Berlin/Heidelberg, Germany, 2016.
  24. Markowich, P.A.; Villani, C. On the Trend to Equilibrium for the Fokker–Planck Equation: An Interplay between Physics and Functional Analysis. Mat. Contemp. 1999, 19, 1–29.
  25. Arnold, A.; Einav, A.; Wöhrer, T. On the rates of decay to equilibrium in degenerate and defective Fokker–Planck equations. J. Differ. Equ. 2018, 264, 6843–6872.
  26. Arnold, A.; Erb, J. Sharp entropy decay for hypocoercive and non-symmetric Fokker–Planck equations with linear drift. arXiv 2014, arXiv:1409.5425.
  27. Karatzas, I.; Shreve, S.E. Brownian Motion and Stochastic Calculus, 2nd ed.; Graduate Texts in Mathematics; Springer: New York, NY, USA, 1991; Volume 113.
  28. Baudoin, F. An Introduction to the Geometry of Stochastic Flows; World Scientific: Singapore, 2004.
  29. Stroock, D.W. Partial differential equations for probabilists. In Cambridge Studies in Advanced Mathematics; Cambridge University Press: Cambridge, UK, 2008; Volume 112.
  30. Bismut, J.M. Martingales, the Malliavin calculus and hypoellipticity under general Hörmander’s conditions. In Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete; Springer: Berlin/Heidelberg, Germany, 1981; Volume 56, pp. 469–505.
  31. Hörmander, L. Hypoelliptic second-order differential equations. Acta Math. 1967, 119, 147–171.
  32. Ben Arous, G.; Léandre, R. Décroissance exponentielle du noyau de la chaleur sur la diagonale (II). Probab. Theory Relat. Fields 1991, 90, 377–402.
  33. Barlow, M.; Nualart, D. Lectures on Probability Theory and Statistics. In Ecole d’Ete de Probabilites de Saint-Flour XXV; Springer: Berlin/Heidelberg, Germany, 1995.
  34. Baudoin, F.; Nualart, E.; Ouyang, C.; Tindel, S. On probability laws of solutions to differential systems driven by a fractional Brownian motion. Ann. Probab. 2016, 44, 2554–2590.
  35. Feng, Q.; Li, W. Hypoelliptic entropy dissipation for stochastic differential equations. arXiv 2021, arXiv:2102.00544.
  36. Agrachev, A.; Barilari, D.; Boscain, U. On the Hausdorff volume in sub-Riemannian geometry. Calc. Var. Partial Differ. Equ. 2012, 43, 355–388.
  37. Barilari, D.; Rizzi, L. A formula for Popp’s volume in sub-Riemannian geometry. Anal. Geom. Metr. Spaces 2013, 1, 42–57.
  38. Baudoin, F.; Feng, Q.; Gordina, M. Integration by parts and quasi-invariance for the horizontal Wiener measure on foliated compact manifolds. J. Funct. Anal. 2019, 277, 1362–1422.
  39. Eldredge, N.; Gordina, M.; Saloff-Coste, L. Left-invariant geometries on SU(2) are uniformly doubling. Geom. Funct. Anal. 2018, 28, 1321–1367.
  40. Elworthy, K.D. Stochastic Differential Equations on Manifolds; Cambridge University Press: Cambridge, UK, 1982; Volume 70.
  41. Gordina, M.; Laetsch, T. Sub-Laplacians on sub-Riemannian manifolds. Potential Anal. 2016, 44, 811–837.
  42. Gordina, M.; Laetsch, T. A convergence to Brownian motion on sub-Riemannian manifolds. Trans. Am. Math. Soc. 2017, 369, 6263–6278.
  43. Malliavin, P. Stochastic Analysis. In Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]; Springer: Berlin/Heidelberg, Germany, 1997; Volume 313.
  44. Baudoin, F.; Feng, Q. Log-Sobolev inequalities on the horizontal path space of a totally geodesic foliation. arXiv 2015, arXiv:1503.08180.
  45. Inglis, J.; Papageorgiou, I. Logarithmic Sobolev inequalities for infinite-dimensional Hörmander type generators on the Heisenberg group. Potential Anal. 2009, 31, 79–102.
  46. Baudoin, F.; Bonnefont, M. Log-Sobolev inequalities for subelliptic operators satisfying a generalized curvature dimension inequality. J. Funct. Anal. 2012, 262, 2646–2676.
  47. Wang, F.-Y. Logarithmic Sobolev inequalities on noncompact Riemannian manifolds. Probab. Theory Relat. Fields 1997, 109, 417–424.
  48. Woit, P. Quantum Theory, Groups and Representations; Springer: Berlin/Heidelberg, Germany, 2017.
  49. Baudoin, F.; Cecil, M. The subelliptic heat kernel on the three-dimensional solvable Lie groups. Forum Math. 2015, 27, 2051–2086.
  50. Li, W. Diffusion Hypercontractivity via Generalized Density Manifold. arXiv 2019, arXiv:1907.12546.
  51. Baudoin, F. Sub-Laplacians and hypoelliptic operators on totally geodesic Riemannian foliations. arXiv 2014, arXiv:1410.3268.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
