Poincaré and Log–Sobolev Inequalities for Mixtures

Schlichting, André

doi:10.3390/e21010089


by

André Schlichting

Institut für Geometrie und Praktische Mathematik, RWTH Aachen, Templergraben 55, 52056 Aachen, Germany

Entropy 2019, 21(1), 89; https://doi.org/10.3390/e21010089

Submission received: 30 November 2018 / Revised: 30 December 2018 / Accepted: 11 January 2019 / Published: 18 January 2019

(This article belongs to the Special Issue Entropy and Information Inequalities)


Abstract

This work studies mixtures of probability measures on

R^{n}

and gives bounds on the Poincaré and the log–Sobolev constants of two-component mixtures provided that each component satisfies the functional inequality, and both components are close in the

χ^{2}

-distance. The estimation of those constants for a mixture can be far more subtle than it is for its parts. Even mixing Gaussian measures may produce a measure with a Hamiltonian potential possessing multiple wells leading to metastability and large constants in Sobolev type inequalities. In particular, the Poincaré constant stays bounded in the mixture parameter, whereas the log–Sobolev may blow up as the mixture ratio goes to 0 or 1. This observation generalizes the one by Chafaï and Malrieu to the multidimensional case. The behavior is shown for a class of examples to be not only a mere artifact of the method.

Keywords:

Poincaré inequality; log–Sobolev inequality; relative entropy; Fisher information; Dirichlet form; mixture; finite Gaussian mixtures

1. Introduction

A mixture of two probability measures

μ_{0}

and

μ_{1}

on

R^{n}

is for some parameter

p \in [0, 1]

the probability measure

μ_{p}

defined by

μ_{p} : = p μ_{0} + (1 - p) μ_{1} .

(1)

Hereby, both measures

μ_{0}

and

μ_{1}

are assumed to be absolutely continuous with respect to the Lebesgue measure and their supports are nested, i.e.,

supp μ_{0} \subseteq supp μ_{1}

or

supp μ_{1} \subseteq supp μ_{0}

. Under these assumptions, at least one of the measures is absolutely continuous with respect to the other,

μ_{0} ≪ μ_{1} or μ_{1} ≪ μ_{0},

which implies that at least one of the measures has a density with respect to the other one

d μ_{0} = \frac{d μ_{0}}{d μ_{1}} d μ_{1} \quad \text{or} \quad d μ_{1} = \frac{d μ_{1}}{d μ_{0}} d μ_{0} .

This work establishes simple criteria under which a mixture of measures satisfies a Poincaré

PI (ϱ)

or log–Sobolev inequality

LSI (α)

with constants

ϱ

and

α

, respectively, provided that each of the components satisfies one.

Definition 1

(

PI (ϱ)

and

LSI (α)

). A probability measure μ on

R^{n}

satisfies the Poincaré inequality with constant

ϱ > 0

, if for all functions

f : R^{n} \to R

{Var}_{μ} [f] : = \int {| f - \int f d μ |}^{2} d μ \leq \frac{1}{ϱ} \int {| \nabla f |}^{2} d μ .

PI(ϱ)

A probability measure μ satisfies the log–Sobolev inequality with constant

α > 0

, if for all functions

f : R^{n} \to R^{+}

holds

{Ent}_{μ} [f] : = \int f \log f d μ - \int f d μ \log (\int f d μ) \leq \frac{1}{α} \int \frac{{| \nabla f |}^{2}}{2 f} d μ .

LSI(α)

By the change of variable

f \mapsto f^{2}

, the log–Sobolev inequality

LSI (α)

is equivalent to

{Ent}_{μ} [f^{2}] \leq \frac{2}{α} \int {| \nabla f |}^{2} d μ .

(2)

The question of how

ϱ_{p}

and

α_{p}

in

PI (ϱ_{p})

and

LSI (α_{p})

depend for a mixture

μ_{p}

on the parameter

p \in [0, 1]

was first studied by Chafaï and Malrieu [1] for measures on

R^{n}

. The aim is to deduce simple criteria under which the measure

μ_{p}

(1) satisfies

PI (ϱ_{p})

and

LSI (α_{p})

knowing that

μ_{0}

and

μ_{1}

satisfy

PI (ϱ_{0})

,

PI (ϱ_{1})

and

LSI (α_{0})

,

LSI (α_{1})

, respectively. The approach by Chafaï and Malrieu [1] is based on a functional depending on the distribution function of the measures

μ_{0}

and

μ_{1}

, which then leads to bounds on the Poincaré and log–Sobolev constants of the mixture in one dimension.

This work generalizes part of the results from Chafaï and Malrieu [1] to the multidimensional case by a simple argument. The estimates on the Poincaré and log–Sobolev constants hold for the case, when the

χ^{2}

-distance of

μ_{0}

and

μ_{1}

is bounded (see (5) for its definition). For this to be true, at least one of the measures

μ_{0}

and

μ_{1}

needs to be absolutely continuous to the other, which is also a necessary condition for the mixture having connected support. The resulting bound is optimal in the scaling behavior of the mixture parameter

p \to 0, 1

, i.e., a logarithmic blow-up behavior in p for the log–Sobolev constant, whereas the Poincaré constant stays bounded. This different behavior of the Poincaré and log–Sobolev constant was also observed in the setting of metastability in ([2], Remark 2.20).

Let us first introduce the principle for the Poincaré inequality in Section 2 and then for the log–Sobolev inequality in Section 3. Then, the procedure is illustrated on specific examples of mixtures in Section 4.

2. Poincaré Inequality

To keep the presentation concise, the following notation for the mean of a function

f : R^{n} \to R

with respect to a measure

μ

is introduced

E_{μ} [f] : = \int f d μ .

In this way, the variance in

PI (ϱ)

and relative entropy in

LSI (α)

become

{Var}_{μ} [f] = E_{μ} [{(f - E_{μ} [f])}^{2}] = E_{μ} [f^{2}] - {(E_{μ} [f])}^{2} \quad \text{and} \quad {Ent}_{μ} [f] = E_{μ} [f \log f] - E_{μ} [f] \log (E_{μ} [f]) .

Likewise, the covariance of two functions

f, g : R^{n} \to R

is defined by

{Cov}_{μ} [f, g] = E_{μ} [(f - E_{μ} [f]) (g - E_{μ} [g])] = E_{μ} [f g] - E_{μ} [f] E_{μ} [g] .

The Cauchy–Schwarz inequality for the covariance now takes the form

{Cov}_{μ}^{2} [f, g] \leq {Var}_{μ} [f] {Var}_{μ} [g] .

The argument is based on an easy but powerful observation for measures

μ_{0}

and

μ_{1}

with joint support.

Lemma 1

(Mean-difference as covariance). If

supp μ_{0} = supp μ_{1}

, then for any

ϑ \in [0, 1]

and any function

f : R^{n} \to R

holds

E_{μ_{0}} [f] - E_{μ_{1}} [f] = - ϑ {Cov}_{μ_{0}} [f, \frac{d μ_{1}}{d μ_{0}}] + (1 - ϑ) {Cov}_{μ_{1}} [f, \frac{d μ_{0}}{d μ_{1}}] .

(3)

Proof.

The change of measure formula shows that each covariance in (3) equals, up to its sign, the mean-difference on the left-hand side:

{Cov}_{μ_{0}} [f, \frac{d μ_{1}}{d μ_{0}}] = E_{μ_{0}} [f \frac{d μ_{1}}{d μ_{0}}] - E_{μ_{0}} [f] E_{μ_{0}} [\frac{d μ_{1}}{d μ_{0}}] = E_{μ_{1}} [f] - E_{μ_{0}} [f]

and likewise for

{Cov}_{μ_{1}} [f, \frac{d μ_{0}}{d μ_{1}}]

. □
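The identity (3) can be checked numerically on a finite state space, where densities are simply ratios of point masses. The following is a minimal sketch of such a check; the probability vectors `mu0`, `mu1`, the test function `f` and all helper names are illustrative assumptions and do not appear in the article.

```python
# Finite-state analogue of identity (3). The state space, the weights and
# the function f are illustrative assumptions, not taken from the article.

def mean(w, f):
    """Expectation of f under the probability vector w."""
    return sum(wi * fi for wi, fi in zip(w, f))

def cov(w, f, g):
    """Covariance of f and g under the probability vector w."""
    fg = [a * b for a, b in zip(f, g)]
    return mean(w, fg) - mean(w, f) * mean(w, g)

def mean_difference_rhs(mu0, mu1, f, theta):
    """Right-hand side of (3) for two fully supported probability vectors."""
    d10 = [b / a for a, b in zip(mu0, mu1)]  # density of mu1 w.r.t. mu0
    d01 = [a / b for a, b in zip(mu0, mu1)]  # density of mu0 w.r.t. mu1
    return -theta * cov(mu0, f, d10) + (1.0 - theta) * cov(mu1, f, d01)

mu0 = [0.1, 0.2, 0.3, 0.4]
mu1 = [0.4, 0.3, 0.2, 0.1]
f = [1.0, -2.0, 0.5, 3.0]

# Left-hand side of (3): difference of the two means.
lhs = mean(mu0, f) - mean(mu1, f)
for theta in (0.0, 0.25, 1.0):
    print(theta, lhs, mean_difference_rhs(mu0, mu1, f, theta))
```

Both sides agree for every choice of the interpolation parameter, which is exactly the freedom exploited in the optimization later on.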

The subsequent strategy is based on the identity (3) by using a Cauchy–Schwarz inequality to arrive at the product of two variances. Then,

PI (ϱ_{0})

or

PI (ϱ_{1})

can be applied and the parameter

ϑ

leaves freedom to optimize the resulting expression. This allows for proving the following theorem, which is the generalization of ([1], Theorem 4.4) to the multidimensional case for the Poincaré inequality provided

μ_{0}

and

μ_{1}

are absolutely continuous to each other.

Theorem 1 (PI for absolutely continuous mixtures).

Let

μ_{0}

and

μ_{1}

satisfy

PI (ϱ_{0})

and

PI (ϱ_{1})

, respectively, and let both measures be absolutely continuous to each other. Then, for all

p \in (0, 1)

and

q = 1 - p

, the mixture measure

μ_{p} = p μ_{0} + q μ_{1}

satisfies

PI (ϱ_{p})

with

\frac{1}{ϱ_{p}} \leq \begin{cases} \frac{1}{ϱ_{0}}, & \text{if } \frac{ϱ_{1}}{ϱ_{0}} \geq 1 + p χ_{1}, \\ \frac{1}{ϱ_{1}}, & \text{if } \frac{ϱ_{0}}{ϱ_{1}} \geq 1 + q χ_{0}, \\ \frac{p χ_{1} + p q χ_{0} χ_{1} + q χ_{0}}{ϱ_{0} p χ_{1} + ϱ_{1} q χ_{0}}, & \text{else,} \end{cases}

(4)

where

χ_{0} : = {Var}_{μ_{0}} [\frac{d μ_{1}}{d μ_{0}}] \quad \text{and} \quad χ_{1} : = {Var}_{μ_{1}} [\frac{d μ_{0}}{d μ_{1}}] .

(5)

Proof.

The variance of f with respect to

μ_{p}

is decomposed to

{Var}_{μ_{p}} [f] = p {Var}_{μ_{0}} [f] + q {Var}_{μ_{1}} [f] + p q {(E_{μ_{0}} [f] - E_{μ_{1}} [f])}^{2} .

Hereby, the first two terms are just the expectation of the conditional variances. The third term is the variance of a Bernoulli random variable. Now, the mean-difference is rewritten by Lemma 1 and the square is estimated with the Young inequality introducing an additional parameter

η > 0

{(a + b)}^{2} \leq (1 + η) a^{2} + (1 + η^{- 1}) b^{2} .

Then, the Cauchy–Schwarz inequality is applied to the covariances to obtain

\begin{aligned}
{Var}_{μ_{p}} [f] & \leq p {Var}_{μ_{0}} [f] + q {Var}_{μ_{1}} [f] \\
& \quad + p q ((1 + η) ϑ^{2} {Cov}_{μ_{0}}^{2} [f, \frac{d μ_{1}}{d μ_{0}}] + (1 + η^{- 1}) {(1 - ϑ)}^{2} {Cov}_{μ_{1}}^{2} [f, \frac{d μ_{0}}{d μ_{1}}]) \\
& \leq (1 + (1 + η) ϑ^{2} q χ_{0}) p {Var}_{μ_{0}} [f] + (1 + (1 + η^{- 1}) {(1 - ϑ)}^{2} p χ_{1}) q {Var}_{μ_{1}} [f] \\
& \leq \frac{1 + (1 + η) ϑ^{2} q χ_{0}}{ϱ_{0}} \int {| \nabla f |}^{2} p d μ_{0} + \frac{1 + (1 + η^{- 1}) {(1 - ϑ)}^{2} p χ_{1}}{ϱ_{1}} \int {| \nabla f |}^{2} q d μ_{1} \\
& \leq \max \{ \frac{1 + (1 + η) ϑ^{2} q χ_{0}}{ϱ_{0}}, \frac{1 + (1 + η^{- 1}) {(1 - ϑ)}^{2} p χ_{1}}{ϱ_{1}} \} \int {| \nabla f |}^{2} d μ_{p} .
\end{aligned}

(6)

The resulting maximum is now minimized in

η > 0

and

ϑ \in [0, 1]

. To do so without loss of generality,

ϱ_{0} \geq ϱ_{1}

is assumed. The other case can always be obtained by interchanging the roles of

μ_{0}

and

μ_{1}

. If

ϱ_{0} > ϱ_{1}

, then

ϑ = 1

and

η \to 0

is optimal as long as

\frac{1 + q χ_{0}}{ϱ_{0}} \leq \frac{1}{ϱ_{1}} .

This corresponds to the second case in (4). By symmetry, the first case follows if

ϱ_{1} \geq ϱ_{0}

.

Now, in the case

ϱ_{0} \geq ϱ_{1}

and

ϱ_{0} \leq (1 + q χ_{0}) ϱ_{1}

, there exists by monotonicity for every

ϑ \in (0, 1)

a unique

η_{*} = η_{*} (ϑ) > 0

such that both terms in the max of the right-hand side in (6) are equal and hence the max is minimal. Since

q χ_{0} > 0

and

p χ_{1} > 0

, the sum of the coefficients in front of these two terms is given, for fixed

η

, by the function

h (ϑ) : = (1 + η) ϑ^{2} + (1 + \frac{1}{η}) {(1 - ϑ)}^{2}

. The minimization of h in

ϑ \in (0, 1)

leads to

ϑ_{*} = \frac{1}{1 + η}

and

h (ϑ_{*}) = \frac{1}{1 + η} + \frac{η}{1 + η} = 1

holds. Hence, in this case, the parameter

s = (1 + η_{*}) ϑ_{*}^{2} = \frac{1}{1 + η_{*}} \in (0, 1)

and

(1 + η_{*}^{- 1}) {(1 - ϑ_{*})}^{2} = \frac{η_{*}}{1 + η_{*}} = 1 - s

. Thus, the problem can be rephrased: Find

s_{*} \in (0, 1)

which solves

\frac{1 + s q χ_{0}}{ϱ_{0}} = \frac{1 + (1 - s) p χ_{1}}{ϱ_{1}} .

The solution

s_{*}

is given by

s_{*} = \frac{(1 + p χ_{1}) ϱ_{0} - ϱ_{1}}{ϱ_{0} p χ_{1} + ϱ_{1} q χ_{0}} .

For this value of

s_{*}

, the value of the max in (6) is given by

\frac{1 + s_{*} q χ_{0}}{ϱ_{0}} = \frac{p χ_{1} + \frac{ϱ_{1}}{ϱ_{0}} q χ_{0} + (1 + p χ_{1}) q χ_{0} - \frac{ϱ_{1}}{ϱ_{0}} q χ_{0}}{ϱ_{0} p χ_{1} + ϱ_{1} q χ_{0}} = \frac{p χ_{1} + p q χ_{0} χ_{1} + q χ_{0}}{ϱ_{0} p χ_{1} + ϱ_{1} q χ_{0}} .

□
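The case distinction in (4) is easy to evaluate mechanically. The sketch below transcribes it into a small function; the name `poincare_mixture_bound` and the sample values are assumptions made for illustration, and the function expects a mixture parameter strictly between 0 and 1 with finite values of the two variances from (5).

```python
def poincare_mixture_bound(p, rho0, rho1, chi0, chi1):
    """Upper bound on 1/rho_p according to (4), for 0 < p < 1.

    rho0, rho1 are the Poincare constants of the components and
    chi0, chi1 the (finite) chi^2-type variances defined in (5).
    """
    q = 1.0 - p
    # First case of (4): mu_1 has a much better constant than mu_0.
    if rho1 / rho0 >= 1.0 + p * chi1:
        return 1.0 / rho0
    # Second case of (4): the roles of mu_0 and mu_1 are interchanged.
    if rho0 / rho1 >= 1.0 + q * chi0:
        return 1.0 / rho1
    # Interpolation case. The denominator is positive here, because
    # chi0 = chi1 = 0 always triggers one of the two cases above.
    numerator = p * chi1 + p * q * chi0 * chi1 + q * chi0
    denominator = rho0 * p * chi1 + rho1 * q * chi0
    return numerator / denominator

# Hypothetical sample values: equal constants and equal chi^2-distances.
print(poincare_mixture_bound(0.3, 2.0, 2.0, 1.5, 1.5))
```

For equal values of the two variances, the interpolation case reduces to the ratio of one plus p q χ and the averaged constant p ϱ_0 + q ϱ_1, which gives a quick consistency check of the transcription.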

Remark 1.

The constants

χ_{0}

and

χ_{1}

can be rewritten if

μ_{0}

and

μ_{1}

are mutually absolutely continuous as

χ_{0} = \int {(\frac{d μ_{1}}{d μ_{0}})}^{2} d μ_{0} - 1 = \int \frac{d μ_{1}}{d μ_{0}} d μ_{1} - 1 and χ_{1} = \int {(\frac{d μ_{0}}{d μ_{1}})}^{2} d μ_{1} - 1 = \int \frac{d μ_{0}}{d μ_{1}} d μ_{0} - 1 .

This quantity is also known as

χ^{2}

-distance on the space of probability measures (cf. [3]). The

χ^{2}

-distance is a rather strong notion of distance in the sense that it bounds many other probability distances from above. Among them is the relative entropy. Indeed, the concavity of the logarithm and the Jensen inequality yield

{Ent}_{μ_{0}} [\frac{d μ_{1}}{d μ_{0}}] = \int \log \frac{d μ_{1}}{d μ_{0}} d μ_{1} \leq \log (\int \frac{d μ_{1}}{d μ_{0}} d μ_{1}) = \log (1 + χ_{0}) \leq χ_{0} .

Remark 2.

The proof of Theorem 1 shows that the expression for

\frac{1}{ϱ_{p}}

in the last case of (4) can be bounded above and below by

\max \{ \frac{1}{ϱ_{0}}, \frac{1}{ϱ_{1}} \} \leq \frac{p χ_{1} + p q χ_{0} χ_{1} + q χ_{0}}{ϱ_{0} p χ_{1} + ϱ_{1} q χ_{0}} \leq \max \{ \frac{1 + q χ_{0}}{ϱ_{0}}, \frac{1 + p χ_{1}}{ϱ_{1}} \} .

(7)

In the case, where

χ_{0} = χ_{1} = χ

, the formula for

ϱ_{p}

(4) simplifies to

\frac{1}{ϱ_{p}} \leq \frac{1 + p q χ}{p ϱ_{0} + q ϱ_{1}} .

(8)

Corollary 1.

Let

μ_{0} ≪ μ_{1}

and

μ_{0}

,

μ_{1}

satisfy

PI (ϱ_{0})

,

PI (ϱ_{1})

, respectively. Then, for all

p \in [0, 1]

with

q = 1 - p

, the mixture measure

μ_{p} = p μ_{0} + q μ_{1}

satisfies

PI (ϱ_{p})

with

\frac{1}{ϱ_{p}} \leq \max \{ \frac{1}{ϱ_{0}}, \frac{1 + p χ_{1}}{ϱ_{1}} \} .

Likewise, if

μ_{1} ≪ μ_{0}

, then it holds

\frac{1}{ϱ_{p}} \leq \max \{ \frac{1}{ϱ_{1}}, \frac{1 + q χ_{0}}{ϱ_{0}} \} .

Proof.

The proof is a simple consequence of Lemma 1 with

ϑ = 0

and a similar line of estimates as in (6). □

3. Log–Sobolev Inequality

In this section, a criterion for

LSI (α)

is established. It will be convenient to establish it in the form (2). For a function

g : R^{n} \to R^{+}

and two probability measures

μ_{0}

and

μ_{1}

, the averaged function

\bar{g} : \{0, 1\} \to R^{+}

is defined by

\bar{g} (0) : = E_{μ_{0}} [g] and \bar{g} (1) : = E_{μ_{1}} [g] .

Moreover, the mixture of two Dirac measures

δ_{0}

and

δ_{1}

is by slight abuse of notation denoted by

δ_{p} : = p δ_{0} + q δ_{1}

for

p \in [0, 1]

and

q = 1 - p

. Then, the entropy of the mixture

μ_{p} = p μ_{0} + q μ_{1}

is given by

{Ent}_{μ_{p}} [f^{2}] = p {Ent}_{μ_{0}} [f^{2}] + q {Ent}_{μ_{1}} [f^{2}] + {Ent}_{δ_{p}} [\bar{f^{2}}] .

(9)
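The decomposition (9) is an exact identity and can be verified on a finite state space. The following sketch does so for one hypothetical choice of two probability vectors and a test function; all names and numbers in it are assumptions for illustration only.

```python
import math

def mean(w, g):
    """Expectation of g under the probability vector w."""
    return sum(wi * gi for wi, gi in zip(w, g))

def ent(w, g):
    """Ent_w[g] = E[g log g] - E[g] log E[g] for a positive function g."""
    m = mean(w, g)
    return mean(w, [x * math.log(x) for x in g]) - m * math.log(m)

# Illustrative data: two probability vectors and a nowhere vanishing f.
mu0 = [0.1, 0.2, 0.3, 0.4]
mu1 = [0.4, 0.3, 0.2, 0.1]
f = [1.0, -2.0, 0.5, 3.0]
p = 0.3
q = 1.0 - p

f2 = [x * x for x in f]
mix = [p * a + q * b for a, b in zip(mu0, mu1)]
# Averaged function on {0, 1}: the two component means of f^2.
coarse = [mean(mu0, f2), mean(mu1, f2)]

lhs = ent(mix, f2)
rhs = p * ent(mu0, f2) + q * ent(mu1, f2) + ent([p, q], coarse)
print(lhs, rhs)
```

The last summand is the entropy of the averaged function with respect to the two-point measure with weights p and q, which is the only term that couples the two components.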

The following discrete log–Sobolev inequality for a Bernoulli random variable is used to estimate the entropy of the averaged function

\bar{f^{2}}

. The optimal log–Sobolev constant was found by Higuchi and Yoshida [4] and Diaconis and Saloff-Coste ([5], Theorem A.2.) at the same time.

Lemma 2 (Optimal log–Sobolev inequality for Bernoulli measures).

A Bernoulli measure

δ_{p}

on

\{0, 1\}

, i.e., a mixture of two Dirac measures

δ_{p} = p δ_{0} + q δ_{1}

with

p \in [0, 1]

and

q = 1 - p

satisfies the discrete log–Sobolev inequality

{Ent}_{δ_{p}} [g^{2}] \leq \frac{p q}{Λ (p, q)} {(g (0) - g (1))}^{2} \quad \text{for all } g : \{0, 1\} \to R^{+},

where

Λ : R^{+} \times R^{+} \to R^{+}

is the logarithmic mean defined by

Λ (p, q) : = \frac{p - q}{\log p - \log q}, for p \neq q and Λ (p, p) : = \lim_{q \to p} Λ (p, q) = p .
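As a sanity check, the logarithmic mean and the discrete log–Sobolev inequality can be evaluated numerically. The sketch below uses the inequality in the form in which it is applied in (11), that is, with the entropy of the squared function on the left-hand side; the function names and the sample values are assumptions for illustration.

```python
import math

def log_mean(p, q):
    """Logarithmic mean Lambda(p, q) of two positive numbers."""
    if p == q:
        return p
    return (p - q) / (math.log(p) - math.log(q))

def bernoulli_entropy(p, u0, u1):
    """Ent_{delta_p}[u] for a positive function u = (u0, u1) on {0, 1}."""
    q = 1.0 - p
    m = p * u0 + q * u1
    return p * u0 * math.log(u0) + q * u1 * math.log(u1) - m * math.log(m)

def lsi_gap(p, a, b):
    """Right-hand side minus left-hand side of the discrete inequality
    for g = (a, b) with a, b > 0 and Ent[g^2] on the left."""
    q = 1.0 - p
    rhs = p * q / log_mean(p, q) * (a - b) ** 2
    return rhs - bernoulli_entropy(p, a * a, b * b)

# A nonnegative gap means that the inequality holds for this sample.
for p in (0.05, 0.5, 0.95):
    print(p, lsi_gap(p, 1.0, 3.0))
```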

The above result allows for estimating the coarse-grained entropy in (9).

Lemma 3 (Estimate of the coarse-grained entropy).

Let

\bar{f^{2}} : {0, 1} \to R^{+}

be given by

\bar{f^{2}} (i) : = E_{μ_{i}} [f^{2}]

for

i \in {0, 1}

. Then, for all

p \in [0, 1]

and

q = 1 - p

,

{Ent}_{δ_{p}} [\bar{f^{2}}] \leq \frac{p q}{Λ (p, q)} ({Var}_{μ_{0}} [f] + {Var}_{μ_{1}} [f] + {(E_{μ_{0}} [f] - E_{μ_{1}} [f])}^{2})

(10)

holds.

Proof.

Lemma 2 applied to

{Ent}_{δ_{p}} (\bar{f^{2}})

yields

{Ent}_{δ_{p}} [\bar{f^{2}}] \leq \frac{p q}{Λ (p, q)} {(\sqrt{\bar{f^{2}} (0)} - \sqrt{\bar{f^{2}} (1)})}^{2} .

(11)

The square-root-mean-difference on the right-hand side of (11) can be estimated by using the fact that the function

(a, b) \mapsto {(\sqrt{a} - \sqrt{b})}^{2}

is jointly convex on

R^{+} \times R^{+}

. Indeed, by introducing the functions

f_{0}, f_{1} : R^{n} \times R^{n} \to R^{+}

defined by

f_{0} (x, y) = f (x)

and

f_{1} (x, y) = f (y)

, an application of the Jensen inequality yields the estimate

\begin{aligned}
{(\sqrt{E_{μ_{0}} [f^{2}]} - \sqrt{E_{μ_{1}} [f^{2}]})}^{2} & = {(\sqrt{E_{μ_{0} \times μ_{1}} [f_{0}^{2}]} - \sqrt{E_{μ_{0} \times μ_{1}} [f_{1}^{2}]})}^{2} \\
& \leq E_{μ_{0} \times μ_{1}} [{(f_{0} - f_{1})}^{2}] \\
& = E_{μ_{0}} [f^{2}] - 2 E_{μ_{0}} [f] E_{μ_{1}} [f] + E_{μ_{1}} [f^{2}] \\
& = {Var}_{μ_{0}} [f] + {Var}_{μ_{1}} [f] + {(E_{μ_{0}} [f] - E_{μ_{1}} [f])}^{2} .
\end{aligned}

(12)

Now, a combination of (11) and (12) gives (10). □

The decomposition (9) together with (10) yields that a mixture

μ_{p} = p μ_{0} + q μ_{1}

for

p \in [0, 1]

and

q = 1 - p

satisfies

\begin{aligned}
{Ent}_{μ_{p}} [f^{2}] & \leq p {Ent}_{μ_{0}} [f^{2}] + q {Ent}_{μ_{1}} [f^{2}] \\
& \quad + \frac{p q}{Λ (p, q)} ({Var}_{μ_{0}} [f] + {Var}_{μ_{1}} [f] + {(E_{μ_{0}} [f] - E_{μ_{1}} [f])}^{2}) .
\end{aligned}

(13)

The right-hand side of (13) consists of quantities, which can be estimated under the assumption that

μ_{0}

and

μ_{1}

satisfy

LSI (α_{0})

and

LSI (α_{1})

. The following theorem provides an extension of the result ([1] Theorem 4.4) to the multidimensional case for the log–Sobolev inequality.

Theorem 2 (LSI for absolutely continuous mixtures).

Let

μ_{0}

and

μ_{1}

satisfy

LSI (α_{0})

and

LSI (α_{1})

, respectively, and let both measures be absolutely continuous to each other. Then, for all

p \in (0, 1)

and

q = 1 - p

, the mixture measure

μ_{p} = p μ_{0} + q μ_{1}

satisfies

LSI (α_{p})

with

\frac{1}{α_{p}} \leq \begin{cases} \frac{1 + q λ_{p}}{α_{0}}, & \text{if } \frac{α_{1}}{α_{0}} \geq 1 + p λ_{p} (1 + χ_{1} (1 + q λ_{p})), \\ \frac{1 + p λ_{p}}{α_{1}}, & \text{if } \frac{α_{0}}{α_{1}} \geq 1 + q λ_{p} (1 + χ_{0} (1 + p λ_{p})), \\ \frac{p (1 + q λ_{p}) χ_{1} + p q λ_{p} χ_{0} χ_{1} + q (1 + p λ_{p}) χ_{0}}{α_{0} p χ_{1} + α_{1} q χ_{0}}, & \text{else.} \end{cases}

(14)

Hereby,

χ_{0}

and

χ_{1}

are given in (5) and

λ_{p}

is used for the inverse logarithmic mean

λ_{p} : = \frac{1}{Λ (p, q)} = \frac{\log p - \log q}{p - q}, for p \neq \frac{1}{2}, and λ_{1 / 2} = 2 .

Proof.

The starting point is the splitting obtained from (13). The variances and mean-difference in (13) can be estimated in the same way as in the proof (6) of Theorem 1. Additionally, the fact [6] that

LSI (α)

implies

PI (α)

is used to derive for any

η > 0

and any

ϑ \in (0, 1)

\begin{aligned}
{Ent}_{μ_{p}} [f^{2}] & \leq \frac{1}{α_{0}} (1 + q λ_{p} (1 + (1 + η) ϑ^{2} χ_{0})) \int {| \nabla f |}^{2} p d μ_{0} \\
& \quad + \frac{1}{α_{1}} (1 + p λ_{p} (1 + (1 + η^{- 1}) {(1 - ϑ)}^{2} χ_{1})) \int {| \nabla f |}^{2} q d μ_{1} \\
& \leq \max \{ \frac{1 + q λ_{p} (1 + (1 + η) ϑ^{2} χ_{0})}{α_{0}}, \frac{1 + p λ_{p} (1 + (1 + η^{- 1}) {(1 - ϑ)}^{2} χ_{1})}{α_{1}} \} \int {| \nabla f |}^{2} d μ_{p} .
\end{aligned}

(15)

By introducing reduced log–Sobolev constants

{\tilde{α}}_{0} : = \frac{α_{0}}{1 + q λ_{p}} and {\tilde{α}}_{1} : = \frac{α_{1}}{1 + p λ_{p}},

(16)

as well as defining the constants

{\tilde{χ}}_{0}

and

{\tilde{χ}}_{1}

by

{\tilde{χ}}_{0} : = \frac{χ_{0} λ_{p}}{1 + q λ_{p}} and {\tilde{χ}}_{1} = \frac{χ_{1} λ_{p}}{1 + p λ_{p}},

(17)

the bound (15) takes the form

{Ent}_{μ_{p}} [f^{2}] \leq \max \{ \frac{1 + (1 + η) ϑ^{2} q {\tilde{χ}}_{0}}{{\tilde{α}}_{0}}, \frac{1 + (1 + \frac{1}{η}) {(1 - ϑ)}^{2} p {\tilde{χ}}_{1}}{{\tilde{α}}_{1}} \} \int {| \nabla f |}^{2} d μ_{p} .

(18)

The estimate (18) has the same structure as the estimate (6), where

{\tilde{α}}_{0}

,

{\tilde{α}}_{1}

play the role of

ϱ_{0}

,

ϱ_{1}

and

{\tilde{χ}}_{0}

,

{\tilde{χ}}_{1}

the roles of

χ_{0}

,

χ_{1}

. Hence, the optimization procedure from the proof of Theorem 1 applies to this case and the last step consists of translating the constants

{\tilde{α}}_{0}

,

{\tilde{α}}_{1}

and

{\tilde{χ}}_{0}

,

{\tilde{χ}}_{1}

back to the original ones. □

Remark 3.

Let the bound for

\frac{1}{α_{p}}

in the last case of (14) be denoted by

\frac{1}{A_{p}}

. Then, the proof shows that it can be bounded above and below in the same way as in (7) in terms of the reduced constants (16) and (17)

\max {\frac{1 + q λ_{p}}{α_{0}}, \frac{1 + p λ_{p}}{α_{1}}} \leq \frac{1}{A_{p}} \leq \max {\frac{1 + q λ_{p} (1 + χ_{0})}{α_{0}}, \frac{1 + p λ_{p} (1 + χ_{1})}{α_{1}}} .

In the case

χ_{0} = χ_{1} = χ

, the simplified bound

\frac{1}{α_{p}} \leq \frac{1 + p q λ_{p} (χ + 2)}{p α_{0} + q α_{1}}

(19)

holds. The inverse logarithmic mean

λ_{p} = \frac{1}{Λ (p, q)}

blows up logarithmically for

p \to 0, 1

. Hence, the bounds in the first two cases of (14) diverge logarithmically in these limits, no matter how small

χ_{0}

and

χ_{1}

are, whereas (19) stays bounded in p. This logarithmic divergence looks at first sight artificial, especially in comparison to (8) showing that the Poincaré constant is bounded. However, the next section with examples shows that this blow-up may actually occur. Hence, the bound in (14) is actually optimal on this level of generality.

A statement analogous to Corollary 1 for the Poincaré constant is obtained for the log–Sobolev constant; its proof follows along the same lines and is omitted.

Corollary 2.

Let

μ_{0} ≪ μ_{1}

and

μ_{0}

,

μ_{1}

satisfy

LSI (α_{0})

and

LSI (α_{1})

, respectively. Then, for any

p \in (0, 1)

and

q = 1 - p

, the mixture measure

μ_{p} = p μ_{0} + q μ_{1}

satisfies

LSI (α_{p})

with

\frac{1}{α_{p}} \leq \max {\frac{1 + q λ_{p}}{α_{0}}, \frac{1 + p λ_{p} (1 + χ_{1})}{α_{1}}} .

Likewise, if

μ_{1} ≪ μ_{0}

, then

\frac{1}{α_{p}} \leq \max {\frac{1 + p λ_{p}}{α_{1}}, \frac{1 + q λ_{p} (1 + χ_{0})}{α_{0}}}

holds.
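The different behavior of the one-sided bounds of Corollaries 1 and 2 for small mixture parameter can be made visible with a few lines of code. The sketch below evaluates both bounds for the case that the first component is absolutely continuous with respect to the second; the function names and the sample constants are assumptions chosen for illustration.

```python
import math

def inverse_log_mean(p):
    """lambda_p = 1 / Lambda(p, 1 - p) for 0 < p < 1."""
    q = 1.0 - p
    if p == q:
        return 2.0
    return (math.log(p) - math.log(q)) / (p - q)

def pi_bound_cor1(p, rho0, rho1, chi1):
    """Bound on 1/rho_p from Corollary 1 for mu_0 << mu_1."""
    return max(1.0 / rho0, (1.0 + p * chi1) / rho1)

def lsi_bound_cor2(p, alpha0, alpha1, chi1):
    """Bound on 1/alpha_p from Corollary 2 for mu_0 << mu_1."""
    q = 1.0 - p
    lam = inverse_log_mean(p)
    return max((1.0 + q * lam) / alpha0,
               (1.0 + p * lam * (1.0 + chi1)) / alpha1)

# Hypothetical constants: unit Poincare and log-Sobolev constants, chi_1 = 3.
for p in (1e-2, 1e-4, 1e-8):
    print(p, pi_bound_cor1(p, 1.0, 1.0, 3.0), lsi_bound_cor2(p, 1.0, 1.0, 3.0))
```

In this sample, the Poincaré bound stays below 1 + χ_1, while the log–Sobolev bound grows like the logarithm of 1/p.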

4. Examples

The results of Theorems 1 and 2 are illustrated for some specific examples and also compared to the results of ([1], Section 4.5), which, however, are restricted to one-dimensional measures. Although the criterion of Theorems 1 and 2 can only give upper bounds for the multidimensional case, when at least one of the mixture components is absolutely continuous to the other, it is still possible to obtain the optimal results in terms of scaling in the mixture parameter

p \to 0, 1

.

4.1. Mixture of Two Gaussian Measures with Equal Covariance Matrix

Let us consider the mixtures of two Gaussians

μ_{0} : = N (0, Σ)

and

μ_{1} : = N (y, Σ)

, for some

y \in R^{n}

and a strictly positive definite covariance matrix

Σ \geq σ Id

in the sense of quadratic forms for some

σ > 0

. Then,

μ_{0}

and

μ_{1}

satisfy

PI (σ^{- 1})

and

LSI (σ^{- 1})

by the Bakry–Émery criterion (Theorem A1), i.e.,

ϱ_{0} = α_{0} = ϱ_{1} = α_{1} = σ^{- 1}

. Furthermore, the

χ^{2}

-distance between

μ_{0}

and

μ_{1}

can be explicitly calculated as a Gaussian integral (see also [7])

\begin{aligned}
χ_{0} = χ_{1} & = \frac{1}{{(2 π)}^{\frac{n}{2}} \sqrt{\det Σ}} \int \exp (- x \cdot Σ^{- 1} x + \frac{1}{2} (x - y) \cdot Σ^{- 1} (x - y)) d x - 1 \\
& = \exp (y \cdot Σ^{- 1} y) \frac{1}{{(2 π)}^{\frac{n}{2}} \sqrt{\det Σ}} \int \exp (- \frac{1}{2} (x + y) \cdot Σ^{- 1} (x + y)) d x - 1 \\
& = \exp (y \cdot Σ^{- 1} y) - 1 \leq e^{{| y |}^{2} / σ} - 1 .
\end{aligned}
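In one dimension with a scalar variance, the Gaussian integral above evaluates to exp(y²/variance) − 1, which can be confirmed by a simple quadrature. The following sketch uses a trapezoidal rule; the function name `chi2_numeric`, the integration window and the step count are assumptions made for this illustration.

```python
import math

def chi2_numeric(y, var, steps=20000):
    """Trapezoidal approximation of int (mu1^2 / mu0) dx - 1 in dimension
    one, for mu0 = N(0, var) and mu1 = N(y, var)."""
    # The integrand is a Gaussian bump centred at 2y with variance var,
    # so a window of twelve standard deviations around 2y is ample.
    half_width = 12.0 * math.sqrt(var)
    centre = 2.0 * y
    h = 2.0 * half_width / steps
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    total = 0.0
    for k in range(steps + 1):
        x = centre - half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        # mu1(x)^2 / mu0(x), written with a single exponential.
        total += weight * norm * math.exp(-(x - y) ** 2 / var + x * x / (2.0 * var))
    return total * h - 1.0

# Compare the quadrature with the closed form exp(y^2 / var) - 1.
print(chi2_numeric(1.0, 1.0), math.exp(1.0) - 1.0)
```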

Then, the bound from Theorem 1 in the form (8) yields

\frac{1}{ϱ_{p}} \leq (1 + p q (e^{{| y |}^{2} / σ} - 1)) σ .

(20)

Likewise, Theorem 2 in the form (19) leads to the following bound on the log–Sobolev constant

\frac{1}{α_{p}} \leq (1 + p q λ_{p} (e^{{| y |}^{2} / σ} + 1)) σ .

By noting that

p q \leq \frac{1}{4} \quad \text{and} \quad p q λ_{p} \leq \frac{1}{2}

, both constants stay uniformly bounded in p. The large exponential factor in the distance

e^{{| y |}^{2} / σ}

cannot be avoided on this level of generality since the mixed measure

μ_{p}

has a bimodal structure leading to metastable effects ([2], Remark 2.20).

The result ([1] Corollary 4.7) deduced the following bound on

\frac{1}{ϱ_{p}}

for the mixture of two one-dimensional standard Gaussians

σ = 1

in (20)

\frac{1}{ϱ_{p}} \leq 1 + p q {| y |}^{2} (Φ (| y |) e^{{| y |}^{2}} + \frac{| y |}{\sqrt{2 π}} e^{{| y |}^{2} / 2} + \frac{1}{2}),

(21)

where

Φ (a) = \frac{1}{\sqrt{2 π}} \int_{- \infty}^{a} e^{- y^{2} / 2} d y

. The elementary inequalities

e^{a^{2}} - 1 \leq a^{2} e^{a^{2}}

and

Φ (a) \geq 1 + \frac{a}{\sqrt{2 π}} e^{- a^{2} / 2}

for all

a \in R

show that the bound (20) is better than the bound (21) for all parameter values

p \in [0, 1]

and

| y | \geq 0

.
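The comparison between (20) and (21) can also be checked numerically for unit variance in one dimension. The sketch below evaluates both right-hand sides on a grid; the function names are assumptions, and the Gaussian distribution function is expressed through the error function of the standard library.

```python
import math

def std_normal_cdf(a):
    """Distribution function Phi of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def bound_20(p, y):
    """Right-hand side of (20) for sigma = 1."""
    q = 1.0 - p
    return 1.0 + p * q * (math.exp(y * y) - 1.0)

def bound_21(p, y):
    """Right-hand side of (21)."""
    q = 1.0 - p
    a = abs(y)
    bracket = (std_normal_cdf(a) * math.exp(a * a)
               + a / math.sqrt(2.0 * math.pi) * math.exp(a * a / 2.0)
               + 0.5)
    return 1.0 + p * q * a * a * bracket

# Tabulate both bounds for a balanced mixture and several shifts y.
for y in (0.5, 1.0, 2.0):
    print(y, bound_20(0.5, y), bound_21(0.5, y))
```

On this grid, the value of (20) never exceeds the value of (21), and both coincide for y = 0.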

Hence, this example shows that, for mixtures with components that are absolutely continuous to each other as well as whose tail behavior is controlled in terms of the

χ^{2}

-distance, Theorems 1 and 2 even improve the bound of [1] and generalize it to the multidimensional case.

4.2. Mixture of a Gaussian and Sub-Gaussian Measure

Let us consider

μ_{1} = N (0, Σ)

where

Σ \geq σ Id

is strictly positive definite. In addition, let the density of

μ_{0}

with respect to

μ_{1}

be bounded uniformly by some

κ \geq 1

, that is the relative density satisfies

d μ_{0} / d μ_{1} \leq κ

almost everywhere on

R^{n}

. By the Bakry–Émery criterion (Theorem A1),

ϱ_{1} = α_{1} = \frac{1}{σ}

holds. Furthermore, an upper bound for

χ_{1}

is obtained by the assumption on the bound on the relative density

χ_{1} = {Var}_{μ_{1}} [\frac{μ_{0}}{μ_{1}}] = \int {(\frac{μ_{0}}{μ_{1}})}^{2} d μ_{1} - 1 \leq κ^{2} - 1 .

Provided that

μ_{0}

satisfies

PI (ϱ_{0})

, the Poincaré constant of the mixture

μ_{p} = p μ_{0} + q μ_{1}

satisfies by Corollary 1 the estimate

\frac{1}{ϱ_{p}} \leq \max {\frac{1}{ϱ_{0}}, (1 + p (κ^{2} - 1)) σ} .

Similarly, Corollary 2 provides whenever

μ_{0}

satisfies

LSI (α_{0})

the following bound for the log–Sobolev constant of the mixture measure

μ_{p}

\frac{1}{α_{p}} \leq \max {\frac{1 + q λ_{p}}{α_{0}}, (1 + p λ_{p} κ^{2}) σ} .

In this case, the logarithmic blow-up of the log–Sobolev constant cannot be ruled out for

p \to 0, 1

, without any further information on

μ_{0}

.

4.3. Mixture of Two Centered Gaussians with Different Variance

For

μ_{0} = N (0, Id)

and

μ_{1} = N (0, σ Id)

, the Bakry–Émery criterion (Theorem A1) implies

ϱ_{0} = α_{0} = 1

and

ϱ_{1} = α_{1} = σ^{- 1}

. The calculation of the

χ^{2}

-distance can be done using the spherical symmetry and is reduced to the one dimensional integral

χ_{0} = \int \frac{d μ_{1}}{d μ_{0}} d μ_{1} - 1 = \frac{H^{n - 1} (\partial B_{1})}{{(2 π)}^{\frac{n}{2}} σ^{n}} \int_{R^{+}} r^{n - 1} e^{- (\frac{1}{σ} - \frac{1}{2}) r^{2}} d r - 1 .

Hereby,

H^{n - 1} (\partial B_{1})

denotes the

(n - 1)

-dimensional Hausdorff measure of the sphere

\partial B_{1} = \{ x \in R^{n} : | x | = 1 \}

. The integral does only exist for

σ < 2

. In this case, it can be evaluated and simplified. The bound for the constant

χ_{1}

follows by duality under the substitution

σ \mapsto σ^{- 1}

and is given by

χ_{0} = \begin{cases} \frac{1}{{(σ (2 - σ))}^{\frac{n}{2}}} - 1, & σ < 2, \\ + \infty, & σ \geq 2, \end{cases} \quad \text{and} \quad χ_{1} = \begin{cases} \frac{1}{{(σ^{- 1} (2 - σ^{- 1}))}^{\frac{n}{2}}} - 1, & σ > \frac{1}{2}, \\ + \infty, & σ \leq \frac{1}{2} . \end{cases}

(22)

If

σ \leq 1 / 2

, that is for

χ_{1} = \infty

, the bound given in Corollary 1 yields

\frac{1}{ϱ_{p}} \leq \max {σ, 1 + q χ_{0}} = \max {σ, (1 - q) + \frac{q}{{(σ (2 - σ))}^{\frac{n}{2}}}} = p + \frac{q}{{(σ (2 - σ))}^{\frac{n}{2}}} .

Similarly, if

σ \geq 2

, that is, for

χ_{0} = \infty

, the bound becomes

\frac{1}{ϱ_{p}} \leq \max {1, (1 + p χ_{1}) σ} \leq σ (q + \frac{p}{{(σ^{- 1} (2 - σ^{- 1}))}^{\frac{n}{2}}}) .

In the case

\frac{1}{2} < σ < 2

, the interpolation bound (4) of Theorem 1 could be applied. However, the scaling behavior for the Poincaré constant can already be observed with the estimate (7) in Remark 2, where again, thanks to the symmetry

σ \mapsto \frac{1}{σ}

,

\frac{1}{ϱ_{p}} \leq \begin{cases} p + \frac{q}{{(σ (2 - σ))}^{\frac{n}{2}}}, & \text{for } σ \leq 1, \\ σ (q + \frac{p}{{(σ^{- 1} (2 - σ^{- 1}))}^{\frac{n}{2}}}), & \text{for } σ \geq 1, \end{cases}

(23)

holds. Hence, the Poincaré constant stays bounded for the full range of parameter

p \in [0, 1]

and

σ > 0

.

In the case for the log–Sobolev constant, the bound from Corollary 2 gives

\frac{1}{α_{p}} \leq \begin{cases} 1 + \frac{q λ_{p}}{{(σ (2 - σ))}^{\frac{n}{2}}}, & σ \leq 1, \\ σ (1 + \frac{p λ_{p}}{{(σ^{- 1} (2 - σ^{- 1}))}^{\frac{n}{2}}}), & σ \geq 1 . \end{cases}

(24)

The bound (24) blows up logarithmically for

p \to 0, 1

in general. However, the special case

σ = 1

, although trivially, allows for the combined bound

\frac{1}{α_{p}} \leq 1 + \min \{ p, q \} λ_{p}

, which stays bounded. This behavior can be extended to the range

σ \in (\frac{1}{2}, 2)

thanks to (22) and the interpolation bound of Theorem 2.
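The contrast between the bounds (23) and (24) can be tabulated directly. The sketch below evaluates both as functions of the mixture parameter, the variance parameter and the dimension; the function names and the sample parameters are assumptions for illustration.

```python
import math

def inverse_log_mean(p):
    """lambda_p = 1 / Lambda(p, 1 - p) for 0 < p < 1."""
    q = 1.0 - p
    if p == q:
        return 2.0
    return (math.log(p) - math.log(q)) / (p - q)

def pi_bound_23(p, sigma, n):
    """Right-hand side of (23)."""
    q = 1.0 - p
    if sigma <= 1.0:
        return p + q / (sigma * (2.0 - sigma)) ** (n / 2.0)
    s = 1.0 / sigma
    return sigma * (q + p / (s * (2.0 - s)) ** (n / 2.0))

def lsi_bound_24(p, sigma, n):
    """Right-hand side of (24)."""
    q = 1.0 - p
    lam = inverse_log_mean(p)
    if sigma <= 1.0:
        return 1.0 + q * lam / (sigma * (2.0 - sigma)) ** (n / 2.0)
    s = 1.0 / sigma
    return sigma * (1.0 + p * lam / (s * (2.0 - s)) ** (n / 2.0))

# Hypothetical parameters: sigma = 1/2 in dimension two, p tending to zero.
for p in (1e-3, 1e-6, 1e-9):
    print(p, pi_bound_23(p, 0.5, 2), lsi_bound_24(p, 0.5, 2))
```

In this sample, the first column of bounds converges as p tends to zero, while the second one keeps growing.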

The result (23) can be compared with the one of ([1], Section 4.5.2), which states that, for some

C > 0

, all

σ > 1

and

p \in (0, 1 / 2),

\frac{1}{ϱ_{p, CM}} \leq σ + C p^{\frac{1}{σ - 1}}

(25)

holds. In general, depending on the constant C, the bound (23) is better for

σ

small, whereas the scaling in

σ

is better for (25), namely linear instead of

σ^{\frac{3}{2}}

as in (23).

4.4. Mixture of Uniform and Gaussian Measure

Let

μ_{0} = N (0, Id)

and

μ_{1} = \frac{1}{H^{n} (B_{1})} 1_{B_{1}}

with

B_{1}

the unit ball around zero. Then,

ϱ_{0} = 1

holds by the Bakry–Émery criterion (Theorem A1) and

ϱ_{1} \geq \frac{π^{2}}{diam {(B_{1})}^{2}} = \frac{π^{2}}{4}

by the result of [8]. Furthermore, since

μ_{1} ≪ μ_{0}

, the

χ^{2}

-distance between

μ_{0}

and

μ_{1}

becomes thanks to the spherical symmetry

χ_{0} + 1 = \int {(\frac{μ_{1}}{μ_{0}})}^{2} d μ_{0} = \frac{{(2 π)}^{\frac{n}{2}}}{H^{n} {(B_{1})}^{2}} \int_{B_{1}} e^{{| x |}^{2} / 2} d x = \frac{{(2 π)}^{\frac{n}{2}} H^{n - 1} (\partial B_{1})}{H^{n} {(B_{1})}^{2}} \int_{0}^{1} r^{n - 1} e^{r^{2} / 2} d r .

(26)

The volume

H^{n} (B_{1})

and the surface area

H^{n - 1} (\partial B_{1})

of the unit ball satisfy the following relations

\frac{H^{n - 1} (\partial B_{1})}{H^{n} (B_{1})} = n and \frac{{(2 π)}^{\frac{n}{2}}}{H^{n} (B_{1})} = 2^{\frac{n}{2}} Γ (\frac{n}{2} + 1) = : g_{n} .

(27)

The integral on the right-hand side in (26) can be bounded below by

\frac{1}{n}

and above by

\frac{\sqrt{e}}{n}

, which altogether yields

g_{n} \leq χ_{0} + 1 \leq \sqrt{e} g_{n} .

Corollary 1 implies that the Poincaré constant of the mixture

μ_{p} = p μ_{0} + q μ_{1}

satisfies

\frac{1}{ϱ_{p}} \leq \max {\frac{1}{ϱ_{1}}, 1 + q χ_{0}} \leq p + q \sqrt{e} g_{n},

(28)

where the last inequality follows from

\frac{4}{π^{2}} \leq p + q \sqrt{e} g_{n}

for

n \geq 1

and all

p \in [0, 1]

.
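The constant from (27) and the two-sided estimate derived from (26) can be evaluated numerically. The sketch below computes the constant via the Gamma function and the radial integral by a trapezoidal rule; the function names and the step count are assumptions for illustration.

```python
import math

def g(n):
    """The constant g_n = 2^(n/2) * Gamma(n/2 + 1) from (27)."""
    return 2.0 ** (n / 2.0) * math.gamma(n / 2.0 + 1.0)

def radial_integral(n, steps=20000):
    """Trapezoidal approximation of int_0^1 r^(n-1) exp(r^2 / 2) dr."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * r ** (n - 1) * math.exp(r * r / 2.0)
    return total * h

def chi0_plus_one(n):
    """chi_0 + 1 from (26), rewritten with the relations (27)."""
    return g(n) * n * radial_integral(n)

def poincare_bound_28(p, n):
    """Right-hand side of (28)."""
    return p + (1.0 - p) * math.sqrt(math.e) * g(n)

# Lower estimate, numerical value and upper estimate for small dimensions.
for n in (1, 2, 3):
    print(n, g(n), chi0_plus_one(n), math.sqrt(math.e) * g(n))
```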

The estimate of the log–Sobolev constant uses

α_{0} = 1

by the Bakry–Émery criterion (Theorem A1) and

α_{1} \geq \frac{2}{e}

from (A1). Then, Corollary 2 yields the bound

\begin{matrix} \frac{1}{α_{p}} \leq \max {\frac{1 + p λ_{p}}{α_{1}}, \frac{1 + q λ_{p} (1 + χ_{0})}{α_{0}}} \leq \max {\frac{(1 + p λ_{p}) e}{2}, 1 + q λ_{p} \sqrt{e} g_{n}} . \end{matrix}

(29)

The bound blows up logarithmically for

p \to 0, 1

.

The blow-up for

p \to 1

is artificial, which can be shown by combining the Bakry–Émery criterion with the Holley–Stroock perturbation principle. To do so, the Hamiltonian of

μ_{p}

is decomposed into a convex function and some error term

\begin{aligned}
H_{p} (x) & : = - \log μ_{p} (x) = - \log (\frac{p}{{(2 π)}^{\frac{n}{2}}} e^{- \frac{{| x |}^{2}}{2}} + \frac{1 - p}{H^{n} (B_{1})} 𝟙_{B_{1} (0)} (x)) \\
& = - \log (e^{- \frac{{| x |}^{2}}{2} + \frac{1}{2}} + \frac{1 - p}{p} \frac{{(2 π)}^{\frac{n}{2}}}{H^{n} (B_{1})} \sqrt{e} 𝟙_{B_{1} (0)} (x)) + C_{p, n} \\
& = \frac{{| x |}^{2} - 1}{2} - ψ_{p} (x) + {\tilde{C}}_{p, n},
\end{aligned}

(30)

where

ψ_{p} (x) : = (\log (e^{- \frac{{| x |}^{2}}{2} + \frac{1}{2}} + \frac{1 - p}{p} \frac{{(2 π)}^{\frac{n}{2}}}{H^{n} (B_{1})} \sqrt{e}) + \frac{{| x |}^{2} - 1}{2}) 𝟙_{B_{1} (0)} (x) .

The function

ψ_{p}

is radially monotone towards the boundary of

B_{1}

, which yields for

| x | \to 1

the bound

0 \leq ψ_{p} (x) \leq \log (1 + \frac{1 - p}{p} \frac{{(2 π)}^{\frac{n}{2}}}{H^{n} (B_{1})} \sqrt{e}) .

(31)

From (30), the Hamiltonian

H_{p}

is compared with the convex potential

\frac{{| x |}^{2} - 1}{2}

with the bound (31) on the perturbation

ψ_{p}

. Together, this yields by the Bakry–Émery criterion (Theorem A1) and the Holley–Stroock perturbation principle (Theorem A2) that

μ_{p}

satisfies

PI ({\tilde{ϱ}}_{p})

and

LSI ({\tilde{α}}_{p})

with

\frac{1}{{\tilde{ϱ}}_{p}} \leq \frac{1}{{\tilde{α}}_{p}} \leq 1 + \frac{1 - p}{p} \sqrt{e} g_{n},

(32)

where

g_{n}

is the same constant as in (27). This bound only blows up for

p \to 0

. However, the blow-up is like

\frac{1}{p}

. Furthermore, the bound on the Poincaré constant is worse than the one from (28). Therefore, both approaches need to be combined.

The combination of the bounds obtained in (29) and (32) results in the improved bound

\frac{1}{α} \leq C_{n} (1 + q λ_{p} g_{n}), with C_{n} some universal constant,

(33)

which only logarithmically blows up for

p \to 0

.

This example shows that the Poincaré constant and log–Sobolev constant may have different scaling behavior for

p \to 0

. Indeed, Ref. [1] shows, for this specific mixture in the one-dimensional case, that the log–Sobolev constant can be bounded below by

C | \log p | \leq \frac{1}{α},

for p small enough and a constant C independent of p. In one dimension, lower bounds are accessible via the functional introduced by Bobkov–Götze [9]. Hence, the bound (33) is optimal in the one-dimensional case, which strongly indicates optimality also in the higher-dimensional case in terms of scaling in the mixture ratio p.

To conclude, the Bakry–Émery criterion in combination with the Holley–Stroock perturbation principle is effective for detecting blow-ups of the log–Sobolev constant for mixtures, but has, in general, the wrong scaling behavior in the mixing parameter p. On the other hand, the criterion presented in Theorem 2 provides the right scaling of the blow-up but may give artificial blow-ups, if the components of the mixture become singular in the sense of the

χ^{2}

-distance.

5. Conclusions

Mixtures have recently been investigated in many different applications, and the main results of this work may be useful for the investigation of asymmetric Kalman filter estimates [10], the study of asymmetric mixtures in marine biology [11], econometrics [12], gradient-quadratic and fixed-point iteration algorithms [7], and estimates of multivariate Gaussian mixtures [13].

Theorems 1 and 2 provide a simple estimate of the Poincaré and log–Sobolev constants of a two-component mixture measure

μ_{p} = p μ_{0} + q μ_{1}

if the

χ^{2}

-distance of

μ_{0}

and

μ_{1}

is bounded and each of the components satisfies a Poincaré or log–Sobolev inequality. Section 4 reviews several examples with the following findings:

For mixtures with components that are mutually absolutely continuous and whose tail behavior is mutually controlled in terms of the $χ^{2}$-distance, Theorems 1 and 2 are very effective.
If only one of the components is absolutely continuous with respect to the other one with bounded density, then it is still possible to obtain a bound on the Poincaré and log–Sobolev constants. However, the log–Sobolev constant blows up logarithmically as the mixture parameter p approaches 0 or 1. It is shown for specific examples that, for at least one of the limits $p \to 0$ or $p \to 1$, this blow-up is not an artifact of the applied method.
A necessary condition for the finiteness of the $χ^{2}$-distance between two measures is that at least one of the measures $μ_{0}$ and $μ_{1}$ is absolutely continuous with respect to the other one, which in particular provides a mixture with connected support. This condition is too strong since one can easily decompose a measure into a mixture, where the joint support of the components is a null set. In this case, the present approach would not be helpful, even though the mixture may still satisfy both functional inequalities.

Future work could overcome the limits of the present approach by revisiting the crucial ingredient for both the Poincaré and log–Sobolev inequality, which was the representation of the mean-difference in Lemma 1 regarding covariances. Formula (3) from Lemma 1 applies only in the case where both measures are mutually absolutely continuous. However, the idea of an interpolation bound can be generalized to suitable weighted Sobolev spaces. For this, since

μ_{0}, μ_{1} ≪ μ_{p}

for all

p \in (0, 1)

, one can formally write and estimate

E_{μ_{0}} [f] - E_{μ_{1}} [f] = {Cov}_{μ_{p}} \left[ f, \frac{d μ_{0}}{d μ_{p}} - \frac{d μ_{1}}{d μ_{p}} \right] \leq {‖ f ‖}_{{\dot{H}}^{1} (μ_{p})} {‖ \frac{d μ_{0}}{d μ_{p}} - \frac{d μ_{1}}{d μ_{p}} ‖}_{{\dot{H}}^{- 1} (μ_{p})} .

(34)

Hereby,

{\dot{H}}^{1} (μ_{p})

is the homogeneous weighted

{\dot{H}}^{1}

space with norm

{‖ f ‖}_{{\dot{H}}^{1} (μ_{p})}^{2} : = \int {| \nabla f |}^{2} d μ_{p}

and

{\dot{H}}^{- 1} (μ_{p})

is its dual space with norm

{‖ ω ‖}_{{\dot{H}}^{- 1} (μ_{p})}^{2} : = \sup_{f \in {\dot{H}}^{1} (μ_{p})} \{ 2 {〈 f, ω 〉}_{μ_{p}} - {‖ f ‖}_{{\dot{H}}^{1} (μ_{p})}^{2} \} .

The representation (34) is fruitful for many more applications in which the components of the mixture do not need to be absolutely continuous. Similar ideas for estimating mean-differences were successfully applied in the metastable setting [2,14], in which suitable bounds on the

{\dot{H}}^{- 1}

-norm are obtained. In this regard, the bound (34) promises many interesting new insights for future studies.
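The covariance identity in the first step of (34) can be illustrated on a finite state space, where it holds exactly even if the two components have disjoint supports. The following sketch is a hypothetical toy example, not part of the article: the four-point state space, the weights `mu0` and `mu1`, the test function `f`, and the helper names are assumptions chosen for illustration, and the sketch checks only the identity, not the inequality in (34).

```python
from fractions import Fraction as F

def expectation(weights, f):
    """Expectation of f under a probability vector on a finite set."""
    return sum(w * v for w, v in zip(weights, f))

def covariance(weights, f, g):
    """Covariance of f and g under a probability vector on a finite set."""
    fg = [a * b for a, b in zip(f, g)]
    return expectation(weights, fg) - expectation(weights, f) * expectation(weights, g)

# Hypothetical components with disjoint supports on a four-point space.
mu0 = [F(1, 2), F(1, 2), F(0), F(0)]
mu1 = [F(0), F(0), F(1, 4), F(3, 4)]
p = F(1, 3)

# Mixture mu_p = p * mu0 + (1 - p) * mu1; both components are dominated by it.
mu_p = [p * a + (1 - p) * b for a, b in zip(mu0, mu1)]

# Difference of the densities d mu0 / d mu_p - d mu1 / d mu_p.
density_diff = [(a - b) / m for a, b, m in zip(mu0, mu1, mu_p)]

f = [F(2), F(-1), F(5), F(3)]  # arbitrary test function

lhs = expectation(mu0, f) - expectation(mu1, f)
rhs = covariance(mu_p, f, density_diff)
print(lhs, rhs)
```

Exact rational arithmetic is used so that both sides can be compared without rounding. The identity holds because the density difference has mean zero under μ_p.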

Funding

This research received no external funding.

Acknowledgments

This work is based on part of the Ph.D. thesis [15] written under the supervision of Stephan Luckhaus at the University of Leipzig. The author thanks the Max-Planck-Institute for Mathematics in the Sciences in Leipzig for providing excellent working conditions. The author thanks Georg Menz for many discussions on mixtures and metastability.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Bakry–Émery Criterion and Holley–Stroock Perturbation Principle

Two classical conditions for Poincaré and log–Sobolev inequalities are stated in this part of the appendix. The Bakry–Émery criterion relates the convexity of the Hamiltonian of a measure and positive curvature of the underlying space to constants for the Poincaré and log–Sobolev inequalities. Although the result is classical for the case of

R^{n}

, the result for general convex domains was established in ([16], Theorem 2.1).

Theorem A1

(Bakry–Émery criterion ([17] Proposition 3, Corollary 2), ([16], Theorem 2.1)). Let

Ω \subset R^{n}

be convex and let

H : Ω \to R

be a Hamiltonian with Gibbs measure

μ (d x) = Z_{μ}^{- 1} e^{- H (x)} 𝟙_{Ω} (x) d x

and assume that

\nabla^{2} H (x) \geq κ > 0

for all

x \in supp μ

. Then, μ satisfies

PI (ϱ)

and

LSI (α)

with

ϱ \geq κ and α \geq κ .

The second condition is the Holley–Stroock perturbation principle, which allows one to show Poincaré and log–Sobolev inequalities for a very large class of measures.

Theorem A2

(Holley–Stroock perturbation principle ([18], p. 1184)). Let

Ω \subset R^{n}

and

H : Ω \to R

and

ψ : Ω \to R

be a bounded function. Let μ and

\tilde{μ}

be the Gibbs measures with Hamiltonian H and

H + ψ

, respectively

μ (d x) = \frac{1}{Z_{μ}} e^{- H (x)} 𝟙_{Ω} (x) d x and \tilde{μ} (d x) = \frac{1}{Z_{\tilde{μ}}} e^{- H (x) - ψ (x)} 𝟙_{Ω} (x) d x .

If μ satisfies

PI (ϱ)

and

LSI (α)

, then

\tilde{μ}

satisfies

PI (\tilde{ϱ})

and

LSI (\tilde{α})

, respectively. Hereby, the constants satisfy

\tilde{ϱ} \geq e^{- {osc}_{Ω} ψ} ϱ and \tilde{α} \geq e^{- {osc}_{Ω} ψ} α,

where

{osc}_{Ω} ψ : = \sup_{Ω} ψ - \inf_{Ω} ψ

.

Proofs of Theorems A1 and A2 relying on semigroup theory can be found in the exposition by Ledoux ([6], Corollary 1.4, Corollary 1.6 and Lemma 1.2).

Example A1 (Uniform measure on the ball).

The measure

μ_{1} = \frac{1}{H^{n} (B_{1})} 𝟙_{B_{1}}

, where

B_{1}

is the unit ball around zero, satisfies

LSI (α_{1})

with

α_{1} \geq \frac{2}{e} .

(A1)

The proof compares the measure

μ_{1}

with a family of measures

ν_{σ} (d x) = \frac{1}{Z_{σ}} \exp (- σ {| x |}^{2} + \frac{σ}{2}) 𝟙_{B_{1}} (x) d x for σ > 0 .

Then, it holds that

ν_{σ}

satisfies

LSI (2 σ)

by the Bakry–Émery criterion (Theorem A1). Moreover, it holds that

{osc}_{x \in B_{1}} (- σ {| x |}^{2} + σ / 2) = σ

and hence

μ_{1}

satisfies

LSI (2 σ e^{- σ})

by the Holley–Stroock perturbation principle (Theorem A2) for all

σ > 0

. Optimizing the expression

2 σ e^{- σ}

in σ gives the bound (A1).
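The optimization in σ can be reproduced numerically. The sketch below is illustrative only: the function names and the grid of σ values are assumptions, and the oscillation is computed from the definition in Theorem A2 (supremum minus infimum) for the radial profile r ↦ −σr² + σ/2 on [0, 1].

```python
import math

def oscillation(sigma: float, steps: int = 1000) -> float:
    """sup minus inf of r -> -sigma * r^2 + sigma / 2 over a grid on [0, 1]."""
    values = [-sigma * (i / steps) ** 2 + sigma / 2 for i in range(steps + 1)]
    return max(values) - min(values)

def lsi_lower_bound(sigma: float) -> float:
    """Bakry-Emery constant 2 * sigma damped by the Holley-Stroock factor."""
    bakry_emery = 2 * sigma  # Hessian of sigma * |x|^2 equals 2 * sigma * Id
    return bakry_emery * math.exp(-oscillation(sigma))

# Illustrative grid of sigma values in (0, 5].
sigmas = [k / 100 for k in range(1, 501)]
best_sigma = max(sigmas, key=lsi_lower_bound)
best_bound = lsi_lower_bound(best_sigma)
print(best_sigma, best_bound, 2 / math.e)
```

On this grid, the maximizer is σ = 1 and the resulting value agrees with 2/e from (A1) up to rounding.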

References

1. Chafaï, D.; Malrieu, F. On fine properties of mixtures with respect to concentration of measure and Sobolev type inequalities. Annales de l’Institut Henri Poincaré Probabilités et Statistiques 2010, 46, 72–96.
2. Menz, G.; Schlichting, A. Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape. Ann. Probab. 2014, 42, 1809–1884.
3. Gibbs, A.L.; Su, F.E. On Choosing and Bounding Probability Metrics. Int. Stat. Rev. 2002, 70, 419–435.
4. Higuchi, Y.; Yoshida, N. Analytic Conditions and Phase Transition for Ising Models. Unpublished lecture notes in Japanese. 1995.
5. Diaconis, P.; Saloff-Coste, L. Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Probab. 1996, 6, 695–750.
6. Ledoux, M. Logarithmic Sobolev Inequalities for Unbounded Spin Systems Revisited; Séminaire de Probabilités XXXV; Springer: Berlin, Germany, 1999; pp. 167–194.
7. Carreira-Perpinan, M.A. Mode-finding for mixtures of Gaussian distributions. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1318–1323.
8. Payne, L.E.; Weinberger, H.F. An optimal Poincaré inequality for convex domains. Arch. Ration. Mech. Anal. 1960, 5, 286–292.
9. Bobkov, S.G.; Götze, F. Exponential Integrability and Transportation Cost Related to Logarithmic Sobolev Inequalities. J. Funct. Anal. 1999, 163, 1–28.
10. Nurminen, H.; Ardeshiri, T.; Piche, R.; Gustafsson, F. Skew-t Filter and Smoother with Improved Covariance Matrix Approximation. IEEE Trans. Signal Process. 2018, 66, 5618–5633.
11. Contreras-Reyes, J.; López Quintero, F.; Yáñez, A. Towards Age Determination of Southern King Crab (Lithodes santolla) Off Southern Chile Using Flexible Mixture Modeling. J. Mar. Sci. Eng. 2018, 6, 157.
12. Tasche, D. Exact Fit of Simple Finite Mixture Models. J. Risk Financ. Manag. 2014, 7, 150–164.
13. McLachlan, G.; Peel, D. Finite Mixture Models; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2000.
14. Schlichting, A.; Slowik, M. Poincaré and logarithmic Sobolev constants for metastable Markov chains via capacitary inequalities. arXiv, 2017; arXiv:1705.05135.
15. Schlichting, A. The Eyring-Kramers Formula for Poincaré and Logarithmic Sobolev Inequalities. Ph.D. Thesis, Universität Leipzig, Leipzig, Germany, 2012.
16. Kolesnikov, A.V.; Milman, E. Riemannian metrics on convex sets with applications to Poincaré and log–Sobolev inequalities. Calc. Var. Part. Differ. Equ. 2016, 55, 1–36.
17. Bakry, D.; Émery, M. Diffusions Hypercontractives; Séminaire de Probabilités, XIX; Springer: Berlin, Germany, 1985; pp. 177–206.
18. Holley, R.; Stroock, D. Logarithmic Sobolev inequalities and stochastic Ising models. J. Stat. Phys. 1987, 46, 1159–1194.

© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
